The correct answers are:
- (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbb E[\mathbf Y]
- \beta
Since \mathbb {X} is deterministic and \hat{{\boldsymbol \beta }} = (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbf Y, linearity of expectation gives \mathbb E[\hat{{\boldsymbol \beta }}] = (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbb E[\mathbf Y]. Because \mathbb E[{\boldsymbol \varepsilon }] = 0, we have \mathbb E[\mathbf Y] = \mathbb {X} \beta + \mathbb E[{\boldsymbol \varepsilon }] = \mathbb {X} \beta, so \mathbb E[\hat{{\boldsymbol \beta }}] = (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbb {X} \beta = \beta. The two correct options are therefore two expressions for the same vector.
- Option 0 is incorrect: the mean of \hat{{\boldsymbol \beta }} is \beta, which equals 0 only in the special case \beta = 0.
- Option (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbb E[\mathbf Y] is correct: the matrix (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T is deterministic, so it can be pulled out of the expectation of \hat{{\boldsymbol \beta }}.
- Option \mathbb {X}^ T \mathbb {X} \beta is incorrect: it equals \mathbb {X}^ T \mathbb E[\mathbf Y] and is missing the factor (\mathbb {X}^ T \mathbb {X})^{-1}, so in general it differs from \beta.
- Option \beta is correct: substituting \mathbb E[\mathbf Y] = \mathbb {X} \beta into the previous expression gives \beta, i.e. the LSE is unbiased. That \beta is deterministic is no objection, since the mean of a random vector is always a deterministic vector.
- Option \epsilon is incorrect: the noise is a random vector in \mathbb {R}^ n, whereas the mean of \hat{{\boldsymbol \beta }} is a deterministic vector in \mathbb {R}^ p.
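As a numerical sanity check on the mean of \hat{{\boldsymbol \beta }}, here is a minimal Monte Carlo sketch in Python with NumPy. The sizes n and p, the Gaussian noise, the particular values in `beta`, and the helper name `lse` are all assumptions made for illustration; none of them come from the problem statement. The design matrix is drawn once and then held fixed, which mimics the deterministic design setting.

```python
import numpy as np

rng = np.random.default_rng(0)

n, p = 50, 3
X = rng.normal(size=(n, p))          # drawn once, then treated as a fixed design
beta = np.array([1.0, -2.0, 0.5])    # hypothetical true coefficient vector


def lse(X, Y):
    """Least-squares estimate via the normal equations X^T X b = X^T Y.

    Y may be a vector of length n or an (n, k) matrix whose columns are
    separate response vectors; one estimate is returned per column.
    """
    return np.linalg.solve(X.T @ X, X.T @ Y)


# Simulate many independent noise vectors with mean zero (assumed Gaussian here).
n_trials = 20_000
noise = rng.normal(size=(n, n_trials))
Y = (X @ beta)[:, None] + noise       # each column is one realization of Y
estimates = lse(X, Y)                 # shape (p, n_trials)

# Average of the simulated estimates, one entry per coefficient.
empirical_mean = estimates.mean(axis=1)

# Plug E[Y] = X beta into (X^T X)^{-1} X^T E[Y].
mean_via_formula = lse(X, X @ beta)

print("empirical mean of beta_hat:", empirical_mean)
print("(X^T X)^{-1} X^T E[Y]:     ", mean_via_formula)
print("beta:                      ", beta)
```

Under these assumptions the three printed vectors should agree up to simulation error, while \mathbb {X}^ T \mathbb {X} \beta (computable as `X.T @ X @ beta`) would generally not match.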
In the setting of deterministic design for linear regression, we assume that the design matrix \mathbb {X} is deterministic instead of random. The model still prescribes \mathbf Y= \mathbb {X} {\boldsymbol \beta }+ {\boldsymbol \varepsilon }, where {\boldsymbol \varepsilon }= (\varepsilon _1, \ldots , \varepsilon _ n) is a random vector that represents noise. Take note that the only random object on the right hand side is {\boldsymbol \varepsilon }, so \mathbf Y is still random.
For the rest of this section, we will always assume (\mathbb {X}^ T \mathbb {X})^{-1} exists; i.e. \mathrm{rank}(\mathbb {X}) = p.
Recall that the Least-Squares Estimator \hat{{\boldsymbol \beta }} has the formula
\hat{{\boldsymbol \beta }} = (\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbf Y.
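For concreteness, the formula above can be evaluated numerically. The following is a small sketch in Python with NumPy, assuming a hypothetical fixed design (an intercept column plus an evenly spaced covariate), made-up coefficients, and Gaussian noise; these choices are illustrative only. It solves the normal equations rather than forming the inverse explicitly, and compares the result with NumPy's built-in least-squares routine.

```python
import numpy as np

rng = np.random.default_rng(1)

n, p = 20, 2
# Hypothetical deterministic design: intercept column and an evenly spaced covariate.
X = np.column_stack([np.ones(n), np.linspace(0.0, 1.0, n)])
beta = np.array([0.5, 2.0])                  # made-up true coefficients
y = X @ beta + 0.1 * rng.normal(size=n)      # one noisy realization of Y

# The formula requires rank(X) = p so that X^T X is invertible.
assert np.linalg.matrix_rank(X) == p

# beta_hat = (X^T X)^{-1} X^T y, computed by solving X^T X b = X^T y.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against NumPy's least-squares solver.
beta_hat_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)

print(beta_hat)
print(beta_hat_lstsq)
```

Solving the linear system is usually preferred over calling an explicit matrix inverse, since it is cheaper and numerically more stable; both give the same estimate when \mathrm{rank}(\mathbb {X}) = p.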
If we assume that {\boldsymbol \varepsilon } is a random vector with mean \mathbb E[{\boldsymbol \varepsilon }] = 0, then in the deterministic design setting: "The LSE \hat{{\boldsymbol \beta }} is a random variable, with mean..." (choose all that apply)
0
(\mathbb {X}^ T \mathbb {X})^{-1} \mathbb {X}^ T \mathbb E[\mathbf Y]
\mathbb {X}^ T \mathbb {X} \beta
\beta
\epsilon