The false statement is:
There is no line y = a+bx that predicts Y given X with probability 1, regardless of their distribution.

When \textsf{Var}(X) = 0, X is almost surely equal to the constant c := \mathbb E[X], so any line y = a + bx produces the single predicted value a + bc. If Y is itself almost surely constant, say \mathbf{P}(Y = d) = 1, then the horizontal line y = d does predict Y with probability 1; the claim therefore fails for such distributions, which is why this statement is false. The other two statements are true: the mean squared error \mathbb E[(Y - a - bX)^2] depends on (a,b) only through a + bc and is minimized by every pair with a + bc = \mathbb E[Y], an infinite family, and since \mathbf{P}(X = \mathbb E[X]) = 1 the pair (X,Y) lies on the vertical line x = \mathbb E[X] with probability 1.
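As a short sketch of the computation behind the first statement, writing c := \mathbb E[X] for the constant value of X:

\[
\mathbb E[(Y - a - bX)^2] = \mathbb E[(Y - a - bc)^2] = \textsf{Var}(Y) + (\mathbb E[Y] - a - bc)^2 .
\]

The first term does not depend on (a,b), and the second term is nonnegative and vanishes exactly on the line

\[
\{(a,b) \in \mathbb R^2 : a + bc = \mathbb E[Y]\},
\]

so infinitely many pairs (a,b) attain the minimal risk \textsf{Var}(Y).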
Let us think about what goes wrong when we drop the assumption that \textsf{Var}(X) \neq 0 in theoretical linear regression.
Let X and Y be two real random variables with finite second moments, and suppose \textsf{Var}(X) = 0. (Note: the variance of X is zero if and only if \mathbf{P}(X = \mathbb E[X]) = 1.) We make no further assumptions on Y.
Which one of the following statements is false?
There is an infinite family of solutions (a,b) that minimize the mean squared error, \mathbb E[(Y - a - bX)^2].
There is no line y = a+bx that predicts Y given X with probability 1, regardless of their distribution.
With probability equal to 1, the random pair (X,Y) lies on the vertical line x = \mathbb E[X].
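Purely as an illustration (this simulation is not part of the original exercise, and its numbers are hypothetical: X is fixed at 2 so that \textsf{Var}(X) = 0, and Y is drawn from a normal distribution with mean 5), the following NumPy sketch confirms that every pair (a, b) with a + 2b = 5 attains the same empirical mean squared error, while a pair off that line does worse:

import numpy as np

rng = np.random.default_rng(0)

# Hypothetical example: X is the constant 2 (so Var(X) = 0) and Y ~ N(5, 1).
n = 100_000
x = np.full(n, 2.0)
y = rng.normal(loc=5.0, scale=1.0, size=n)

def mse(a, b):
    """Empirical mean squared error of the linear predictor a + b*x."""
    return np.mean((y - a - b * x) ** 2)

# Every (a, b) on the line a + 2b = 5 gives the same predicted value 5,
# hence exactly the same empirical error: an infinite family of minimizers.
for b in [-3.0, 0.0, 1.0, 10.0]:
    a = 5.0 - 2.0 * b
    print(f"a = {a:6.1f}, b = {b:5.1f}, MSE = {mse(a, b):.4f}")

# A pair off that line has a much larger error.
print(f"a =    0.0, b =   0.0, MSE = {mse(0.0, 0.0):.4f}")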