Part-I: Write True if the statement is right, or False if it is wrong

1. Multiple Linear Regression Models (MLRMs) allow us to use partial analysis.
2. If the error term is normally distributed, the dependent variable is also.
3. An alternative name for the dependent variable is the control variable.
4. In MLRMs, it is best to include explanatory variables (X's) that are correlated with the
dependent variable, but not (too much) correlated with the other X's.
5. Too many variables imply over-specified models that tend to be less precise.
6. The null hypothesis of the F-test implies that the restricted model is as good at explaining
the variation in the dependent variable as the full model.
7. If the F calculated is large (p < 0.05), the model is insignificant.
8. For qualitative variables, we have to include the reference category as an explanatory
variable in our model.
9. Perfect multicollinearity means that two or more independent variables have a fixed
mathematical relationship to each other.
10. The probability of success or failure for the probit model is greater than that of the logit
model.
11. If the variance inflation factor is greater than 10, this implies that your explanatory
variables are suspected of perfect multicollinearity.


Here is an evaluation of each statement:

  1. True - Multiple Linear Regression Models (MLRMs) allow partial analysis: each coefficient measures the effect of its own explanatory variable while holding the other X's constant (a worked sketch of this follows the list).

  2. True - In the classical linear model, Y is a linear function of the (fixed) regressors plus the error term, so if the error term is normally distributed, the dependent variable is also normally distributed (conditional on the X's), with mean Xβ and the same variance as the error.

  3. False - The dependent variable is not called the control variable; its alternative names are the explained variable, response variable, regressand, or outcome variable. "Control variables" are explanatory variables included to account for confounding effects.

  4. True - In MLRMs, it is indeed best to include explanatory variables that are correlated with the dependent variable but not too highly correlated with other X's, to avoid multicollinearity.

  5. True - Including irrelevant variables (an over-specified model) does not bias the estimates, but it inflates the variances of the coefficient estimators, making the model less precise.

  6. True - The null hypothesis of the F-test states that the restricted model (with fewer predictors) explains the variation in the dependent variable as well as the full model; a large F-statistic rejects this in favor of the full model (see the F-test sketch after this list).

  7. False - A large calculated F-statistic (p < 0.05) indicates that the model is jointly significant, not insignificant.

  8. False - For qualitative (categorical) variables, the reference category must be omitted from the model: including all categories alongside the intercept causes the "dummy variable trap" (perfect multicollinearity), and the omitted category serves as the baseline for comparison (see the dummy-coding sketch after this list).

  9. True - Perfect multicollinearity means that two or more independent variables have an exact (fixed) mathematical relationship with each other, i.e., one can be written as a linear combination of the others (illustrated after this list).

  10. False - Probit and logit model the same success/failure probability under different distributional assumptions (the standard normal CDF for probit, the logistic CDF for logit). The logistic distribution has slightly fatter tails, so fitted probabilities differ a little far from the center, but neither model's probabilities are uniformly greater than the other's (see the CDF comparison after this list).

  11. False - A variance inflation factor (VIF) greater than 10 signals severe, but not perfect, multicollinearity. Under perfect multicollinearity the auxiliary R² equals 1, the VIF is infinite (undefined), and OLS cannot be estimated at all (see the VIF sketch after this list).
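
The sketches below illustrate several of the points above. They are minimal Python examples on simulated data; every variable name (x1, x2, y, region, ...) is illustrative rather than taken from the question.

First, the "partial analysis" idea from statement 1: by the Frisch-Waugh-Lovell theorem, a multiple-regression coefficient equals the slope from regressing y on the part of x1 not explained by the other regressors.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 300
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)          # x1 and x2 are correlated
y = 1.0 + 2.0 * x1 - 1.0 * x2 + rng.normal(size=n)

def ols(X, y):
    """OLS coefficients of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

ones = np.ones(n)
b_full = ols(np.column_stack([ones, x1, x2]), y)   # multiple regression

# Frisch-Waugh-Lovell: the multiple-regression coefficient on x1 equals the
# slope from regressing residualized y on residualized x1 (x2 partialled out).
Z = np.column_stack([ones, x2])
x1_res = x1 - Z @ ols(Z, x1)
y_res = y - Z @ ols(Z, y)
b_partial = (x1_res @ y_res) / (x1_res @ x1_res)

print(b_full[1], b_partial)   # identical up to floating-point rounding
```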
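For statement 6, a minimal sketch of the restricted-versus-full F-test on simulated data (scipy is assumed available for the p-value):

```python
import numpy as np
from scipy.stats import f as f_dist

rng = np.random.default_rng(0)
n = 200
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1.0 + 2.0 * x1 + 0.5 * x2 + rng.normal(size=n)   # x2 truly matters

def rss(X, y):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return resid @ resid

X_full = np.column_stack([np.ones(n), x1, x2])
X_restr = np.column_stack([np.ones(n), x1])   # H0: coefficient on x2 is zero

q = 1                                # number of restrictions
df = n - X_full.shape[1]             # residual degrees of freedom, full model
F = ((rss(X_restr, y) - rss(X_full, y)) / q) / (rss(X_full, y) / df)
print(f"F = {F:.2f}, p = {f_dist.sf(F, q, df):.4f}")  # small p: full model wins
```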
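For statement 8, dropping the reference category keeps the design matrix at full column rank; including all k dummies next to the intercept makes it rank-deficient (the dummy variable trap):

```python
import numpy as np

# Hypothetical three-category regressor: region in {A, B, C}, "A" as reference.
region = np.array(["A", "B", "C", "B", "A", "C", "B", "A"])
n = len(region)
d_A = (region == "A").astype(float)
d_B = (region == "B").astype(float)
d_C = (region == "C").astype(float)

X_ok = np.column_stack([np.ones(n), d_B, d_C])         # intercept + k-1 dummies
X_trap = np.column_stack([np.ones(n), d_A, d_B, d_C])  # all k dummies + intercept

print(np.linalg.matrix_rank(X_ok), "of", X_ok.shape[1])      # 3 of 3: fine
print(np.linalg.matrix_rank(X_trap), "of", X_trap.shape[1])  # 3 of 4: trap
# d_A + d_B + d_C equals the intercept column exactly, so X_trap is perfectly
# multicollinear; the coefficient on d_B in X_ok measures the difference
# between category B and the reference category A.
```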
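For statement 9, any exact linear relationship among regressors makes the design matrix rank-deficient, so the OLS normal equations have no unique solution:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 2.0 * x1 - 3.0 * x2                   # exact (fixed) linear relationship
X = np.column_stack([np.ones(n), x1, x2, x3])

print(np.linalg.matrix_rank(X))            # 3, not 4: X is rank deficient
print(np.linalg.svd(X, compute_uv=False))  # smallest singular value is ~0
# OLS on X has no unique solution; statistical packages typically either
# drop x3 automatically or refuse to estimate the model.
```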
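For statement 10, the two link functions can be compared directly. (Note the standard logistic distribution has variance π²/3 rather than 1, which is why fitted logit coefficients are typically about 1.6 to 1.8 times the probit ones in practice.)

```python
import numpy as np
from scipy.stats import norm, logistic

for z in np.linspace(-3, 3, 7):
    print(f"z={z:+.1f}  probit={norm.cdf(z):.3f}  logit={logistic.cdf(z):.3f}")
# The curves cross at z = 0: for z < 0 the logistic CDF is larger, while for
# z > 0 the normal CDF is larger, so neither model's probability is uniformly
# greater than the other's.
```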
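Finally, for statement 11, a hand-rolled VIF, computed as 1/(1 − R²) from the auxiliary regression of each X on the others (statsmodels ships an equivalent variance_inflation_factor helper):

```python
import numpy as np

def vif(X, j):
    """VIF of column j: regress X[:, j] on the remaining columns."""
    y = X[:, j]
    others = np.delete(X, j, axis=1)
    others = np.column_stack([np.ones(len(y)), others])  # auxiliary intercept
    beta, *_ = np.linalg.lstsq(others, y, rcond=None)
    resid = y - others @ beta
    r2 = 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
    return 1.0 / (1.0 - r2)

rng = np.random.default_rng(2)
n = 500
x1 = rng.normal(size=n)
x2 = x1 + rng.normal(scale=0.1, size=n)  # strongly, but not perfectly, related
x3 = rng.normal(size=n)
X = np.column_stack([x1, x2, x3])

for j in range(3):
    print(f"VIF(x{j + 1}) = {vif(X, j):.1f}")
# x1 and x2 show VIFs far above 10 (severe multicollinearity); if x2 were an
# exact copy of x1, the auxiliary R^2 would equal 1 and the VIF would blow up
# toward infinity, which is the perfect-multicollinearity case.
```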