Deriving OLS Estimators for a Three-Variable Regression Model

To find the OLS estimators, let us first write the SRF corresponding to the PRF as follows:

\[ Y_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i} + \hat{\epsilon}_i \]
The OLS estimators are obtained by choosing the coefficients that make the residual sum of squares (RSS) from this estimation as small as possible.


To derive the Ordinary Least Squares (OLS) estimators for a three-variable regression model, we start by specifying the model in the following form:

\[ Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \beta_3 X_{3i} + \epsilon_i \]

where:

  • \(Y_i\) is the dependent variable,
  • \(X_{1i}\), \(X_{2i}\), and \(X_{3i}\) are independent variables,
  • \(\beta_0\) is the intercept,
  • \(\beta_1\), \(\beta_2\), and \(\beta_3\) are the coefficients of the independent variables, and
  • \(\epsilon_i\) is the error term or residual.
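To make the notation concrete, here is a minimal simulation sketch under assumed, arbitrary coefficient values; the variable names and numbers are purely illustrative and not part of the derivation itself.

```python
import numpy as np

# Hypothetical example: simulate n observations from
# Y_i = b0 + b1*X1_i + b2*X2_i + b3*X3_i + eps_i with assumed "true" coefficients.
rng = np.random.default_rng(0)
n = 200
X1, X2, X3 = rng.normal(size=(3, n))          # independent variables
eps = rng.normal(scale=1.0, size=n)           # error term
b0, b1, b2, b3 = 1.0, 0.5, -2.0, 3.0          # assumed population coefficients
Y = b0 + b1 * X1 + b2 * X2 + b3 * X3 + eps    # dependent variable
```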

Step 1: Residuals

The fitted values \(\hat{Y}_i\) from the model are given by:

\[ \hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i} \]

The residuals (errors) are defined as:

\[ \hat{\epsilon}_i = Y_i - \hat{Y}_i \]

Step 2: Residual Sum of Squares (RSS)

The goal of OLS is to minimize the Residual Sum of Squares (RSS), which is defined as:

\[ RSS = \sum_{i=1}^{n} (Y_i - \hat{Y}_i)^2 \]

Substituting \(\hat{Y}_i\):

\[ RSS = \sum_{i=1}^{n} \left( Y_i - (\hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i}) \right)^2 \]
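As a sketch, this objective can be written directly in code; the snippet below reuses the hypothetical simulated arrays `Y`, `X1`, `X2`, `X3` from the example above.

```python
import numpy as np

def rss(b0, b1, b2, b3, Y, X1, X2, X3):
    """Residual sum of squares for a candidate set of coefficients."""
    Y_hat = b0 + b1 * X1 + b2 * X2 + b3 * X3   # fitted values
    residuals = Y - Y_hat                      # e_i = Y_i - Y_hat_i
    return float(np.sum(residuals ** 2))

# OLS picks the (b0, b1, b2, b3) that minimize this quantity, e.g.
# rss(1.0, 0.5, -2.0, 3.0, Y, X1, X2, X3) evaluated on the simulated data.
```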

Step 3: Differentiating RSS

To find the minimum of the RSS, we take the partial derivatives of the RSS with respect to each estimate \(\hat{\beta}_0\), \(\hat{\beta}_1\), \(\hat{\beta}_2\), and \(\hat{\beta}_3\), set them equal to zero, and solve the resulting system for the coefficients.

  1. Partial derivative with respect to \(\hat{\beta}_0\):

\[ \frac{\partial RSS}{\partial \hat{\beta}_0} = -2 \sum_{i=1}^{n} (Y_i - \hat{Y}_i) \]

Setting this equal to zero gives

\[ \sum_{i=1}^{n} \hat{Y}_i = \sum_{i=1}^{n} Y_i \]

This leads to the first normal equation.

  2. Partial derivative with respect to \(\hat{\beta}_1\):

\[ \frac{\partial RSS}{\partial \hat{\beta}_1} = -2 \sum_{i=1}^{n} (Y_i - \hat{Y}_i) X_{1i} \]

Setting this equal to zero gives

\[ \sum_{i=1}^{n} \hat{Y}_i X_{1i} = \sum_{i=1}^{n} Y_i X_{1i} \]

This leads to the second normal equation.

  3. Partial derivative with respect to \(\hat{\beta}_2\):

\[ \frac{\partial RSS}{\partial \hat{\beta}_2} = -2 \sum_{i=1}^{n} (Y_i - \hat{Y}_i) X_{2i} \]

Setting this equal to zero gives

\[ \sum_{i=1}^{n} \hat{Y}_i X_{2i} = \sum_{i=1}^{n} Y_i X_{2i} \]

This leads to the third normal equation.

  4. Partial derivative with respect to \(\hat{\beta}_3\):

\[ \frac{\partial RSS}{\partial \hat{\beta}_3} = -2 \sum_{i=1}^{n} (Y_i - \hat{Y}_i) X_{3i} \]

Setting this equal to zero gives

\[ \sum_{i=1}^{n} \hat{Y}_i X_{3i} = \sum_{i=1}^{n} Y_i X_{3i} \]

This leads to the fourth normal equation.
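Substituting \(\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 X_{1i} + \hat{\beta}_2 X_{2i} + \hat{\beta}_3 X_{3i}\) into these four conditions gives the normal equations in their familiar summation form:

\[
\begin{aligned}
n\hat{\beta}_0 + \hat{\beta}_1 \sum X_{1i} + \hat{\beta}_2 \sum X_{2i} + \hat{\beta}_3 \sum X_{3i} &= \sum Y_i \\
\hat{\beta}_0 \sum X_{1i} + \hat{\beta}_1 \sum X_{1i}^2 + \hat{\beta}_2 \sum X_{1i} X_{2i} + \hat{\beta}_3 \sum X_{1i} X_{3i} &= \sum X_{1i} Y_i \\
\hat{\beta}_0 \sum X_{2i} + \hat{\beta}_1 \sum X_{1i} X_{2i} + \hat{\beta}_2 \sum X_{2i}^2 + \hat{\beta}_3 \sum X_{2i} X_{3i} &= \sum X_{2i} Y_i \\
\hat{\beta}_0 \sum X_{3i} + \hat{\beta}_1 \sum X_{1i} X_{3i} + \hat{\beta}_2 \sum X_{2i} X_{3i} + \hat{\beta}_3 \sum X_{3i}^2 &= \sum X_{3i} Y_i
\end{aligned}
\]

Solving these four equations simultaneously for the four unknowns \(\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3\) yields the OLS estimators; the matrix formulation below is a compact way of doing exactly that.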

Step 4: Matrix Formulation

This system of simultaneous equations is handled most compactly in matrix notation. The model can be written as:

\[ \mathbf{Y} = \mathbf{X} \mathbf{\beta} + \epsilon \]

Where:

  • \(\mathbf{Y}\) is an \(n \times 1\) vector of observations,
  • \(\mathbf{X}\) is an \(n \times (k + 1)\) matrix of predictors (including a column of ones for the intercept; here \(k = 3\)),
  • \(\mathbf{\beta}\) is a \((k + 1) \times 1\) vector of coefficients,
  • \(\epsilon\) is an \(n \times 1\) vector of errors.

Minimizing \(RSS = (\mathbf{Y} - \mathbf{X}\mathbf{\beta})'(\mathbf{Y} - \mathbf{X}\mathbf{\beta})\) with respect to \(\mathbf{\beta}\) gives the first-order condition \(-2\mathbf{X}'(\mathbf{Y} - \mathbf{X}\hat{\beta}) = 0\), i.e. the normal equations \(\mathbf{X}'\mathbf{X}\hat{\beta} = \mathbf{X}'\mathbf{Y}\). Provided \(\mathbf{X}'\mathbf{X}\) is invertible, the OLS estimator \(\hat{\beta}\) is therefore:

\[ \hat{\beta} = (\mathbf{X}'\mathbf{X})^{-1} \mathbf{X}'\mathbf{Y} \]

Where \(\mathbf{X}'\) denotes the transpose of \(\mathbf{X}\).
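As a numerical sketch, the closed-form expression can be evaluated directly with NumPy, reusing the hypothetical simulated data (`n`, `X1`, `X2`, `X3`, `Y`) from the earlier example; in practice a least-squares solver is preferred over forming the inverse explicitly.

```python
import numpy as np

# Design matrix with a leading column of ones for the intercept (n x 4 here).
X = np.column_stack([np.ones(n), X1, X2, X3])

# Closed-form OLS: beta_hat = (X'X)^{-1} X'Y.
beta_hat = np.linalg.inv(X.T @ X) @ (X.T @ Y)

# Numerically safer equivalent that solves the same least-squares problem.
beta_hat_lstsq, *_ = np.linalg.lstsq(X, Y, rcond=None)

print(beta_hat)        # [b0_hat, b1_hat, b2_hat, b3_hat]
print(beta_hat_lstsq)  # agrees with beta_hat up to floating-point error
```

With the assumed coefficients used in the simulation, the estimates should land close to (1.0, 0.5, -2.0, 3.0).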

Conclusion

The OLS estimators \(\hat{\beta} = [\hat{\beta}_0, \hat{\beta}_1, \hat{\beta}_2, \hat{\beta}_3]'\) are obtained by minimizing the residual sum of squares; under the classical (Gauss-Markov) assumptions they are the best linear unbiased estimators of the coefficients in the model.