Suppose that you want to calculate the VIF of variable X1 in the following model:

Yi = β0 + β1 X1i + β2 X2i + β3 X3i + εi

and remember that the equation for calculating the VIF is the following: VIFj = 1 / (1 - Rj²)

How is the Rj² that is used for getting the VIF calculated?

1 answer

The Rj² used for calculating the VIF is obtained by regressing the predictor of interest (Xj) on all the other predictors in the model, excluding Xj itself. For X1 in the model above, that means regressing X1 on X2 and X3.

Specifically, Rj² is the coefficient of determination (also known as the squared multiple correlation coefficient) of this auxiliary regression. It represents the proportion of the variance in Xj that is explained by the other predictors. The response Y plays no role in the calculation.

To calculate Rj², you would perform the following steps:

1. Regress Xj on all the other predictors, including an intercept. This is a separate (auxiliary) regression with its own coefficients γ, not the β's of the original model. Its general form is Xj = γ0 + Σ(k≠j) γk Xk + u, where u is the error term of the auxiliary regression. For X1 in the model above: X1i = γ0 + γ2 X2i + γ3 X3i + ui.
2. Calculate the residual sum of squares (RSS) for this regression, which represents the unexplained variation in Xj after accounting for the other predictors.
3. Calculate the total sum of squares (TSS) for Xj, the sum of squared deviations of Xj from its mean, which represents the total variation in Xj.
4. Compute the coefficient of determination (Rj²) by taking the ratio of the explained variation (TSS - RSS) to the total variation: Rj² = (TSS - RSS) / TSS = 1 - RSS / TSS.

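The resulting Rj² value is then used to calculate the VIF for variable Xj using the formula VIFj = 1 / (1 - Rj²). An Rj² of 0 gives a VIF of 1 (no collinearity with the other predictors), and the VIF grows without bound as Rj² approaches 1.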
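
As a rough illustration, the steps above can be sketched in Python with NumPy. The data here are simulated purely for the example, and the helper names (r_squared_aux, vif) are hypothetical, not from any library. The sketch assumes the predictors are stored as columns of a matrix X and that the auxiliary regression includes an intercept.

```python
import numpy as np

def r_squared_aux(X, j):
    """R-squared from regressing column j of X on the other columns plus an intercept."""
    target = X[:, j]
    others = np.delete(X, j, axis=1)  # every predictor except Xj
    design = np.column_stack([np.ones(len(target)), others])
    # Step 1: least-squares estimates of gamma_0 and the gamma_k
    gamma, *_ = np.linalg.lstsq(design, target, rcond=None)
    residuals = target - design @ gamma
    rss = np.sum(residuals ** 2)                 # Step 2: residual sum of squares
    tss = np.sum((target - target.mean()) ** 2)  # Step 3: total sum of squares
    return (tss - rss) / tss                     # Step 4: R-squared

def vif(X, j):
    """Variance inflation factor of column j: 1 / (1 - Rj^2)."""
    return 1.0 / (1.0 - r_squared_aux(X, j))

# Simulated predictors (an assumption for illustration): X1 is built to be
# correlated with X2 and X3, so its VIF should come out above 1.
rng = np.random.default_rng(0)
n = 200
x2 = rng.normal(size=n)
x3 = rng.normal(size=n)
x1 = 0.8 * x2 + 0.5 * x3 + rng.normal(scale=0.5, size=n)
X = np.column_stack([x1, x2, x3])

r2_x1 = r_squared_aux(X, 0)
vif_x1 = vif(X, 0)
print(f"R^2 for X1: {r2_x1:.3f}, VIF for X1: {vif_x1:.3f}")
```

Note that Y never appears: the VIF depends only on the predictors. A useful cross-check is that VIFj also equals the j-th diagonal element of the inverse of the predictors' correlation matrix.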