Thursday, July 5, 2012

Collinearity and VIF


Collinearity occurs when two or more variables are highly associated. Including them in a linear model can result in confusing, nonsensical, or misleading results, because the model cannot differentiate the contribution from each of them.

Consider two predictors that are highly related, such as visits and transactions. Because a linear model assumes that effects are additive, an effect attributed to one variable (such as transactions) cannot also be attributed to another variable that is highly correlated with it (visits). This causes the standard errors of the predictors' coefficients to increase, which means that the coefficient estimates will be highly uncertain or unstable. As a practical consequence, coefficient estimates may differ dramatically from sample to sample due to minor variation in the data, even when the underlying relationships are the same.
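
To make the instability concrete, here is a minimal simulation sketch. Everything in it is an assumption chosen for illustration (the helper name `coef_spread`, the sample sizes, the true coefficients of 1.0, and the correlation values); it is not data from a real study. It draws many samples from the same underlying relationship and measures how much the fitted coefficient on the first predictor moves around.

```python
import numpy as np

def coef_spread(rho, n_obs=200, n_reps=500, seed=0):
    """Std. dev. of the fitted coefficient on x1 across repeated samples.

    rho is the (assumed) correlation between the two simulated predictors.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    estimates = []
    for _ in range(n_reps):
        # Two predictors with correlation rho; same true model every time.
        X = rng.multivariate_normal([0.0, 0.0], cov, size=n_obs)
        y = 1.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=n_obs)
        # Ordinary least squares with an intercept column.
        design = np.column_stack([np.ones(n_obs), X])
        beta, *_ = np.linalg.lstsq(design, y, rcond=None)
        estimates.append(beta[1])
    return float(np.std(estimates))

spread_uncorrelated = coef_spread(rho=0.0)
spread_collinear = coef_spread(rho=0.95)
print("coefficient spread, uncorrelated predictors:", spread_uncorrelated)
print("coefficient spread, collinear predictors:   ", spread_collinear)
```

The true relationship is identical in both runs; only the correlation between the predictors changes, and the spread of the estimates should be noticeably larger in the collinear case.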

The degree of collinearity in data can be assessed with the variance inflation factor (VIF). This estimates how much the variance of a coefficient in the linear model (the square of its standard error) is increased because of shared variance with other variables, compared to the situation where the variables are uncorrelated or a simple single-predictor regression is performed.

The VIF provides a measure of shared variance among variables in a model. A common rule of thumb is that VIF > 5.0 indicates the need to mitigate collinearity.
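
As a rough sketch of how the VIF can be computed by hand: regress each predictor on all the others, take the R-squared of that regression, and compute 1 / (1 - R-squared). The `vif` helper below is written for illustration with NumPy only, and the data are simulated; `visits` and `transactions` are made nearly redundant on purpose, and `satisfaction` is a hypothetical unrelated predictor.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_obs x n_predictors)."""
    X = np.asarray(X, dtype=float)
    n_obs, n_pred = X.shape
    out = np.empty(n_pred)
    for j in range(n_pred):
        target = X[:, j]
        # Regress predictor j on an intercept plus all other predictors.
        others = np.column_stack([np.ones(n_obs), np.delete(X, j, axis=1)])
        coefs = np.linalg.lstsq(others, target, rcond=None)[0]
        residuals = target - others @ coefs
        ss_res = np.sum(residuals ** 2)
        ss_tot = np.sum((target - target.mean()) ** 2)
        r_squared = 1.0 - ss_res / ss_tot
        out[j] = 1.0 / (1.0 - r_squared)
    return out

# Simulated data (assumption): transactions is almost a copy of visits.
rng = np.random.default_rng(42)
n = 500
visits = rng.normal(size=n)
transactions = visits + 0.2 * rng.normal(size=n)
satisfaction = rng.normal(size=n)  # hypothetical, unrelated predictor

X = np.column_stack([visits, transactions, satisfaction])
vifs = vif(X)
for name, value in zip(["visits", "transactions", "satisfaction"], vifs):
    flag = "  <- above the 5.0 rule of thumb" if value > 5.0 else ""
    print(f"{name:13s} VIF = {value:6.2f}{flag}")
```

With data like these, the two redundant predictors should land well above 5.0 while the unrelated one stays close to 1. In practice you would more likely use a packaged function than roll your own.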

There are three general strategies for mitigating collinearity:
- Omit variables that are highly correlated.
- Eliminate correlation by extracting principal components or factors for sets of highly correlated predictors, for example by using PCA to extract the first component from those variables.
- Use a method that is robust to collinearity, i.e., something other than traditional linear modeling, e.g., a random forest, which uses only a subset of variables at a time.

In all, common approaches to fixing collinearity include omitting highly correlated variables and using principal components or factor scores instead of individual items.
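
The principal component approach can be sketched as follows. This is an illustrative example on simulated data, not a recipe: the variable names, the noise level, and the choice to keep only the first component are all assumptions. The two collinear predictors are standardized and collapsed into a single score (called `activity` here), which can then stand in for both of them in the model.

```python
import numpy as np

# Simulated data (assumption): visits and transactions are nearly redundant.
rng = np.random.default_rng(7)
n = 500
visits = rng.normal(size=n)
transactions = visits + 0.2 * rng.normal(size=n)
satisfaction = rng.normal(size=n)  # hypothetical, unrelated predictor

# Standardize the correlated pair so neither variable dominates by scale.
pair = np.column_stack([visits, transactions])
pair_std = (pair - pair.mean(axis=0)) / pair.std(axis=0, ddof=1)

# PCA via SVD: the rows of vt are the principal axes.
_, singular_values, vt = np.linalg.svd(pair_std, full_matrices=False)
activity = pair_std @ vt[0]  # first principal component score

# Share of the pair's total variance captured by the first component.
explained = singular_values[0] ** 2 / np.sum(singular_values ** 2)

corr_before = np.corrcoef(visits, transactions)[0, 1]
corr_after = np.corrcoef(activity, satisfaction)[0, 1]
print("correlation between visits and transactions:", corr_before)
print("variance explained by first component:", explained)
print("correlation between new score and satisfaction:", corr_after)
```

The trade-off is interpretability: the model now has one coefficient for the combined score rather than separate coefficients for visits and transactions. Note that the sign of a principal component is arbitrary, so the score may come out flipped.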






