The main goal of penalized estimation is to minimize the sum of the unpenalized cost function and a penalty term. This can be understood as deliberately adding bias, i.e. preferring a simpler model, in order to reduce variance when there is not enough training data to fit the model reliably.
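In symbols (the notation here is generic, added for illustration), the penalized estimate solves

$$
\hat{\beta} \;=\; \arg\min_{\beta}\; L(\beta) \;+\; \lambda\, P(\beta),
$$

where $L$ is the unpenalized cost (e.g., a negative log-likelihood), $P$ is the penalty, and $\lambda \ge 0$ controls how strongly complexity is penalized.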
- The ridge penalty (L2) shrinks the coefficients of correlated predictors towards each other.
- The lasso penalty (L1) tends to pick one of them and discard the others.
- The elastic-net penalty mixes these two; if predictors are correlated in groups, an α = 0.5 tends to select the groups in or out together.

A second use of the elastic net is for numerical stability; for example, the elastic net with α = 1 − ε for some small ε > 0 performs much like the lasso, but removes any degeneracies and wild behavior caused by extreme correlations.
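Concretely, the elastic-net penalty is controlled by the mixing parameter α. As documented for glmnet, with a loss ℓ determined by the model family, the estimator solves

$$
\min_{\beta_0,\,\beta}\; \frac{1}{N}\sum_{i=1}^{N} \ell\!\left(y_i,\; \beta_0 + x_i^{\top}\beta\right)
\;+\; \lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2 \;+\; \alpha\,\lVert\beta\rVert_1\right],
$$

so that α = 1 recovers the lasso and α = 0 the ridge penalty.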
## Logistic regression with glmnet (http://web.stanford.edu/~hastie/glmnet/glmnet_alpha.html#log)
```r
# Fit a lasso-penalized logistic regression path (glmnet's default is alpha = 1)
install.packages("glmnet", repos = "http://cran.us.r-project.org")
require(glmnet)

# x: predictor matrix, y: binary response (example data from the glmnet site)
load("BinomialExample.RData")

fit = glmnet(x, y, family = "binomial")
print(fit)                             # df, %deviance explained, lambda at each step
plot(fit, xvar = "dev", label = TRUE)  # coefficient paths vs. deviance explained

# Class predictions at two particular values of lambda
predict(fit, newx = x[1:5, ], type = "class", s = c(0.05, 0.01))

# Choose lambda by cross-validation, using misclassification error
cvfit = cv.glmnet(x, y, family = "binomial", type.measure = "class")
plot(cvfit)

cvfit$lambda.min   # lambda giving minimum CV error
cvfit$lambda.1se   # largest lambda within one standard error of the minimum
coef(cvfit, s = "lambda.min")
predict(cvfit, newx = x[1:10, ], s = "lambda.min", type = "class")
```
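To connect the code back to the elastic-net discussion above, here is a minimal sketch (not from the original post) of comparing a few values of alpha by cross-validation; it assumes x and y from BinomialExample.RData are still in the workspace, and fixes the fold assignment so the alphas are compared on the same folds:

```r
# Compare ridge (alpha = 0), elastic net (alpha = 0.5), and lasso (alpha = 1).
# Fixing foldid makes the CV results comparable across alpha values.
set.seed(1)
foldid = sample(rep(1:10, length.out = nrow(x)))
for (a in c(0, 0.5, 1)) {
  cv = cv.glmnet(x, y, family = "binomial", type.measure = "class",
                 alpha = a, foldid = foldid)
  cat("alpha =", a, " min CV misclassification =", min(cv$cvm), "\n")
}
```

In practice one would pick the alpha (and lambda) with the lowest cross-validated error, or the most parsimonious model within one standard error of it.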