select_features {MetaNLP} | R Documentation |
Select features via elasticnet regularization
Description
Because the word count matrix grows quickly with the number of abstracts, it can easily reach several thousand columns. It can therefore be important to extract the columns that carry most of the information for the decision-making process. This function uses a generalized linear model combined with elasticnet regularization to extract these features. In contrast to an unpenalized regression model or an L2 penalty (ridge regression), elasticnet (and LASSO) sets some regression parameters to exactly 0. The selected features are thus exactly those with a non-zero coefficient.
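For orientation, the penalized objective can be written in the form used by the glmnet documentation. This is a sketch of the general criterion, not a statement about how this function sets every argument internally; here l denotes the negative log-likelihood contribution of observation i, w_i the observation weights, and N the number of abstracts.

```latex
\min_{\beta_0,\,\beta}\;
  \frac{1}{N}\sum_{i=1}^{N} w_i\, l\!\left(y_i,\ \beta_0 + \beta^{T} x_i\right)
  \;+\; \lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2}
  \;+\; \alpha\,\lVert\beta\rVert_1\right]
```

With \alpha = 1 the penalty reduces to LASSO and with \alpha = 0 to ridge regression. The L1 term is what drives some coefficients to exactly 0, so any \alpha > 0 can remove columns.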
Usage
select_features(object, ...)
## S4 method for signature 'MetaNLP'
select_features(object, alpha = 0.8, lambda = "avg", seed = NULL, ...)
Arguments
object |
An object of class |
... |
Additional arguments for cv.glmnet. An important
option might be |
alpha |
The elastic net mixing parameter, with |
lambda |
The weight parameter of the penalty. The possible values are
|
seed |
A numeric value which is used as a local seed for this function.
Default is |
Details
The computational aspects are executed by the glmnet package. First, a
model is fitted via glmnet. The elastic net parameter \alpha can be
specified by the user. The parameter \lambda, which determines the weight
of the penalty, can either be chosen via cross validation (using
cv.glmnet) or by giving a numeric value.
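The two ways of fixing \lambda can be sketched directly with glmnet. This is a minimal illustration on simulated data, not the internal code of select_features: the matrix x, the response y, the binomial family, the value 0.05 and the helper nonzero_cols are all assumptions made for the example.

```r
library(glmnet)

set.seed(1)  # only for reproducibility of this toy example
n <- 100
p <- 20

# Hypothetical stand-in for a word count matrix: n abstracts, p words
x <- matrix(rpois(n * p, lambda = 2), nrow = n,
            dimnames = list(NULL, paste0("word", seq_len(p))))
# Hypothetical binary decision that depends on the first two words only
y <- rbinom(n, 1, plogis(0.8 * x[, 1] - 0.8 * x[, 2]))

# Option 1: choose lambda by cross validation
cv_fit <- cv.glmnet(x, y, family = "binomial", alpha = 0.8)
lambda_cv <- cv_fit$lambda.min

# Option 2: supply a numeric value directly (arbitrary value for illustration)
lambda_num <- 0.05

# Hypothetical helper: names of the columns with a non-zero coefficient
nonzero_cols <- function(x, y, alpha, lambda) {
  fit <- glmnet(x, y, family = "binomial", alpha = alpha)
  beta <- as.matrix(coef(fit, s = lambda))[-1, 1]  # drop the intercept
  names(beta)[beta != 0]
}

sel_cv  <- nonzero_cols(x, y, alpha = 0.8, lambda = lambda_cv)
sel_num <- nonzero_cols(x, y, alpha = 0.8, lambda = lambda_num)

# Keep only the selected columns of the word count matrix
x_selected <- x[, sel_cv, drop = FALSE]
```

cv.glmnet also reports lambda.1se, a more conservative choice that usually keeps fewer columns than lambda.min.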
Value
An object of class MetaNLP, where the columns were selected via elastic
net.
Note
By using a fixed value for lambda, the number of features to be selected
can easily be adjusted via the parameter alpha. The smaller alpha is
chosen, the more columns remain in the resulting data frame; the larger
alpha is chosen, the fewer columns are selected.
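The effect of alpha at a fixed lambda can be checked with a small experiment in plain glmnet. This is a sketch on simulated data: the matrix x, the response y, the binomial family and the value lambda_fixed = 0.05 are assumptions for illustration and do not come from the package.

```r
library(glmnet)

set.seed(1)  # only for reproducibility of this toy example
n <- 100
p <- 30

# Hypothetical word count matrix and binary decision
x <- matrix(rpois(n * p, lambda = 2), nrow = n,
            dimnames = list(NULL, paste0("word", seq_len(p))))
y <- rbinom(n, 1, plogis(0.8 * x[, 1] - 0.8 * x[, 2]))

lambda_fixed <- 0.05           # arbitrary fixed penalty weight
alphas <- c(0.2, 0.5, 0.8, 1)  # mixing parameters to compare

# Count the non-zero coefficients (without intercept) for each alpha
n_selected <- sapply(alphas, function(a) {
  fit <- glmnet(x, y, family = "binomial", alpha = a)
  beta <- as.matrix(coef(fit, s = lambda_fixed))[-1, 1]
  sum(beta != 0)
})

data.frame(alpha = alphas, n_selected = n_selected)
```

On most data sets the count in n_selected shrinks as alpha grows, because a larger share of the penalty falls on the L1 term.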
Examples
path <- system.file("extdata", "test_data.csv", package = "MetaNLP", mustWork = TRUE)
obj <- MetaNLP(path)
obj2 <- select_features(obj, alpha = 0.7, lambda = "min")