R: Train word embeddings to a numeric (ridge regression) or...

textTrain {text}

R Documentation

Train word embeddings to a numeric (ridge regression) or categorical (random forest) variable.

Description

Train word embeddings to a numeric (ridge regression) or categorical (random forest) variable.

Usage

textTrain(x, y, force_train_method = "automatic", ...)

Arguments

`x`	Word embeddings from textEmbed (or textEmbedLayerAggreation). Several word embedding variables can be analyzed at the same time. When training to several outcomes at the same time, pass the tibble inside a named list rather than as a bare tibble (i.e., keep the name of the word embedding).
`y`	Variable to predict: numeric for ridge regression, or a factor for random forest. Several outcomes can be given, in which case they must be in a tibble (this is required even when there is only one outcome but several word embedding variables).
`force_train_method`	Default is "automatic", so if y is a factor random_forest is used, and if y is numeric ridge regression is used. This can be overridden using "regression" or "random_forest".
`...`	Arguments passed on to textTrainRegression or textTrainRandomForest, whichever function textTrain calls.
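
The "automatic" rule described for force_train_method can be sketched in base R as follows. This is only an illustration of the documented behaviour, not the package's actual implementation; the helper name choose_train_method is hypothetical and does not exist in the text package.

```r
# Hypothetical helper (NOT part of the text package): mirrors the documented
# rule for picking a training method from the type of y.
choose_train_method <- function(y, force_train_method = "automatic") {
  # An explicit override wins over automatic detection.
  if (force_train_method %in% c("regression", "random_forest")) {
    return(force_train_method)
  }
  if (force_train_method != "automatic") {
    stop("force_train_method must be 'automatic', 'regression' or 'random_forest'")
  }
  # Automatic detection: factor -> random forest, numeric -> ridge regression.
  if (is.factor(y)) {
    "random_forest"
  } else if (is.numeric(y)) {
    "regression"
  } else {
    stop("y must be numeric or a factor")
  }
}

# Numeric outcome: ridge regression is selected.
numeric_choice <- choose_train_method(c(1.5, 2.0, 3.2))

# Factor outcome: random forest is selected.
factor_choice <- choose_train_method(factor(c("a", "b", "a")))

# Override: force random forest even though y is numeric.
forced_choice <- choose_train_method(c(1, 0, 1), force_train_method = "random_forest")
```

Under this sketch, a 0/1 outcome stored as numeric would be treated as a regression target unless it is converted to a factor or the method is forced.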

Value

The correlation between predicted and observed values (t-value, degrees of freedom (df), p-value, alternative hypothesis, confidence interval, correlation coefficient), as well as a tibble of predicted values.
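
The statistics listed above are the same fields that a correlation test in base R reports. The sketch below uses stats::cor.test on simulated data to show where each one lives; it assumes, without confirmation from this page, that the results of textTrain have a comparable layout. The vectors observed and predicted are made up for illustration.

```r
# Simulated data for illustration only; these are not textTrain outputs.
set.seed(1)
observed  <- rnorm(50)
predicted <- 0.6 * observed + rnorm(50, sd = 0.8)

# Pearson correlation test between predicted and observed values.
ct <- stats::cor.test(predicted, observed)

ct$statistic    # t-value
ct$parameter    # degrees of freedom (n - 2)
ct$p.value      # p-value
ct$alternative  # alternative hypothesis
ct$conf.int     # confidence interval for the correlation
ct$estimate     # correlation coefficient
```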

Examples

# Examines how well the embeddings from "harmonytext" can
# predict the numeric variable "hilstotal" in the pre-included
# dataset "Language_based_assessment_data_8".

## Not run: 
trained_model <- textTrain(
  x = word_embeddings_4$texts$harmonytext,
  y = Language_based_assessment_data_8$hilstotal
)

# Examine results (t-value, degrees of freedom (df), p-value,
# alternative hypothesis, confidence interval, correlation coefficient).

trained_model$results

## End(Not run)

[Package text version 1.2.3 Index]
