h2o.tf_idf {h2o}    R Documentation

Computes TF-IDF values for each word in given documents.

Description

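Computes term frequency-inverse document frequency (TF-IDF) values for each word in the given documents. The input is either a frame of whole documents or a frame of individual words, as selected by 'preprocess'.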

Usage

h2o.tf_idf(
  frame,
  document_id_col,
  text_col,
  preprocess = TRUE,
  case_sensitive = TRUE
)

Arguments

frame

frame of documents or words for which TF-IDF values should be computed.

document_id_col

index or name of a column containing document IDs.

text_col

index or name of the column containing documents if 'preprocess = TRUE', or words if 'preprocess = FALSE'.

preprocess

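whether input text data should be pre-processed, that is, whether 'text_col' holds whole documents that still need to be split into words ('TRUE') or already holds one word per row ('FALSE'). Defaults to 'TRUE'.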

case_sensitive

whether input data should be treated as case sensitive. Defaults to 'TRUE'.
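
Details

The two input shapes accepted through 'text_col' can be illustrated with a small base R sketch. It does not call h2o: the data, the column names 'DocID', 'Document' and 'Word', and the whitespace tokenization are all assumptions made for illustration, and the package's own pre-processing may split text differently.

```r
# Hypothetical input with one document per row
# (the shape described for preprocess = TRUE).
docs <- data.frame(
  DocID = c(0, 1, 2),
  Document = c("A B C", "A a a Z", "C c B C"),
  stringsAsFactors = FALSE
)

# Assumption: split on whitespace. h2o's pre-processing may tokenize differently.
tokens <- strsplit(docs$Document, "\\s+")

# One word per row (the shape described for preprocess = FALSE).
words <- data.frame(
  DocID = rep(docs$DocID, times = lengths(tokens)),
  Word = unlist(tokens),
  stringsAsFactors = FALSE
)

# Folding case illustrates the idea behind case_sensitive = FALSE:
# "A" and "a" are then counted as the same word.
words_folded <- transform(words, Word = tolower(Word))

# With a running H2O cluster, the corresponding call would look like
# (not run here):
#   h2o.tf_idf(frame, "DocID", "Document")
```

In this sketch, 'words' distinguishes six words while 'words_folded' distinguishes four, which is the kind of difference 'case_sensitive' controls.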

Value

resulting frame with TF-IDF values. Row format: documentID, word, TF, IDF, TF-IDF
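
Examples

The row format above can be reproduced on toy data with a base R sketch. It assumes the textbook definitions (TF as the raw count of a word in a document, IDF as log(N / DF) with N documents and DF documents containing the word); the formula h2o actually applies may differ, for example by smoothing. The input data and the column name 'TF_IDF' (used in place of 'TF-IDF' to keep a syntactic R name) are hypothetical.

```r
# Hypothetical word-level input: one (document ID, word) pair per row.
words <- data.frame(
  DocID = c(0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2),
  Word = c("A", "B", "C", "A", "a", "a", "Z", "C", "c", "B", "C"),
  stringsAsFactors = FALSE
)

# TF: number of occurrences of each word within each document.
words$TF <- 1
tf <- aggregate(TF ~ DocID + Word, data = words, FUN = sum)

# DF: number of distinct documents that contain each word.
pairs <- unique(words[, c("DocID", "Word")])
df <- aggregate(DocID ~ Word, data = pairs, FUN = length)
names(df)[2] <- "DF"

# Assumption: unsmoothed textbook IDF, not necessarily h2o's formula.
n_docs <- length(unique(words$DocID))
df$IDF <- log(n_docs / df$DF)

# Join TF with IDF and arrange columns as documentID, word, TF, IDF, TF-IDF.
result <- merge(tf, df[, c("Word", "IDF")], by = "Word")
result$TF_IDF <- result$TF * result$IDF
result <- result[order(result$DocID, result$Word),
                 c("DocID", "Word", "TF", "IDF", "TF_IDF")]
print(result)
```

Words that occur in few documents (such as "Z") receive a larger IDF than words shared across documents (such as "A"), so their TF-IDF is higher for the same count.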


[Package h2o version 3.44.0.3 Index]