| EmbeddedText {aifeducation} | R Documentation |
Embedded text
Description
Object of class R6 which stores the text embeddings
generated by an object of class TextEmbeddingModel via the method
embed().
Value
Returns an object of class EmbeddedText. These objects are used
for storing and managing the text embeddings created with objects of class TextEmbeddingModel.
Objects of class EmbeddedText serve as input for classifiers of class
TextEmbeddingClassifierNeuralNet. The main aim of this class is to provide a structured link between
embedding models and classifiers. Because each object saves information on
the text embedding model that created its embeddings, the class ensures that only
embeddings generated with the same embedding model are combined. Furthermore, the stored information allows
classifiers to check whether embeddings from the correct text embedding model are used for
training and prediction.
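The compatibility check described above can be illustrated with a minimal sketch in base R. The lists info_a and info_b are hypothetical stand-ins for the output of get_model_info(); their field names mirror the constructor arguments, which is an assumption, and same_embedding_model() is an illustrative helper, not a function of the package.

```r
# Hypothetical stand-ins for the lists returned by get_model_info().
# The field names mirror the constructor arguments (an assumption).
info_a <- list(model_name = "bert_a", model_version = "1.0", param_seq_length = 512L)
info_b <- list(model_name = "bert_b", model_version = "1.0", param_seq_length = 512L)

# Illustrative helper: TRUE only if both embeddings report identical
# values for every field that is compared.
same_embedding_model <- function(x, y,
                                 fields = c("model_name", "model_version",
                                            "param_seq_length")) {
  identical(x[fields], y[fields])
}

same_embedding_model(info_a, info_a)  # compares an object with itself
same_embedding_model(info_a, info_b)  # model_name differs
```

A classifier could run a comparison of this kind before training or prediction and stop with an error when the two lists differ.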
Public fields
embeddings ('data.frame()')
data.frame containing the text embeddings for all chunks. Documents are in the rows. Embedding dimensions are in the columns.
Methods
Public methods
Method new()
Creates a new object representing text embeddings.
Usage
EmbeddedText$new(
  model_name = NA,
  model_label = NA,
  model_date = NA,
  model_method = NA,
  model_version = NA,
  model_language = NA,
  param_seq_length = NA,
  param_chunks = NULL,
  param_overlap = NULL,
  param_emb_layer_min = NULL,
  param_emb_layer_max = NULL,
  param_emb_pool_type = NULL,
  param_aggregation = NULL,
  embeddings
)
Arguments
model_name
string Name of the model that generates this embedding.
model_label
string Label of the model that generates this embedding.
model_date
string Date when the embedding generating model was created.
model_method
string Method of the underlying embedding model.
model_version
string Version of the model that generated this embedding.
model_language
string Language of the model that generated this embedding.
param_seq_length
int Maximum number of tokens that the generating model processes per chunk.
param_chunks
int Maximum number of chunks supported by the generating model.
param_overlap
int Number of tokens that this model added at the beginning of the sequence for the next chunk.
param_emb_layer_min
int or string determining the first layer to be included in the creation of embeddings.
param_emb_layer_max
int or string determining the last layer to be included in the creation of embeddings.
param_emb_pool_type
string determining the method for pooling the token embeddings within each layer.
param_aggregation
string Aggregation method of the hidden states. Deprecated. Only included for backward compatibility.
embeddings
data.frame containing the text embeddings.
Returns
Returns an object of class EmbeddedText which stores the text embeddings produced by an object of class TextEmbeddingModel. The object serves as input for objects of class TextEmbeddingClassifierNeuralNet.
Method get_model_info()
Method for retrieving information about the model that generated this embedding.
Usage
EmbeddedText$get_model_info()
Returns
list containing all saved information about the underlying
text embedding model.
Method get_model_label()
Method for retrieving the label of the model that generated this embedding.
Usage
EmbeddedText$get_model_label()
Returns
string Label of the corresponding text embedding model.
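The constructor and the two accessor methods can be combined roughly as follows. This is a sketch that assumes the signature shown in the Usage of new() on this page; other package versions may differ. All argument values and the toy data.frame are invented for illustration. In practice such objects are normally produced by the embed() method of a TextEmbeddingModel rather than built by hand.

```r
# Toy embeddings: two documents (rows), three embedding dimensions (columns).
embeddings_df <- data.frame(
  dim_1 = c(0.12, -0.40),
  dim_2 = c(0.55, 0.08),
  dim_3 = c(-0.31, 0.27),
  row.names = c("doc_1", "doc_2")
)

# Runs only when aifeducation is installed. Argument names follow the
# Usage section of new(); all values are illustrative assumptions.
if (requireNamespace("aifeducation", quietly = TRUE)) {
  embedded <- aifeducation::EmbeddedText$new(
    model_name = "example_model",
    model_label = "Example model",
    model_date = "2024-01-01",
    model_method = "bert",
    model_version = "0.0.1",
    model_language = "english",
    param_seq_length = 512L,
    param_chunks = 1L,
    param_overlap = 0L,
    embeddings = embeddings_df
  )

  # Label of the generating model.
  print(embedded$get_model_label())

  # All stored model information as a list.
  str(embedded$get_model_info())
}
```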
Method clone()
The objects of this class are cloneable with this method.
Usage
EmbeddedText$clone(deep = FALSE)
Arguments
deep
Whether to make a deep clone.
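The difference between cloning and plain assignment can be sketched with the R6 package directly. The class Holder below is a hypothetical stand-in with a single embeddings field; it is not the implementation of EmbeddedText.

```r
library(R6)

# Hypothetical stand-in class, not the real EmbeddedText implementation.
Holder <- R6Class(
  "Holder",
  public = list(
    embeddings = NULL,
    initialize = function(embeddings) {
      self$embeddings <- embeddings
    }
  )
)

original <- Holder$new(data.frame(dim_1 = c(0.1, 0.2)))
cloned <- original$clone()  # independent object
alias <- original           # second name for the same object

cloned$embeddings$dim_1[1] <- 9  # changes only the clone
alias$embeddings$dim_1[2] <- 7   # changes the original as well

original$embeddings$dim_1
cloned$embeddings$dim_1
```

For a data.frame field such as the stand-in's embeddings, the default deep = FALSE already yields an independent copy, because data frames are copied on modification. In R6, deep = TRUE matters only when a field is itself an R6 object or another environment.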
See Also
Other Text Embedding:
TextEmbeddingModel,
combine_embeddings()