step_embedding_column {tfdatasets} | R Documentation |
Creates embedding columns
Description
Use this step to create embedding columns from categorical columns.
Usage
step_embedding_column(
spec,
...,
dimension = function(x) {
as.integer(x^0.25)
},
combiner = "mean",
initializer = NULL,
ckpt_to_load_from = NULL,
tensor_name_in_ckpt = NULL,
max_norm = NULL,
trainable = TRUE
)
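As a rough illustration of the default `dimension` rule in the signature above, the sketch below applies the same function to a few vocabulary sizes. The name `default_dimension` and the sizes are hypothetical, chosen only for this example; the function body is copied from the usage.

```r
# Same rule as the default `dimension` argument: fourth root of the
# vocabulary size, truncated to an integer.
default_dimension <- function(x) {
  as.integer(x^0.25)
}

# Hypothetical vocabulary sizes, used only for illustration.
vocab_sizes <- c(4, 100, 1000, 50000)
dims <- vapply(vocab_sizes, default_dimension, integer(1))

print(data.frame(vocab_size = vocab_sizes, dimension = dims))
```

Under this rule the embedding dimension grows slowly with the vocabulary, so small vocabularies get very small embeddings unless `dimension` is set explicitly.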
Arguments
spec |
A feature specification created with |
... |
Comma-separated list of variable names to which the step is applied. Selectors can also be used. |
dimension |
An integer specifying the dimension of the embedding; must be > 0. Can also be a function of the size of the vocabulary. |
combiner |
A string specifying how to reduce if there are multiple entries in
a single row. Currently 'mean', 'sqrtn' and 'sum' are supported, with 'mean' the
default. 'sqrtn' often achieves good accuracy, in particular with bag-of-words
columns. Each of these can be thought of as an example-level normalization on
the column. For more information, see |
initializer |
A variable initializer function to be used in embedding
variable initialization. If not specified, defaults to
|
ckpt_to_load_from |
String representing checkpoint name/pattern from
which to restore column weights. Required if |
tensor_name_in_ckpt |
Name of the Tensor in ckpt_to_load_from from which to
restore the column weights. Required if |
max_norm |
If not |
trainable |
Whether or not the embedding is trainable. Default is |
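To make the `combiner` options above concrete, here is a minimal sketch in plain R of how several embedding vectors in one row could be reduced to a single vector. It assumes every entry has unit weight, and the `combine` helper and the `entries` matrix are hypothetical; the actual reduction is performed inside TensorFlow, not by this code.

```r
# Hypothetical embeddings for three category entries that occur in one row:
# one matrix row per entry, one column per embedding dimension.
entries <- rbind(
  c(1, 2),
  c(3, 4),
  c(5, 6)
)

# Illustrative reduction, assuming unit weights for all entries.
combine <- function(m, combiner = c("mean", "sqrtn", "sum")) {
  combiner <- match.arg(combiner)
  total <- colSums(m)
  switch(combiner,
    sum   = total,                  # plain sum of the entries
    mean  = total / nrow(m),        # sum divided by the number of entries
    sqrtn = total / sqrt(nrow(m))   # sum divided by sqrt of the number of entries
  )
}

print(combine(entries, "sum"))
print(combine(entries, "mean"))
print(combine(entries, "sqrtn"))
```

With unit weights, 'sqrtn' sits between 'sum' and 'mean': it damps the effect of rows with many entries without fully averaging them out.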
Value
A FeatureSpec object.
See Also
steps for a complete list of allowed steps.
Other Feature Spec Functions:
dataset_use_spec(), feature_spec(), fit.FeatureSpec(), step_bucketized_column(), step_categorical_column_with_hash_bucket(), step_categorical_column_with_identity(), step_categorical_column_with_vocabulary_file(), step_categorical_column_with_vocabulary_list(), step_crossed_column(), step_indicator_column(), step_numeric_column(), step_remove_column(), step_shared_embeddings_column(), steps
Examples
## Not run:
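library(tfdatasets)
data(hearts)

# Write the unique values of `thal` to a temporary file
file <- tempfile()
writeLines(unique(hearts$thal), file)

# Convert the data frame to a batched TensorFlow dataset
hearts <- tensor_slices_dataset(hearts) %>% dataset_batch(32)

# Use the formula interface: `thal` becomes a categorical column,
# which is then mapped to a 3-dimensional embedding
spec <- feature_spec(hearts, target ~ thal) %>%
  step_categorical_column_with_vocabulary_list(thal) %>%
  step_embedding_column(thal, dimension = 3)

# Fit the spec and apply it to the dataset
spec_fit <- fit(spec)
final_dataset <- hearts %>% dataset_use_spec(spec_fit)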
## End(Not run)