column_categorical_with_vocabulary_file {tfestimators} | R Documentation |
Construct a Categorical Column with a Vocabulary File
Description
Use this when your inputs are in string or integer format, and you have a
vocabulary file that maps each value to an integer ID. By default,
out-of-vocabulary values are ignored. Use either (but not both) of
num_oov_buckets and default_value to specify how to include
out-of-vocabulary values. For input dictionary features, features[key] is
either a tensor or a sparse tensor object. If it is a tensor object, missing
values can be represented by -1 for int and '' for string. Note that these
values are independent of the default_value argument.
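The ID assignment described above can be sketched in base R. This is an illustrative emulation only, not the tfestimators or TensorFlow implementation. The helpers lookup_ids() and toy_hash() are hypothetical. The sketch assumes that IDs are zero-based positions in the vocabulary, that out-of-vocabulary buckets are numbered directly after the vocabulary IDs, and that ignored values can be shown as NA. toy_hash() is a stand-in for whatever hash the library actually uses.

```r
# Toy stand-in for the library's hash; NOT the real hash function.
toy_hash <- function(x) sum(utf8ToInt(x))

# Hypothetical helper emulating the documented ID assignment.
lookup_ids <- function(values, vocab, num_oov_buckets = 0L, default_value = NULL) {
  if (num_oov_buckets > 0L && !is.null(default_value)) {
    stop("use either num_oov_buckets or default_value, not both")
  }
  # Missing values ('' for strings) are dropped before the lookup,
  # independently of default_value.
  values <- values[values != ""]
  ids <- match(values, vocab) - 1L  # zero-based ID; NA when out of vocabulary
  oov <- is.na(ids)
  if (num_oov_buckets > 0L) {
    # Assumption: bucket IDs start right after the vocabulary IDs.
    hashes <- vapply(values[oov], toy_hash, integer(1), USE.NAMES = FALSE)
    ids[oov] <- as.integer(length(vocab) + hashes %% num_oov_buckets)
  } else if (!is.null(default_value)) {
    ids[oov] <- as.integer(default_value)
  }
  ids  # with neither option set, out-of-vocabulary entries stay NA (ignored)
}

vocab <- c("red", "green", "blue")
inputs <- c("green", "", "purple", "red")

default_ids  <- lookup_ids(inputs, vocab)                       # "purple" is ignored
fallback_ids <- lookup_ids(inputs, vocab, default_value = 0L)   # "purple" gets ID 0
bucket_ids   <- lookup_ids(inputs, vocab, num_oov_buckets = 2L) # "purple" gets ID 3 or 4
```

Passing both num_oov_buckets > 0 and default_value to lookup_ids() stops with an error, mirroring the "either (but not both)" rule.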
Usage
column_categorical_with_vocabulary_file(
  ...,
  vocabulary_file,
  vocabulary_size,
  num_oov_buckets = 0L,
  default_value = NULL,
  dtype = tf$string
)
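As an illustration of the signature above, the following sketch writes a small vocabulary file and wraps the constructor call in a helper. The file contents, the feature name "color", and the helper name make_color_column() are assumptions made for the example. The vocabulary file is assumed to hold one value per line. Calling the helper requires tfestimators together with a working TensorFlow installation.

```r
# Write a three-entry vocabulary file, one value per line (assumed layout).
vocab_path <- tempfile(fileext = ".txt")
writeLines(c("red", "green", "blue"), vocab_path)

# Hypothetical helper: builds the column for a feature named "color".
# One extra bucket is reserved for values not listed in the file.
make_color_column <- function(path) {
  tfestimators::column_categorical_with_vocabulary_file(
    "color",
    vocabulary_file = path,
    vocabulary_size = length(readLines(path)),
    num_oov_buckets = 1L
  )
}
```

Calling make_color_column(vocab_path) would then return the categorical column. To map unknown values to a fixed ID instead, replace num_oov_buckets with default_value, since the two cannot be combined.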
Arguments
... |
Expression(s) identifying input feature(s). Used as the column name and the dictionary key for feature parsing configs, feature tensors, and feature columns. |
vocabulary_file |
The vocabulary file name. |
vocabulary_size |
Number of elements in the vocabulary. This must be no greater than the length of vocabulary_file. |
num_oov_buckets |
Non-negative integer, the number of out-of-vocabulary buckets. All out-of-vocabulary inputs will be assigned IDs in the range [vocabulary_size, vocabulary_size + num_oov_buckets). |
default_value |
The integer ID value to return for out-of-vocabulary feature values, defaults to -1. |
dtype |
The type of features. Only string and integer types are supported. |
Value
A categorical column with a vocabulary file.
Raises
ValueError: vocabulary_file is missing.
ValueError: vocabulary_size is missing or < 1.
ValueError: num_oov_buckets is not a non-negative integer.
ValueError: dtype is neither string nor integer.
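The conditions above can also be checked in R before the column is constructed. The sketch below is a hypothetical pre-flight helper, check_vocab_args(), which is not part of tfestimators. It mirrors the first three conditions and adds the "either (but not both)" rule from the Description. The dtype check is omitted because it depends on TensorFlow types.

```r
# Hypothetical pre-flight check; not part of tfestimators.
check_vocab_args <- function(vocabulary_file, vocabulary_size,
                             num_oov_buckets = 0L, default_value = NULL) {
  if (missing(vocabulary_file) || !nzchar(vocabulary_file)) {
    stop("vocabulary_file is missing")
  }
  if (missing(vocabulary_size) || vocabulary_size < 1) {
    stop("vocabulary_size is missing or < 1")
  }
  if (num_oov_buckets < 0 || num_oov_buckets != round(num_oov_buckets)) {
    stop("num_oov_buckets is not a non-negative integer")
  }
  if (num_oov_buckets > 0 && !is.null(default_value)) {
    stop("num_oov_buckets and default_value cannot both be specified")
  }
  invisible(TRUE)
}

# Valid arguments pass silently and return TRUE.
ok <- check_vocab_args("vocab.txt", vocabulary_size = 3L, num_oov_buckets = 1L)

# tryCatch converts the error into its message so it can be inspected.
msg <- tryCatch(
  check_vocab_args("vocab.txt", vocabulary_size = 0L),
  error = function(e) conditionMessage(e)
)
```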
See Also
Other feature column constructors: column_bucketized(), column_categorical_weighted(), column_categorical_with_hash_bucket(), column_categorical_with_identity(), column_categorical_with_vocabulary_list(), column_crossed(), column_embedding(), column_numeric(), input_layer()