create_funnel_model {aifeducation}    R Documentation
Function for creating a new transformer based on Funnel Transformer
Description
This function creates a transformer configuration based on the Funnel Transformer base architecture, along with a vocabulary based on WordPiece, by using the python libraries 'transformers' and 'tokenizers'.
Usage
create_funnel_model(
ml_framework = aifeducation_config$get_framework(),
model_dir,
vocab_raw_texts = NULL,
vocab_size = 30522,
vocab_do_lower_case = FALSE,
max_position_embeddings = 512,
hidden_size = 768,
target_hidden_size = 64,
block_sizes = c(4, 4, 4),
num_attention_heads = 12,
intermediate_size = 3072,
num_decoder_layers = 2,
pooling_type = "mean",
hidden_act = "gelu",
hidden_dropout_prob = 0.1,
attention_probs_dropout_prob = 0.1,
activation_dropout = 0,
sustain_track = TRUE,
sustain_iso_code = NULL,
sustain_region = NULL,
sustain_interval = 15,
trace = TRUE,
pytorch_safetensors = TRUE
)
Arguments
ml_framework: Framework to use for training and inference. Use ml_framework = "tensorflow" for 'tensorflow' and ml_framework = "pytorch" for 'pytorch'.
model_dir: Path to the directory where the model configuration and vocabulary are saved.
vocab_raw_texts: Vector containing the raw texts for creating the vocabulary.
vocab_size: Size of the vocabulary.
vocab_do_lower_case: TRUE if all words/tokens should be lower case.
max_position_embeddings: Number of maximum position embeddings. This parameter also determines the maximum length of a sequence which can be processed with the model.
hidden_size: Number of neurons in each layer.
target_hidden_size: Number of neurons in the final layer. This parameter determines the dimensionality of the resulting text embedding.
block_sizes: Vector of integers determining the number of blocks and the number of layers within each block.
num_attention_heads: Number of attention heads.
intermediate_size: Number of neurons in the intermediate layer of the attention mechanism.
num_decoder_layers: Number of decoding layers.
pooling_type: "mean" for pooling with mean and "max" for pooling with maximum values.
hidden_act: Name of the activation function.
hidden_dropout_prob: Ratio of dropout.
attention_probs_dropout_prob: Ratio of dropout for attention probabilities.
activation_dropout: Dropout probability between the layers of the feed-forward blocks.
sustain_track: If TRUE, energy consumption is tracked during training via the python library codecarbon.
sustain_iso_code: ISO code (Alpha-3-Code) for the country. This variable must be set if sustainability should be tracked. A list can be found on Wikipedia: https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes
sustain_region: Region within a country. Only available for the USA and Canada. See the documentation of codecarbon for more information: https://mlco2.github.io/codecarbon/parameters.html
sustain_interval: Interval in seconds for measuring power usage.
trace: TRUE if information about the creation process should be printed to the console.
pytorch_safetensors: If TRUE, a pytorch model is saved in safetensors format. If FALSE (or if safetensors is not available), the model is saved in the standard pytorch format (.bin). Only relevant for pytorch models.
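For orientation, a minimal sketch of a call follows. All argument names and defaults come from the Usage section above; the corpus (example_texts) and the target directory ("my_funnel_model") are hypothetical placeholders, and the call assumes a working python environment with 'transformers' and 'tokenizers' already configured.

# A minimal sketch, assuming a configured python environment.
# 'example_texts' and "my_funnel_model" are hypothetical placeholders.
library(aifeducation)

example_texts <- c(
  "The quick brown fox jumps over the lazy dog.",
  "Funnel transformers compress the sequence of hidden states block by block."
)

create_funnel_model(
  ml_framework = "pytorch",         # or "tensorflow"
  model_dir = "my_funnel_model",    # directory for the configuration and vocabulary
  vocab_raw_texts = example_texts,  # raw texts used to build the WordPiece vocabulary
  vocab_size = 30522,
  max_position_embeddings = 512,
  hidden_size = 768,
  target_hidden_size = 64,
  block_sizes = c(4, 4, 4),
  num_attention_heads = 12,
  sustain_track = FALSE             # skip codecarbon tracking in this sketch
)

Setting sustain_track = FALSE avoids the need for sustain_iso_code in this sketch; for tracked runs, supply the Alpha-3 country code instead.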
Value
This function does not return an object. Instead, the configuration and the vocabulary of the new model are saved to disk.
Note
The model uses a configuration with truncate_seq = TRUE to avoid implementation problems with tensorflow.
To train the model, pass the directory of the model to the function train_tune_funnel_model, as sketched below.
The model is created with separate_cls = TRUE, truncate_seq = TRUE, and pool_q_only = TRUE.
This model uses a WordPiece tokenizer like BERT and can be trained with whole word masking. The 'transformers' library may show a warning, which can be ignored.
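The sketch below illustrates this hand-off. Only the function name is confirmed by this page (see See Also); the argument names output_dir, model_dir_path, raw_texts, and whole_word are assumptions based on the pattern of the package's train_tune_* functions and should be checked against ?train_tune_funnel_model.

# Hedged sketch: argument names other than the function name itself are
# assumptions; consult ?train_tune_funnel_model for the exact signature.
train_tune_funnel_model(
  ml_framework = "pytorch",
  output_dir = "my_funnel_model_trained",  # hypothetical directory for the trained model
  model_dir_path = "my_funnel_model",      # directory written by create_funnel_model()
  raw_texts = example_texts,               # placeholder corpus from the sketch above
  whole_word = TRUE,                       # whole word masking, as mentioned in this Note
  sustain_track = FALSE
)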
References
Dai, Z., Lai, G., Yang, Y. & Le, Q. V. (2020). Funnel-Transformer: Filtering out Sequential Redundancy for Efficient Language Processing. doi:10.48550/arXiv.2006.03236
Hugging Face documentation https://huggingface.co/docs/transformers/model_doc/funnel#funnel-transformer
See Also
Other Transformer: create_bert_model(), create_deberta_v2_model(), create_longformer_model(), create_roberta_model(), train_tune_bert_model(), train_tune_deberta_v2_model(), train_tune_funnel_model(), train_tune_longformer_model(), train_tune_roberta_model()