train_tune_bert_model {aifeducation}    R Documentation
Function for training and fine-tuning a BERT model
Description
This function can be used to train or fine-tune a transformer based on BERT architecture with the help of the python libraries 'transformers', 'datasets', and 'tokenizers'.
Usage
train_tune_bert_model(
ml_framework = aifeducation_config$get_framework(),
output_dir,
model_dir_path,
raw_texts,
p_mask = 0.15,
whole_word = TRUE,
val_size = 0.1,
n_epoch = 1,
batch_size = 12,
chunk_size = 250,
full_sequences_only = FALSE,
min_seq_len = 50,
learning_rate = 0.003,
n_workers = 1,
multi_process = FALSE,
sustain_track = TRUE,
sustain_iso_code = NULL,
sustain_region = NULL,
sustain_interval = 15,
trace = TRUE,
keras_trace = 1,
pytorch_trace = 1,
pytorch_safetensors = TRUE
)
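A minimal sketch of a fine-tuning run is shown below. It assumes that a pre-trained BERT model has already been stored in the directory "my_bert_model" and that example_texts is a character vector with one document per element; the directory names and the texts are hypothetical and only illustrate the call.

library(aifeducation)

# Hypothetical character vector with one raw document per element
example_texts <- c(
  "First training document ...",
  "Second training document ..."
)

# Fine-tune the model and save the result to disk (no object is returned)
train_tune_bert_model(
  ml_framework = "pytorch",
  output_dir = "my_bert_model_tuned",
  model_dir_path = "my_bert_model",
  raw_texts = example_texts,
  p_mask = 0.15,
  whole_word = TRUE,
  val_size = 0.1,
  n_epoch = 2,
  batch_size = 12,
  chunk_size = 250,
  sustain_track = FALSE,
  trace = TRUE
)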
Arguments
ml_framework
Framework to use for training: ml_framework = "tensorflow" for 'tensorflow' or ml_framework = "pytorch" for 'pytorch'.
output_dir
Path to the directory where the trained or fine-tuned model should be saved.
model_dir_path
Path to the directory of the model that should be trained or fine-tuned.
raw_texts
Vector containing the raw texts used for training.
p_mask
double Ratio determining the share of tokens to be masked.
whole_word
TRUE if whole word masking should be applied. FALSE for masking of single tokens.
val_size
double Ratio determining the share of the data used for validation.
n_epoch
int Number of training epochs.
batch_size
int Size of the batches.
chunk_size
int Maximum length (in tokens) of the chunks into which the raw texts are split.
full_sequences_only
TRUE if only chunks with a sequence length equal to chunk_size should be used for training.
min_seq_len
int Minimal sequence length of a chunk for inclusion in training. Only relevant if full_sequences_only = FALSE.
learning_rate
double Learning rate for training.
n_workers
int Number of workers.
multi_process
TRUE if multiple processes should be activated.
sustain_track
TRUE if energy consumption should be tracked during training with the python library 'codecarbon'.
sustain_iso_code
ISO code (Alpha-3 code) of the country where the computation takes place. Must be set if sustainability tracking is active. A list of the codes can be found on Wikipedia: https://en.wikipedia.org/wiki/List_of_ISO_3166_country_codes
sustain_region
Region within a country. Only available for the USA and Canada. See the documentation of codecarbon for more information: https://mlco2.github.io/codecarbon/parameters.html
sustain_interval
integer Interval in seconds for measuring power usage.
trace
TRUE if information about the training progress should be printed to the console.
keras_trace
int keras_trace = 0 prints no information about the training process, keras_trace = 1 prints a progress bar, keras_trace = 2 prints one line per epoch.
pytorch_trace
int pytorch_trace = 0 prints no information about the training process, pytorch_trace = 1 prints a progress bar.
pytorch_safetensors
TRUE if a pytorch model should be saved in safetensors format. FALSE if it should be saved in the standard pytorch format (.bin). Only relevant for pytorch models.
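The sketch below illustrates how raw_texts can be assembled from a folder of plain-text files and how sustainability tracking is switched on; the folder "corpus_folder", the model directories, and the ISO code "DEU" are assumptions chosen for illustration only.

# Read every .txt file in a (hypothetical) folder into one element of a character vector
txt_files <- list.files("corpus_folder", pattern = "\\.txt$", full.names = TRUE)
raw_texts <- vapply(
  txt_files,
  function(f) paste(readLines(f, warn = FALSE), collapse = " "),
  character(1)
)

# Track energy consumption with codecarbon during training
train_tune_bert_model(
  ml_framework = "tensorflow",
  output_dir = "bert_tuned",
  model_dir_path = "bert_base",
  raw_texts = raw_texts,
  sustain_track = TRUE,
  sustain_iso_code = "DEU",   # Alpha-3 country code required by codecarbon
  sustain_interval = 15
)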
Value
This function does not return an object. Instead, the trained or fine-tuned model is saved to disk.
Note
This model uses a WordPiece tokenizer like BERT and can be trained with whole word masking. The 'transformers' library may show a warning, which can be ignored.
Pre-trained models that can be fine-tuned with this function are available at https://huggingface.co/.
New models can be created via the function create_bert_model.
Training of the model makes use of dynamic masking, in contrast to the original paper, in which static masking was applied.
References
Devlin, J., Chang, M.‑W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In J. Burstein, C. Doran, & T. Solorio (Eds.), Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 4171–4186). Association for Computational Linguistics. doi:10.18653/v1/N19-1423
Hugging Face documentation https://huggingface.co/docs/transformers/model_doc/bert#transformers.TFBertForMaskedLM
See Also
Other Transformer:
create_bert_model(), create_deberta_v2_model(), create_funnel_model(), create_longformer_model(), create_roberta_model(), train_tune_deberta_v2_model(), train_tune_funnel_model(), train_tune_longformer_model(), train_tune_roberta_model()