train_tune_deberta_v2_model {aifeducation}    R Documentation
Function for training and fine-tuning a DeBERTa-V2 model
Description
This function can be used to train or fine-tune a transformer based on the DeBERTa-V2 architecture with the help of the python libraries 'transformers', 'datasets', and 'tokenizers'.
Usage
train_tune_deberta_v2_model(
  ml_framework = aifeducation_config$get_framework(),
  output_dir,
  model_dir_path,
  raw_texts,
  p_mask = 0.15,
  whole_word = TRUE,
  val_size = 0.1,
  n_epoch = 1,
  batch_size = 12,
  chunk_size = 250,
  full_sequences_only = FALSE,
  min_seq_len = 50,
  learning_rate = 0.03,
  n_workers = 1,
  multi_process = FALSE,
  sustain_track = TRUE,
  sustain_iso_code = NULL,
  sustain_region = NULL,
  sustain_interval = 15,
  trace = TRUE,
  keras_trace = 1,
  pytorch_trace = 1,
  pytorch_safetensors = TRUE
)
Arguments
ml_framework
Machine learning framework used for training the model: ml_framework = "tensorflow" for 'tensorflow' or ml_framework = "pytorch" for 'pytorch'. By default, the framework set in aifeducation_config is used.
output_dir
Path to the directory where the trained or fine-tuned model should be saved.
model_dir_path
Path to the directory containing the model that should be trained or fine-tuned.
raw_texts
Vector containing the raw texts used for training.
p_mask
Double. Ratio determining the share of tokens used for masking.
whole_word
TRUE applies whole word masking; FALSE applies token masking.
val_size
Double. Ratio determining the share of the data used for validation.
n_epoch
Int. Number of training epochs.
batch_size
Int. Size of the batches.
chunk_size
Int. Maximum length of the token sequences (chunks) used for training.
full_sequences_only
TRUE restricts training to chunks whose sequence length equals chunk_size.
min_seq_len
Int. Minimal sequence length a chunk must have to be included in training. Only relevant if full_sequences_only = FALSE.
learning_rate
Double. Learning rate used for training.
n_workers
Int. Number of workers. Only relevant if ml_framework = "tensorflow".
multi_process
TRUE activates multiple processes. Only relevant if ml_framework = "tensorflow".
sustain_track
TRUE tracks the energy consumption of the training with the python library 'codecarbon'.
sustain_iso_code
ISO 3166 Alpha-3 code of the country where the computation takes place. Must be set if sustainability tracking is enabled.
sustain_region
Region within a country. Only available for USA and Canada. See the documentation of codecarbon for more information: https://mlco2.github.io/codecarbon/parameters.html
sustain_interval
Int. Interval in seconds for measuring power usage.
trace
TRUE prints information about the training progress to the console.
keras_trace
Int. keras_trace = 0 prints no information about the training process from keras to the console, keras_trace = 1 prints a progress bar, and keras_trace = 2 prints one line of information per epoch. Only relevant if ml_framework = "tensorflow".
pytorch_trace
Int. pytorch_trace = 0 prints no information about the training process from pytorch to the console and pytorch_trace = 1 prints a progress bar. Only relevant if ml_framework = "pytorch".
pytorch_safetensors
TRUE saves a 'pytorch' model in safetensors format. If FALSE, or if the 'safetensors' library is not available, the model is saved in the standard pytorch format (.bin). Only relevant for pytorch models.
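The following minimal example sketches how a call to train_tune_deberta_v2_model() could look. It is illustrative only: the directory paths, the training texts, and the ISO code are placeholders, and the remaining arguments simply restate the defaults shown under Usage.
# Illustrative sketch only: paths and texts below are placeholders.
library(aifeducation)
# Placeholder raw texts for continued pre-training.
example_texts <- c(
  "Feedback supports learning when it is specific and timely.",
  "Formative assessment helps teachers adapt their instruction."
)
train_tune_deberta_v2_model(
  ml_framework = "pytorch",
  output_dir = "models/deberta_v2_tuned",    # placeholder path
  model_dir_path = "models/deberta_v2_base", # e.g. a model created with create_deberta_v2_model()
  raw_texts = example_texts,
  p_mask = 0.15,
  whole_word = TRUE,
  val_size = 0.1,
  n_epoch = 1,
  batch_size = 12,
  chunk_size = 250,
  sustain_track = TRUE,
  sustain_iso_code = "DEU",                  # placeholder ISO 3166 Alpha-3 code
  trace = TRUE
)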
Value
This function does not return an object. Instead, the trained or fine-tuned model is saved to disk.
Note
Pre-trained models that can be fine-tuned with this function are available at https://huggingface.co/. New models can be created with the function create_deberta_v2_model.
Training of this model makes use of dynamic masking.
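The short R sketch below illustrates what dynamic masking means in practice: the masked positions of a chunk are re-sampled in every epoch, so the model sees a different [MASK] pattern for the same text over the course of training. The token vector and the helper function are invented for this illustration; the actual masking is handled by the underlying python libraries.
# Illustration only, not the package's internal code.
set.seed(42)
tokens <- c("students", "learn", "best", "when", "feedback", "is", "timely")
p_mask <- 0.15
mask_chunk <- function(tokens, p_mask) {
  masked <- tokens
  hide <- runif(length(tokens)) < p_mask  # draw a fresh mask pattern
  masked[hide] <- "[MASK]"
  masked
}
# A new mask pattern is drawn for every epoch.
for (epoch in 1:3) {
  cat("Epoch", epoch, ":", paste(mask_chunk(tokens, p_mask), collapse = " "), "\n")
}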
References
He, P., Liu, X., Gao, J. & Chen, W. (2020). DeBERTa: Decoding-enhanced BERT with Disentangled Attention. doi:10.48550/arXiv.2006.03654
Hugging Face documentation: https://huggingface.co/docs/transformers/model_doc/deberta-v2#debertav2
See Also
Other Transformer:
create_bert_model(),
create_deberta_v2_model(),
create_funnel_model(),
create_longformer_model(),
create_roberta_model(),
train_tune_bert_model(),
train_tune_funnel_model(),
train_tune_longformer_model(),
train_tune_roberta_model()