text_unmask {PsychWordVec} | R Documentation
<Deprecated> Fill in the blank mask(s) in a query (sentence).
Description
Note: This function is deprecated and will no longer be updated, since I have developed the new package FMAT as an integrative toolbox for the Fill-Mask Association Test (FMAT).
Predict the most probable token(s) to fill in the masked blank(s) of a sequence, based on the Python module transformers.
Usage
text_unmask(query, model, targets = NULL, topn = 5)
Arguments
query
A query (sentence/prompt) with masked token(s).
model
Model name at HuggingFace.
targets
Specific target word(s) to be filled in the blank(s).
topn
Number of the most likely predictions to return. Defaults to 5.
Details
Masked language modeling is the task of masking some of the words in a sentence and predicting which words should replace those masks. Such models are useful for getting a statistical understanding of the language on which the model was trained. See https://huggingface.co/tasks/fill-mask for details.
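The function wraps the fill-mask task of the Python transformers module. A minimal R sketch of an equivalent direct call via reticulate (an assumption for illustration; the actual internals of text_unmask may differ, and the model must be downloadable):

```r
library(reticulate)

# import the Python transformers module and build a fill-mask pipeline
transformers <- import("transformers")
fill <- transformers$pipeline("fill-mask", model = "distilbert-base-cased")

# returns the top predictions for the [MASK] position
fill("Beijing is the [MASK] of China.")
```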
Value
A data.table of query results:

query_id - query ID (returned if there are more than one query, to distinguish multiple queries)

mask_id - [MASK] ID (position in sequence; returned if there are more than one [MASK] in the query)

prob - probability of the predicted token in the sequence

token_id - predicted token ID (to replace [MASK])

token - predicted token (to replace [MASK])

sequence - complete sentence with the predicted token
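Because the return value is a data.table, standard data.table syntax applies to the results. A hedged sketch of typical post-processing (assumes the environment has been initialized and the model is available; column names as documented above):

```r
library(data.table)

res <- text_unmask("Beijing is the [MASK] of China.",
                   "distilbert-base-cased")

res[which.max(prob), token]  # extract the single most likely token
res[order(-prob)]            # predictions sorted by probability, descending
```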
Examples
## Not run:
# text_init() # initialize the environment
model = "distilbert-base-cased"
text_unmask("Beijing is the [MASK] of China.", model)
# multiple [MASK]s:
text_unmask("Beijing is the [MASK] [MASK] of China.", model)
# multiple queries:
text_unmask(c("The man worked as a [MASK].",
"The woman worked as a [MASK]."),
model)
# specific targets:
text_unmask("The [MASK] worked as a nurse.", model,
targets=c("man", "woman"))
## End(Not run)