weblmCalculateConditionalProbability {mscsweblm4r} | R Documentation |
Calculates the conditional probability that a word follows a sequence of words.
Description
This function calculates the conditional probability that a particular word will follow a given sequence of words. The input string must be in ASCII format.
Internally, this function invokes the Microsoft Cognitive Services Web Language Model REST API documented at https://www.microsoft.com/cognitive-services/en-us/web-language-model-api/documentation.
You MUST have a valid Microsoft Cognitive Services account and an API key for this function to work properly. See https://www.microsoft.com/cognitive-services/en-us/pricing for details.
Usage
weblmCalculateConditionalProbability(precedingWords, continuations,
  modelToUse = "body", orderOfNgram = 5L)
Arguments
precedingWords |
(character) Character string for which to calculate continuation probabilities. Must be in ASCII format. |
continuations |
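(character vector) Vector of candidate words following precedingWords, for which the conditional probabilities are calculated. Must be in ASCII format. |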
modelToUse |
(character) Which language model to use, supported values: "title", "anchor", "query", or "body" (optional, default: "body") |
orderOfNgram |
(integer) Which order of N-gram to use, supported values: 1L, 2L, 3L, 4L, or 5L (optional, default: 5L) |
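The constraints listed above can be checked locally before a request is sent. The following is a minimal sketch, not part of mscsweblm4r: the helper name checkWeblmArgs is hypothetical, and the package may perform its own validation differently.

```r
# Hypothetical helper (NOT part of mscsweblm4r): validates arguments against
# the constraints documented in the Arguments section.
checkWeblmArgs <- function(precedingWords, continuations,
                           modelToUse = "body", orderOfNgram = 5L) {
  # TRUE only if every character of every string is a 7-bit ASCII code point
  isAscii <- function(x) {
    all(vapply(x, function(s) all(utf8ToInt(enc2utf8(s)) < 128L), logical(1)))
  }
  stopifnot(
    is.character(precedingWords), length(precedingWords) == 1L,
    is.character(continuations), length(continuations) >= 1L,
    isAscii(precedingWords), isAscii(continuations),
    length(modelToUse) == 1L,
    modelToUse %in% c("title", "anchor", "query", "body"),
    length(orderOfNgram) == 1L,
    orderOfNgram %in% 1:5
  )
  invisible(TRUE)
}

# Passes silently for the values used in the Examples section
checkWeblmArgs("hello world wide", c("web", "range", "open"), "title", 4L)
```

Failing early with stopifnot() avoids spending an API call on a request that the documented constraints already rule out.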
Value
An S3 object of the class weblm. The results are stored in the results dataframe inside this object. The dataframe contains the continuation words and their log(probability).
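Because the probability column holds logarithms, it can help to rank the continuations and convert them to a linear scale. The sketch below uses a stand-in data frame copied from the example output further down, so it runs without an API key. The base of the logarithm is not stated on this page; base 10 is an assumption here and should be verified against the API documentation.

```r
# Stand-in for conditionalProbabilities$results, copied from the example output
results <- data.frame(
  words = rep("hello world wide", 3),
  word = c("web", "range", "open"),
  probability = c(-0.32, -2.403, -2.97),
  stringsAsFactors = FALSE
)

# ASSUMPTION: log base 10. This page only says "log(probability)".
logBase <- 10

# Convert log probabilities to linear-scale probabilities
results$linear <- logBase ^ results$probability

# Most likely continuation first
results[order(results$probability, decreasing = TRUE), ]
```

The ranking does not depend on the assumed base, since any logarithm with a base greater than 1 preserves order; only the linear column does.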
Author(s)
Phil Ferriere pferriere@hotmail.com
Examples
## Not run:
tryCatch({
  # Calculate the conditional probability that a particular word will follow
  # a given sequence of words
  conditionalProbabilities <- weblmCalculateConditionalProbability(
    precedingWords = "hello world wide",       # ASCII only
    continuations = c("web", "range", "open"), # ASCII only
    modelToUse = "title",                      # "title"|"anchor"|"query"|"body"(default)
    orderOfNgram = 4L                          # 1L|2L|3L|4L|5L(default)
  )
# Class and structure of conditionalProbabilities
class(conditionalProbabilities)
#> [1] "weblm"
str(conditionalProbabilities, max.level = 1)
#> List of 3
#> $ results:'data.frame': 3 obs. of 3 variables:
#> $ json : chr "{"results":[{"words":"hello world wide","word":"web", __truncated__ }]}
#> $ request:List of 7
#> ..- attr(*, "class")= chr "request"
#> - attr(*, "class")= chr "weblm"
# Print results (pandoc.table() comes from the pander package)
pander::pandoc.table(conditionalProbabilities$results)
#> -------------------------------------
#> words word probability
#> ---------------- ------ -------------
#> hello world wide web -0.32
#>
#> hello world wide range -2.403
#>
#> hello world wide open -2.97
#> -------------------------------------
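}, error = function(err) {
  # Print the message of the caught condition; geterrmessage() is not
  # updated for errors that tryCatch() intercepts
  message(conditionMessage(err))
})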
## End(Not run)