weblmCalculateJointProbability {mscsweblm4r} | R Documentation
Calculates the joint probability that a sequence of words will appear together.
Description
This function calculates the joint probability that a particular sequence of words will appear together. The input string must be in ASCII format.
Internally, this function invokes the Microsoft Cognitive Services Web Language Model REST API documented at https://www.microsoft.com/cognitive-services/en-us/web-language-model-api/documentation.
You MUST have a valid Microsoft Cognitive Services account and an API key for this function to work properly. See https://www.microsoft.com/cognitive-services/en-us/pricing for details.
Usage
weblmCalculateJointProbability(inputWords, modelToUse = "body",
orderOfNgram = 5L)
Arguments
inputWords | (character vector) Vector of character strings for which to calculate the joint probability. Must be in ASCII format.
modelToUse | (character) Which language model to use, supported values: "title", "anchor", "query", or "body" (optional, default: "body")
orderOfNgram | (integer) Which order of N-gram to use, supported values: 1L, 2L, 3L, 4L, or 5L (optional, default: 5L)
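The argument constraints listed above can be checked locally before a request is sent. The following is a minimal sketch only: checkJointProbabilityArgs is a hypothetical helper, not part of mscsweblm4r, and the package may perform its own validation differently.

```r
# Hypothetical helper (NOT part of mscsweblm4r): validate the documented
# argument constraints locally before spending an API call on them.
checkJointProbabilityArgs <- function(inputWords,
                                      modelToUse = "body",
                                      orderOfNgram = 5L) {
  stopifnot(is.character(inputWords), length(inputWords) > 0L,
            !anyNA(inputWords))

  # A string is ASCII when every code point is below 128.
  isAscii <- vapply(
    inputWords,
    function(s) all(utf8ToInt(enc2utf8(s)) < 128L),
    logical(1),
    USE.NAMES = FALSE
  )
  if (!all(isAscii)) {
    stop("Non-ASCII input: ", paste(inputWords[!isAscii], collapse = ", "))
  }

  # match.arg() raises an error for values outside the documented choices.
  modelToUse <- match.arg(modelToUse, c("title", "anchor", "query", "body"))

  # The order must be a single integer between 1L and 5L.
  if (!is.integer(orderOfNgram) || length(orderOfNgram) != 1L ||
      !orderOfNgram %in% 1:5) {
    stop("orderOfNgram must be one of 1L, 2L, 3L, 4L, or 5L")
  }

  invisible(list(inputWords = inputWords, modelToUse = modelToUse,
                 orderOfNgram = orderOfNgram))
}

# Passes silently for valid input; stops with a message otherwise.
checkJointProbabilityArgs(c("where is", "San Francisco"), "query", 4L)
```

Note that match.arg() also accepts unambiguous partial matches, so a stricter check may be preferable if exact values are required.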
Value
An S3 object of the class weblm. The results are stored in the results dataframe inside this object. The dataframe contains the word sequences and their log(probability).
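Because the probability column holds log values, it can help to see how they might be interpreted. The sketch below rebuilds a small data frame by hand from three rows of the example output on this page, so no API call is made. The base of the logarithm is an assumption (base 10 here); this page does not state it, so adjust logBase if the service documents a different one.

```r
# Values copied from the example output on this page (no API call is made).
results <- data.frame(
  words = c("where", "is", "where is"),
  probability = c(-3.378, -2.607, -3.961),
  stringsAsFactors = FALSE
)

# ASSUMPTION: the logs are base 10. This page does not state the base.
logBase <- 10
results$linear <- logBase^results$probability

# Log probability the pair would have if the two words were independent:
# logs add where plain probabilities multiply.
independent <- sum(results$probability[results$words %in% c("where", "is")])
joint <- results$probability[results$words == "where is"]

# TRUE when the sequence is more likely than its words taken independently.
joint > independent
```

With these values the independent sum is about -5.985 while the reported joint value is -3.961, so the sequence "where is" is scored as more likely than the two words occurring independently. That comparison holds regardless of the logarithm base.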
Author(s)
Phil Ferriere pferriere@hotmail.com
Examples
## Not run:
tryCatch({
# Calculate the joint probability that a particular sequence of words will appear together
jointProbabilities <- weblmCalculateJointProbability(
inputWords = c("where", "is", "San", "Francisco", "where is",
"San Francisco", "where is San Francisco"), # ASCII only
modelToUse = "query", # "title"|"anchor"|"query"|"body"(default)
orderOfNgram = 4L # 1L|2L|3L|4L|5L(default)
)
# Class and structure of jointProbabilities
class(jointProbabilities)
#> [1] "weblm"
str(jointProbabilities, max.level = 1)
#> List of 3
#> $ results:'data.frame': 7 obs. of 2 variables:
#> $ json : chr "{"results":[{"words":"where","probability":-3.378}, __truncated__ ]}
#> $ request:List of 7
#> ..- attr(*, "class")= chr "request"
#> - attr(*, "class")= chr "weblm"
# Print results
pander::pandoc.table(jointProbabilities$results) # requires the pander package
#> ------------------------------------
#> words probability
#> ---------------------- -------------
#> where -3.378
#>
#> is -2.607
#>
#> san -3.292
#>
#> francisco -4.051
#>
#> where is -3.961
#>
#> san francisco -4.086
#>
#> where is san francisco -7.998
#> ------------------------------------
}, error = function(err) {
# Print the message carried by the caught condition
message(conditionMessage(err))
})
## End(Not run)