R: Compute LexRanks from pairwise sentence similarities

lexRankFromSimil {lexRankr}

R Documentation

Compute LexRanks from pairwise sentence similarities

Description

Compute LexRanks from sentence pair similarities using the PageRank algorithm or degree centrality. The methods used to compute lexRank are discussed in "LexRank: Graph-based Lexical Centrality as Salience in Text Summarization."

Usage

lexRankFromSimil(s1, s2, simil, threshold = 0.2, n = 3,
  returnTies = TRUE, usePageRank = TRUE, damping = 0.85,
  continuous = FALSE)

Arguments

`s1`	A character vector of sentence IDs corresponding to the `s2` and `simil` arguments
`s2`	A character vector of sentence IDs corresponding to the `s1` and `simil` arguments
`simil`	A numeric vector of similarity values that represents the similarity between the sentences represented by the IDs in `s1` and `s2`.
`threshold`	The minimum `simil` value a sentence pair must have to be represented in the graph from which lexRank is calculated. Defaults to `0.2`.
`n`	The number of sentences to return as the extractive summary. The function will return the top `n` lexRanked sentences. See `returnTies` for handling ties in lexRank.
`returnTies`	`TRUE` or `FALSE` indicating whether or not to return more than `n` sentence IDs if there is a tie in lexRank. If `TRUE`, the returned number of sentences will not be limited to `n`; every sentence with a top `n` score will be returned. If `FALSE`, the returned number of sentences will be `<=n`. Defaults to `TRUE`.
`usePageRank`	`TRUE` or `FALSE` indicating whether or not to use the PageRank algorithm for ranking sentences. If `FALSE`, a sentence's unweighted degree centrality will be used as the rank. Defaults to `TRUE`.
`damping`	The damping factor to be passed to the PageRank algorithm. Ignored if `usePageRank` is `FALSE`. Defaults to `0.85`.
`continuous`	`TRUE` or `FALSE` indicating whether or not to use continuous LexRank. Only applies if `usePageRank==TRUE`. If `TRUE`, `threshold` will be ignored and lexRank will be computed using a weighted graph representation of the sentences. Defaults to `FALSE`.
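
To make the interaction between `threshold` and `usePageRank = FALSE` concrete, here is a minimal base R sketch of degree centrality on a thresholded similarity graph. `degreeFromSimil` is a hypothetical helper written for illustration, not part of lexRankr, and it assumes a strict `>` comparison against the threshold; whether the package treats the boundary value as inclusive is not stated here.

```r
# Hypothetical helper, not part of lexRankr: degree centrality on a
# thresholded, undirected similarity graph.
degreeFromSimil <- function(s1, s2, simil, threshold = 0.2) {
  # Keep only pairs whose similarity clears the threshold
  # (assumption: strict ">"; the boundary case is not documented above).
  keep <- simil > threshold
  if (!any(keep)) stop("no sentence pairs above threshold")

  # Each retained pair is an undirected edge, so both endpoints gain one degree.
  endpoints <- c(s1[keep], s2[keep])
  degree <- table(endpoints)

  out <- data.frame(sentenceId = names(degree),
                    value = as.numeric(degree),
                    stringsAsFactors = FALSE)
  # Sort in descending order of degree.
  out[order(-out$value), , drop = FALSE]
}

degreeFromSimil(
  s1 = c("d1_1", "d1_1", "d1_2"),
  s2 = c("d1_2", "d2_1", "d2_1"),
  simil = c(0.01, 0.03, 0.5)
)
```

With the default threshold only the `d1_2`/`d2_1` pair survives, so those two sentences each end up with degree 1 and `d1_1` drops out of the graph entirely.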

Value

A two-column data frame with columns `sentenceId` and `value`. `sentenceId` contains the IDs of the top `n` sentences in descending order by `value`. `value` contains the PageRank score (if `usePageRank==TRUE`) or degree centrality (if `usePageRank==FALSE`).
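
The effect of `returnTies` on the number of returned rows can be illustrated with a small base R sketch. `topN` is a hypothetical helper, not the package's code; it assumes ties are resolved with competition ranking (`ties.method = "min"`), and for `returnTies = FALSE` it simply keeps the first `n` rows in sorted order, which may differ from how lexRankr breaks ties.

```r
# Hypothetical helper, not part of lexRankr: pick the top n scores,
# optionally keeping every sentence tied into the top n.
topN <- function(sentenceId, value, n = 3, returnTies = TRUE) {
  # Competition ranking: tied scores share the best (smallest) rank.
  rnk <- rank(-value, ties.method = "min")
  ord <- order(rnk)
  out <- data.frame(sentenceId = sentenceId[ord],
                    value = value[ord],
                    stringsAsFactors = FALSE)
  rnk <- rnk[ord]
  if (returnTies) {
    out[rnk <= n, , drop = FALSE]  # may exceed n rows when scores tie
  } else {
    head(out, n)                   # at most n rows
  }
}

ids <- c("s1", "s2", "s3", "s4", "s5")
scores <- c(0.4, 0.3, 0.2, 0.2, 0.1)

topN(ids, scores, n = 3, returnTies = TRUE)   # 4 rows: s3 and s4 tie for third
topN(ids, scores, n = 3, returnTies = FALSE)  # 3 rows
```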

References

http://www.cs.cmu.edu/afs/cs/project/jair/pub/volume22/erkan04a-html/erkan04a.html

Examples

lexRankFromSimil(s1=c("d1_1","d1_1","d1_2"), s2=c("d1_2","d2_1","d2_1"), simil=c(.01,.03,.5))
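
The same inputs can also be run through the two non-default modes described under Arguments. The sketch below uses only the arguments listed in Usage; the `requireNamespace` guard and the variable names `byDegree` and `byWeight` are additions for illustration, and no output is shown because it was not verified here.

```r
# Guard so the sketch is skipped when lexRankr is not installed.
if (requireNamespace("lexRankr", quietly = TRUE)) {
  s1 <- c("d1_1", "d1_1", "d1_2")
  s2 <- c("d1_2", "d2_1", "d2_1")
  simil <- c(0.01, 0.03, 0.5)

  # Rank by degree centrality on the thresholded graph instead of PageRank.
  byDegree <- lexRankr::lexRankFromSimil(s1, s2, simil, usePageRank = FALSE)

  # Continuous LexRank: threshold is ignored and similarities act as edge weights.
  byWeight <- lexRankr::lexRankFromSimil(s1, s2, simil, continuous = TRUE)

  print(byDegree)
  print(byWeight)
}
```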

[Package lexRankr version 0.5.2 Index]