R: Removes Arabic prefixes and suffixes

doStemming {arabicStemR}

R Documentation

Removes Arabic prefixes and suffixes

Description

Removes Arabic prefixes and suffixes from the words in a text, and returns a list matching the original words to their stemmed forms. By default, different forms of Allah are not stemmed.

Usage

doStemming(texts, dontstem = c('\u0627\u0644\u0644\u0647', '\u0644\u0644\u0647'))

Arguments

`texts`	The original texts.
`dontstem`	Words that should not be stemmed. By default, different forms of Allah.

Value

doStemming returns a named list with the following elements:

`text`	The stemmed text
`stemmedWords`	A list matching the words and the stemmed words.

Author(s)

Rich Nielsen

Examples

## Create string with Arabic characters
x <- '\u0627\u0644\u0644\u063a\u0629 \u0627\u0644\u0639\u0631\u0628\u064a\u0629 \u062c\u0645\u064a\u0644\u0629 \u062c\u062f\u0627'

## Remove prefixes and suffixes
y <- doStemming(x)
y$text
y$stemmedWords
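
## A minimal sketch of the dontstem argument, reusing x from above.
## Assumption: the third entry in keep (the last word of x) is an
## illustrative addition, not a package default. The first two entries
## repeat the defaults shown under Usage so that they stay protected.
keep <- c('\u0627\u0644\u0644\u0647', '\u0644\u0644\u0647', '\u062c\u062f\u0627')
z <- doStemming(x, dontstem = keep)
z$text
z$stemmedWords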

[Package arabicStemR version 1.3 Index]