remove.punct {word.alignment}    R Documentation

Tokenizing and Removing Punctuation Marks

Description

Splits a given text into separate words and removes punctuation marks.

Usage

remove.punct(text)

Arguments

text

an object, typically a character string, containing the text to be tokenized.

Details

This function also treats each number as a separate word.

Note that this function removes a dot (".") only when it occurs on its own at the end of a sentence. It does not remove dashes or hyphens, because words containing these punctuation marks are assumed to be single words.
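
The behaviour described above can be approximated in base R. The following is only a minimal sketch under stated assumptions, not the package's implementation: the helper name sketch_remove_punct is hypothetical, and the exact rules applied by remove.punct (for example, how it handles dots inside a token) may differ.

```r
# Hypothetical helper for illustration only; NOT the code used by remove.punct.
sketch_remove_punct <- function(text) {
  # Split the text on runs of whitespace.
  tokens <- unlist(strsplit(text, "[[:space:]]+"))
  # Assumption: a dot is dropped only when it ends a token,
  # which approximates "a dot at the end of the sentence".
  tokens <- sub("\\.$", "", tokens)
  # Remove every other punctuation mark, but keep hyphens (and inner dots)
  # so that hyphenated words stay as one word.
  tokens <- gsub("[^[:alnum:].-]", "", tokens)
  # Discard empty strings left over from leading whitespace or lone punctuation.
  tokens[nzchar(tokens)]
}

sketch_remove_punct("This is an  example-based MT!")
sketch_remove_punct("It costs 20 dollars.")
```

Under these assumptions, the first call keeps "example-based" as a single word and strips the exclamation mark, and the second call returns the number 20 as a word of its own.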

Value

A vector of character strings, one element per word.

Author(s)

Neda Daneshgar and Majid Sarmad

Examples

library(word.alignment)
x <- "This is an  example-based MT!"
remove.punct(x)

[Package word.alignment version 1.1 Index]