nfirst2lower {word.alignment}R Documentation

Make a String's First n Characters Lowercase

Description

Converts uppercase to lowercase letters for the first n characters of a character string.

Usage

nfirst2lower(x, n = 1, first = TRUE, second = FALSE)

Arguments

x

a character string.

n

an integer. Number of characters that we want to convert.

first

logical. If TRUE, it converts the n first characters into lowercase.

second

logical. If TRUE, it checks if the second letter of x is uppercase, the whole word will be converted to lower.

Details

It is a function to convert some uppercase letters into lowercase for which words with uppercase second letter. If tolower in base R is used, it will be sometimes created a problem for proper nouns. Because, as we know, a name or proper noun starts with capital letter and we do not want to convert them into lowercase. But sometimes there are some words which are not a name or proper noun and displayed in capital letters. These words are the target of this function.

If we have a text of several sentences and we want to convert the first n letters of every sentence to lowercase, separately. We have to split text to sentences, furthermore we should consider first=TRUE and apply the function for each sentence (see the examples below).

If we have a list, it works fine.

Value

A character string.

Note

Because of all sentences begin with uppercase letters, first=TRUE is considered as a default. But, if the second character of a word be capital, it is usually concluded that all its characters are capital. In this case, you can consider second=TRUE. Of course, there are some exceptations in these cases that they can be ignored (see the examples below).

In general, if there are not a lot of proper nouns in your text string, we suggest you to use tolower in base R. As an ability of this function, lower is considered as a third argument.

Author(s)

Neda Daneshgar and Majid Sarmad.

See Also

tolower

Examples

# x is a list

x=list('W-A for an English-Persian Parallel Corpus (Mizan).','ALIGNMENT is a link between words.')

nfirst2lower(x, n=8) ## nfirst2lower(x, n=8) is not a list

y='MT is the automatic translation. SMT is one of the methods of MT.'

nfirst2lower(y) # only run for the first sentence

u1=unlist(strsplit(y, ". ", fixed = TRUE))
sapply(1:length(u1),function(x)nfirst2lower(u1[x])) ## run for all sentences

h = 'It is a METHOD for this function.'
nfirst2lower (h, second = TRUE) #only run for the first word

h1 = strsplit(h, ' ')[[1]]
nfirst2lower(h1, second = TRUE) # run for all words

[Package word.alignment version 1.1 Index]