| filter_segment {jiebaR} | R Documentation | 
Filter segmentation result
Description
This function removes the specified words from a segmentation result.
Usage
filter_segment(input, filter_words, unit = 50)
Arguments
| input | a string vector, typically a segmentation result. | 
| filter_words | a string vector of words to be removed. | 
| unit | the number of words combined into each regular expression; the default is 50. A long list of words forms a large regular expression, which may or may not be accepted: the POSIX standard only requires support for patterns of up to 256 bytes. The words are therefore split into groups of unit words, and each group is matched separately. | 
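The role of unit can be illustrated with a minimal base R sketch. This is not the jiebaR implementation: the helper name filter_in_units is hypothetical, and the sketch assumes that the words are matched exactly and contain no regular expression metacharacters.

```r
# Hypothetical helper, not part of jiebaR: drops every element of `input`
# that exactly matches one of `filter_words`, processing `unit` words at a time.
filter_in_units <- function(input, filter_words, unit = 50) {
  if (length(filter_words) == 0) return(input)
  # Split the words to remove into groups of at most `unit` words.
  groups <- split(filter_words, ceiling(seq_along(filter_words) / unit))
  for (words in groups) {
    # Build one anchored alternation per group, e.g. "^(abc|def)$".
    # Assumption: the words contain no regex metacharacters.
    pattern <- paste0("^(", paste(words, collapse = "|"), ")$")
    # Keep only the elements that do not match this group's pattern.
    input <- input[!grepl(pattern, input)]
  }
  input
}

filter_in_units(c("abc", "def", " ", "."), c("abc"))
```

A smaller unit yields more, shorter patterns, so each pattern is more likely to stay within the limits of the regular expression engine, at the cost of more passes over input.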
Examples
filter_segment(c("abc","def"," ","."), c("abc"))
[Package jiebaR version 0.11 Index]