csv2txt {chinese.misc}    R Documentation

Write Texts in CSV into Many TXT/RTF Files

Description

The function writes the texts in a given .csv file into separate .txt/.rtf files and gives each file a name.

Usage

csv2txt(
  csv,
  folder,
  which,
  header = TRUE,
  na_in_csv = c(NA, "", " ", "?", "NA", "999"),
  na_in_txt = " ",
  name_col = NULL,
  ext = "txt"
)

Arguments

csv

a .csv file. One of its columns contains texts to be written.

folder

the name of the folder that stores the .txt/.rtf files created by the function. The folder may already exist. If it does not exist, the function tries to create it recursively; if it cannot be created, an error is raised. See dir.create. Note: a name that contains no punctuation is preferred.

which

a number: which column of the csv file contains texts.

header

should the .csv file be read with its first row as header? This argument is passed to read.csv. Default is TRUE.

na_in_csv

a character vector indicating which cell contents in the .csv file should be treated as NA. The default values are "", " ", "?", "NA", "999". You can specify other values, but the default values are always treated as NA in addition to whatever you specify. If you do not provide a character vector, the default values are used.

na_in_txt

a character vector of length 1 specifying what to write into a file if a .csv cell is NA. The default is " " (a space).

name_col

a number of length 1 indicating which column of your data should be used for file names. If it is NULL (default), a unique number is given to each file; see Details. If a cell is treated as NA, it is converted to ""; if it is too long, only the first 90 characters are used; any run of one or more blanks and punctuation marks is replaced by " " (a space).

ext

the extension of files to be written. Should be "txt", "rtf" or "". If it is not one of the three, it is set to "".
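
The argument rules above can be sketched in base R. This is only an illustration of the documented behavior, not the package's own code: the helper names merge_na, normalize_ext and clean_name are hypothetical, and the order in which truncation and replacement happen is an assumption.

```r
# Hypothetical helpers that mimic the documented argument rules.
default_na <- c(NA, "", " ", "?", "NA", "999")

# The defaults always apply; user values are added to them.
merge_na <- function(user_na = NULL) {
  if (!is.character(user_na)) return(default_na)
  union(default_na, user_na)
}

# Anything other than "txt", "rtf" or "" falls back to "".
normalize_ext <- function(ext) {
  ok <- is.character(ext) && length(ext) == 1 && ext %in% c("txt", "rtf", "")
  if (ok) ext else ""
}

# NA becomes "", long names are cut to 90 characters (assumed to happen
# before replacement), and runs of blanks/punctuation become one space.
clean_name <- function(x, max_chars = 90) {
  x[is.na(x)] <- ""
  x <- substr(x, 1, max_chars)
  gsub("[[:space:][:punct:]]+", " ", x)
}

clean_name(c("a,,b  c", NA))
normalize_ext("doc")
```

Note that cleaning can map different labels to the same string (for example "a,b" and "a b"), which is one reason the function also adds a unique number to each file name.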

Details

When writing .txt/.rtf files, the function gives each file a unique number as part of its file name. The mechanism is as follows: suppose you have 1234 files; as this number has four digits, the numbers 0001, 0002,...0012,...0300,...1234 are assigned rather than 1, 2,...12,...300,...1234. There are several reasons for this. First, if name_col is NULL, this procedure assigns names automatically. Second, the column you specify may contain duplicate names. Third, even if the column does not contain duplicate names, the process by which the function modifies the names to make them valid may itself produce duplicates. Fourth, numbers padded to the same number of digits sort correctly in any software.
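
A minimal sketch of the zero-padded numbering described above, using base R only. The helper pad_ids is hypothetical, and the way the number is combined with a label in file_names is an assumption for illustration, not necessarily the layout the function uses.

```r
# Hypothetical helper: numbers 1..n padded to the digit count of n.
pad_ids <- function(n) {
  width <- nchar(sprintf("%d", as.integer(n)))
  sprintf(paste0("%0", width, "d"), seq_len(n))
}

ids <- pad_ids(1234)
ids[c(1, 12, 300, 1234)]

# Duplicate labels still give distinct file names once a number is attached
# (the "number label.ext" layout is assumed for this sketch).
labels <- c("report", "report", "notes")
file_names <- paste0(pad_ids(length(labels)), " ", labels, ".txt")
file_names
```

Because every number has the same width, sorting the names as plain strings gives the same order as sorting them numerically.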

Value

Nothing is returned; the .txt/.rtf files are written into the folder.
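
To make the side effect concrete, here is a rough base R sketch of the same idea: read a .csv file, treat some cells as NA, and write one file per row into a folder that is created recursively. It does not call csv2txt and is not its implementation; the file names, the temporary paths and the demo data are all assumptions.

```r
# Create a small demo csv in a temporary directory.
tmp <- tempfile("csv2txt_demo_")
dir.create(tmp)
csv_path <- file.path(tmp, "demo.csv")
demo <- data.frame(id = c("a", "b", "c"),
                   text = c("first text", NA, "third text"),
                   stringsAsFactors = FALSE)
write.csv(demo, csv_path, row.names = FALSE)

# Read it back, treating the listed strings as NA.
dat <- read.csv(csv_path, header = TRUE, stringsAsFactors = FALSE,
                na.strings = c("", " ", "?", "NA", "999"))

# Create the (nested) output folder and write one file per row.
out_dir <- file.path(tmp, "out", "texts")
dir.create(out_dir, recursive = TRUE)
texts <- dat[[2]]            # column 2 holds the texts
texts[is.na(texts)] <- " "   # what to write for NA cells
for (i in seq_along(texts)) {
  writeLines(texts[i], file.path(out_dir, sprintf("%d.txt", i)))
}
written <- list.files(out_dir)
written
```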

Examples

## Not run: 
# First, we create a csv file
x1 <- file.path(find.package("base"), "CITATION")
x2 <- file.path(find.package("base"), "DESCRIPTION")
txt2csv(x1, x2, must_txt = FALSE, csv = "x1x2csv.csv")
# Now try to write files
wd <- getwd()
wd <- gsub("/$|\\\\$", "", wd)
f <- paste(wd, "x1x2csv", sep="/")
csv2txt(csv = "x1x2csv.csv", folder = f, which = 3, ext = "")

## End(Not run)

[Package chinese.misc version 0.2.3 Index]