R: Flexibly load from a text or binary file, accepts multiple...

reader {reader}

R Documentation

Flexibly load from a text or binary file, accepts multiple file formats.

Description

Uses the file extension to distinguish between binary, csv, and other text formats, then tries to determine automatically the other parameters needed to read the file. It attempts to detect the delimiter, whether there is a header row of column names, and whether the first column should be used as rownames or left as a data column. Internal calls to standard file reading functions use 'stringsAsFactors=FALSE'.
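The general idea of extension-based dispatch with delimiter detection can be sketched in base R. This is only an illustration under stated assumptions, not the code 'reader' actually uses: the helper names guess_delim() and read_by_extension() are hypothetical, the candidate delimiter list is arbitrary, and a header row is assumed for delimited text.

```r
# Hypothetical helper: return the first candidate delimiter that splits each
# of the first 'n' lines into the same number (> 1) of fields.
guess_delim <- function(path, candidates = c("\t", ",", "|", ";"), n = 5) {
  lines <- readLines(path, n = n, warn = FALSE)
  for (d in candidates) {
    counts <- lengths(strsplit(lines, d, fixed = TRUE))
    if (length(unique(counts)) == 1 && counts[1] > 1) return(d)
  }
  NA_character_  # no consistent delimiter found
}

# Hypothetical dispatcher: choose a base-R reading function from the extension.
read_by_extension <- function(path) {
  ext <- tolower(tools::file_ext(path))
  if (ext %in% c("rda", "rdata")) {
    env <- new.env()
    load(path, envir = env)
    return(mget(ls(env), envir = env))  # named list of the saved objects
  }
  if (ext == "csv") return(read.csv(path, stringsAsFactors = FALSE))
  d <- guess_delim(path)
  if (is.na(d)) return(readLines(path))  # fall back to a character vector
  # Assumption: delimited text files carry a header row
  read.table(path, sep = d, header = TRUE, stringsAsFactors = FALSE)
}
```

The real function goes further than this sketch: it also tests for the presence of a header and decides whether the first column holds rownames.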

Usage

reader(fn, dir = "", want.type = NULL, def = "\t", force.read = TRUE,
  header = NA, h.test.p = 0.05, quiet = TRUE, treatas = NULL,
  override = FALSE, more.types = NULL, auto.vec = TRUE, one.byte = TRUE,
  ...)

Arguments

`fn`	filename (with or without path if dir is specified)
`dir`	optional directory if separate path/filename is preferred
`want.type`	if loading a binary file with multiple objects, specify here the is() type of object you are trying to load
`def`	the default delimiter to try first
`force.read`	attempt to read the file even if the file type looks unsupported
`header`	the presence of a header is autodetected by default (NA); set to TRUE or FALSE to specify header status if you don't trust the autodetection
`h.test.p`	p-value threshold used to discriminate between the number of characters in a column name and in a column value (sensitivity parameter for automatic header detection)
`quiet`	run without messages and warnings
`treatas`	a standard file extension, e.g., 'txt', to treat the file as
`override`	assume the first column is rownames, regardless of the heuristic
`more.types`	optionally add more file types which are read as text
`auto.vec`	if the file seems to only have a single column, automatically return the result as a vector rather than a dataframe with 1 column
`one.byte`	logical parameter, passed to 'get.delim', whether to look for only 1-byte delimiters, or to also search for 'whitespace', which is a multibyte (wildcard) delimiter type. Use one.byte = FALSE to read fixed width files, e.g., many plink files.
`...`	further arguments to the function used by 'reader' to parse the file, e.g., depending on the file type, this can be read.table(), read.delim(), or read.csv().
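As a hedged illustration of how the arguments above combine, the following sketch overrides the autodetection for a pipe-delimited file with an unfamiliar extension. It assumes the 'reader' package is installed (the call is skipped otherwise); the file name 'example.dat' and its contents are made up for this example.

```r
# Write a small pipe-delimited file with no header row
ex.file <- "example.dat"
ex.df <- data.frame(ID = paste("ID", 101:110, sep = ""),
                    scores = sample(70, 10, TRUE) + 30)
write.table(ex.df, file = file.path(tempdir(), ex.file), sep = "|",
            col.names = FALSE, row.names = FALSE, quote = FALSE)

if (requireNamespace("reader", quietly = TRUE)) {
  # dir = tempdir(): give the directory separately from the filename
  # treatas = "txt": parse the '.dat' file as if it were a text file
  # header = FALSE:  state the header status rather than autodetecting it
  # quiet = FALSE:   show messages about what was detected
  res <- reader::reader(ex.file, dir = tempdir(), treatas = "txt",
                        header = FALSE, quiet = FALSE)
  print(res)
}
unlink(file.path(tempdir(), ex.file))  # clean up
```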

Value

Returns the most appropriate object for the file type, which is usually a data.frame except for binary files.
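For binary files, the returned object depends on what was saved, which is why 'want.type' exists. The selection of an object by its is() type can be sketched in base R as follows. This is a minimal sketch, not the package's implementation; the helper name load_first_of_type() is hypothetical.

```r
# Hypothetical helper: load an .rda file and return the first object
# (in alphabetical order of name) whose is() types include 'type'.
load_first_of_type <- function(path, type) {
  env <- new.env()
  load(path, envir = env)
  for (nm in ls(env)) {
    obj <- get(nm, envir = env)
    if (methods::is(obj, type)) return(obj)
  }
  NULL  # nothing of the requested type was found
}

# Save two objects of different types into one binary file
rda.file <- tempfile(fileext = ".rda")
scores <- c(1.5, 2.5)
tab <- data.frame(x = 1:3)
save(scores, tab, file = rda.file)

# Ask for the data.frame; the numeric vector is passed over
found <- load_first_of_type(rda.file, "data.frame")
class(found)
unlink(rda.file)
```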

Author(s)

Nicholas Cooper nick.cooper@cimr.cam.ac.uk

Examples

orig.dir <- getwd(); setwd(tempdir()); # move to temporary dir
# create some datasets
df <- data.frame(ID=paste("ID",101:110,sep=""),
  scores=sample(70,10,TRUE)+30,age=sample(7,10,TRUE)+11)
DNA <- apply(matrix(c("A","C","G","T")[sample(4,100,TRUE)],nrow=10),
                                                1,paste,collapse="")
fix.wid <- c("    MyVal    Results        Check",
  "    0.234      42344          yes",
  "    0.334        351          yes","    0.224         46           no",
  "    0.214     445391          yes")
# save data to various file formats
test.files <- c("temp.txt","temp2.txt","temp3.csv",
                              "temp4.rda","temp5.fasta","temp6.txt")
write.table(df,file=test.files[1],col.names=FALSE,row.names=FALSE,sep="|",quote=TRUE)
write.table(df,file=test.files[2],col.names=TRUE,row.names=TRUE,sep="\t",quote=FALSE)
write.csv(df,file=test.files[3])
save(df,file=test.files[4])
writeLines(DNA,con=test.files[5])
writeLines(fix.wid,con=test.files[6])
# use the same reader() function call to read in each file
for(cc in seq_along(test.files)) {
  cat(test.files[cc],"\n")
  myobj <- reader(test.files[cc])  # add 'quiet=FALSE' to see some working
  print(myobj); cat("\n\n")
}
# inspect files before deleting if desired
unlink(test.files)
# myobj <- reader(file.choose()); myobj # run this to attempt opening a file
setwd(orig.dir) # reset working directory to original

[Package reader version 1.0.6 Index]