AsciiGridImpute {yaImpute} | R Documentation |
Imputes/Predicts data for Ascii Grid maps
Description
AsciiGridImpute finds nearest neighbor reference observations for each point in the input grid maps and outputs maps of selected Y-variables in a corresponding set of output grid maps.
AsciiGridPredict applies a predict function to each point in the input grid maps and outputs maps of the prediction(s) in corresponding output grid maps (see Details).
One row of each grid map is read and processed at a time, which avoids building the huge R objects that would be needed if all the rows of all the maps were processed together.
Usage
AsciiGridImpute(object,xfiles,outfiles,xtypes=NULL,ancillaryData=NULL,
ann=NULL,lon=NULL,lat=NULL,rows=NULL,cols=NULL,
nodata=NULL,myPredFunc=NULL,...)
AsciiGridPredict(object,xfiles,outfiles,xtypes=NULL,lon=NULL,lat=NULL,
rows=NULL,cols=NULL,nodata=NULL,myPredFunc=NULL,...)
Arguments
object |
An object of class |
xfiles |
A |
outfiles |
One of these two forms:
|
xtypes |
A list of data type names that corresponds exactly to data type of the
maps listed in |
ancillaryData |
A data frame of Y-variables that may not have been used in
the original call to |
ann |
if NULL, the value is taken from |
lon |
if NULL, the value of |
lat |
if NULL, the value of |
rows |
if NULL, all rows from the input grids are used. Otherwise, rows is a 2-element
vector giving the rows desired for the output. If the second element is greater than
the number of rows, the header value |
cols |
if NULL, all columns from the input grids are used. Otherwise, cols is a 2-element
vector giving the columns desired for the output. If the first element is greater than
one, the header value |
nodata |
the |
myPredFunc |
called by |
... |
passed to |
Details
The input maps are assumed to be Asciigrid maps with 6-line headers containing the following tags: NCOLS, NROWS, XLLCORNER, YLLCORNER, CELLSIZE, and NODATA_VALUE (case insensitive). The headers should be identical for all input maps; a warning is issued if they are not. It is critical that NODATA_VALUE is the same on all input maps.
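The header layout described above can be illustrated with a minimal base R sketch. The helper read_header() and the file names are hypothetical and are not part of yaImpute; the sketch only shows one way to compare the six header tags across files before processing.

```r
# Write two tiny Asciigrid files that share one 6-line header.
header <- c("NCOLS 3", "NROWS 2", "XLLCORNER 0", "YLLCORNER 0",
            "CELLSIZE 1", "NODATA_VALUE -9999")
files <- file.path(tempdir(), c("a.txt", "b.txt"))
for (f in files) {
  writeLines(header, f)
  write.table(matrix(1:6, 2, 3), file = f, append = TRUE,
              col.names = FALSE, row.names = FALSE)
}

# read_header() is a hypothetical helper, not a yaImpute function.
# It returns a named numeric vector with upper-cased tag names,
# so the comparison is case insensitive.
read_header <- function(path) {
  parts <- strsplit(trimws(readLines(path, n = 6)), "[[:space:]]+")
  vals <- as.numeric(vapply(parts, `[`, character(1), 2))
  names(vals) <- toupper(vapply(parts, `[`, character(1), 1))
  vals
}

headers <- lapply(files, read_header)
same <- all(vapply(headers, function(h) identical(h, headers[[1]]),
                   logical(1)))
if (!same) warning("input map headers differ")
```

Checking headers up front is cheap because only the first six lines of each file are read.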
The function builds data frames from the input maps one row at a time and builds predictions using those data frames as newdata. Each row of the input maps is processed in sequence so that the entire maps are never stored in memory. The function works by opening all the input files and reading one line (row) at a time from each. The output file(s) are created one line at a time as the input maps are processed.
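The row-at-a-time approach can be sketched in base R as follows. This is an illustration under simple assumptions (two hypothetical maps named a and b, and a sum standing in for the prediction); it is not the code used inside the package.

```r
# Two small grids with identical headers (2 rows, 3 columns).
header <- c("NCOLS 3", "NROWS 2", "XLLCORNER 0", "YLLCORNER 0",
            "CELLSIZE 1", "NODATA_VALUE -9999")
xfiles <- list(a = tempfile(fileext = ".txt"),
               b = tempfile(fileext = ".txt"))
writeLines(c(header, "1 2 3", "4 5 6"), xfiles$a)
writeLines(c(header, "10 20 30", "40 50 60"), xfiles$b)

# Open every input and skip its 6-line header.
cons <- lapply(xfiles, file, open = "r")
for (con in cons) readLines(con, n = 6)

row_sums <- list()
for (i in 1:2) {
  # Each map contributes one column to the data frame for this map row.
  newdata <- as.data.frame(lapply(cons, function(con)
    scan(con, nlines = 1, quiet = TRUE)))
  row_sums[[i]] <- newdata$a + newdata$b   # stand-in for a prediction
}
for (con in cons) close(con)
```

Only one map row per input is held in memory at any time, which is the point of the design described above.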
Use AsciiGridImpute for objects built with yai; otherwise use AsciiGridPredict. When AsciiGridPredict is used, the following rules apply. First, when myPredFunc is not NULL, it is called with the arguments object, newdata, ... where newdata is the data frame built from the input maps; otherwise the generic predict function is called with these same arguments. When object and myPredFunc are both NULL, a copy of newdata is used as the prediction. This is useful when lat, lon, rows, or cols are used to subset the maps.
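The rules for choosing a prediction can be mirrored in a short sketch. The function predict_cells() below is hypothetical and only restates the rules given above; it is not how the package is implemented, and the lm model is just a convenient stand-in for object.

```r
# predict_cells() is a hypothetical stand-in that mirrors the rules
# described in the text; it is not the code used inside yaImpute.
predict_cells <- function(object, newdata, myPredFunc = NULL, ...) {
  if (!is.null(myPredFunc)) {
    myPredFunc(object, newdata, ...)   # user-supplied function
  } else if (!is.null(object)) {
    predict(object, newdata, ...)      # generic predict
  } else {
    newdata                            # both NULL: copy newdata through
  }
}

fit <- lm(Sepal.Length ~ Petal.Length, data = iris)
newdata <- data.frame(Petal.Length = c(1.5, 4.5))

via_generic <- predict_cells(fit, newdata)
via_custom  <- predict_cells(fit, newdata,
  myPredFunc = function(object, newdata, ...)
    round(predict(object, newdata, ...), 1))
passthrough <- predict_cells(NULL, newdata)
```

A custom myPredFunc is a natural place to post-process predictions, for example rounding them or selecting one column from a matrix result.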
NODATA_VALUE is output for every grid cell where NODATA_VALUE is found on any one of the input maps (the predict function is not called for these grid cells). NODATA_VALUE is also output for any grid cell where the predict function returns an NA.
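The NODATA_VALUE handling can be illustrated for a single map row. The data frame and toy_predict() below are made up for the illustration; the sketch assumes a NODATA_VALUE of -9999.

```r
nodata <- -9999
# One map row from two hypothetical input maps; cell 2 is missing in "a".
newdata <- data.frame(a = c(1, nodata, 3, 4), b = c(10, 20, 30, 40))

# Cells with NODATA_VALUE on any input map are skipped.
ok <- rowSums(newdata == nodata) == 0

# toy_predict() is a made-up predictor that returns NA for one valid cell.
toy_predict <- function(d) ifelse(d$a == 3, NA, d$a + d$b)

out <- rep(nodata, nrow(newdata))
out[ok] <- toy_predict(newdata[ok, , drop = FALSE])
out[is.na(out)] <- nodata   # NA predictions also become NODATA_VALUE
```

Here the second cell is never passed to the predictor, and the third cell is passed but ends up as NODATA_VALUE because the prediction is NA.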
If factors are used as X-variables in object, the levels found in the map data are checked against those used in building the object. If new levels are found, the corresponding output map grid point is set to NODATA_VALUE; the predict function is not called for these cells, as most predict functions will fail in these circumstances. Checking on factors depends on object containing a meaningful member named xlevels, as done for objects produced by lm.
Asciigrid maps do not contain character data, only numbers. The numbers in the
maps are matched to the xlevels by subscript (the first entry in a level corresponds
to the numeric value 1 in the Asciigrid maps, the second to the number 2, and so
on). Care must be taken by the user to ensure that the coding scheme used in
building the maps is identical to that used in building the object. See Value for
information on how you can check the matching of these codes.
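Matching by subscript can be shown with a few lines of base R. The codes vector is a hypothetical map row, and the xlevels list is built by hand here rather than taken from a fitted object.

```r
# Levels in the order assumed to have been used when the object was built.
xlevels <- list(Species = levels(iris$Species))

# Numeric codes read from one row of a hypothetical species map.
codes <- c(1, 3, 2, 4)

# Match codes to levels by subscript; code 4 has no matching level.
labels <- xlevels$Species[codes]
illegal <- is.na(labels)
```

Any code that falls outside the range of known levels yields NA, which is a simple way to spot cells whose coding does not agree with the object.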
Value
An invisible list containing the following named elements:
unexpectedNAs |
A data frame listing the map row numbers and the number
of |
illegalLevels |
A data frame listing levels found in the maps that
were not found in the |
outputLegend |
A data frame showing the relationship between levels in
the output maps and those found in |
inputLegend |
A data frame showing the relationship between levels found in
the input maps and those found in |
Author(s)
Nicholas L. Crookston ncrookston.fs@gmail.com
See Also
yai, impute, and newtargets
Examples
## These commands write new files to your working directory
# Load the package that provides yai, AsciiGridImpute, and AsciiGridPredict
library(yaImpute)
# Use the iris data
data(iris)
# Section 1: Imagine that the iris are planted in a planting bed.
# The following set of commands creates Asciigrid map
# files for five attributes to illustrate the planting layout.
# Change species from a character factor to numeric (the sp classes
# cannot handle character data).
sLen <- matrix(iris[,1],10,15)
sWid <- matrix(iris[,2],10,15)
pLen <- matrix(iris[,3],10,15)
pWid <- matrix(iris[,4],10,15)
spcd <- matrix(as.numeric(iris[,5]),10,15)
# Create and change to a temp directory. You can delete these steps
# if you wish to keep the files in your working directory.
curdir <- getwd()
setwd(tempdir())
cat ("Using working dir",getwd(),"\n")
# Make maps of each variable.
header <- c("NCOLS 15","NROWS 10","XLLCORNER 1","YLLCORNER 1",
"CELLSIZE 1","NODATA_VALUE -9999")
cat(file="slen.txt",header,sep="\n")
cat(file="swid.txt",header,sep="\n")
cat(file="plen.txt",header,sep="\n")
cat(file="pwid.txt",header,sep="\n")
cat(file="spcd.txt",header,sep="\n")
write.table(sLen,file="slen.txt",append=TRUE,col.names=FALSE,
row.names=FALSE)
write.table(sWid,file="swid.txt",append=TRUE,col.names=FALSE,
row.names=FALSE)
write.table(pLen,file="plen.txt",append=TRUE,col.names=FALSE,
row.names=FALSE)
write.table(pWid,file="pwid.txt",append=TRUE,col.names=FALSE,
row.names=FALSE)
write.table(spcd,file="spcd.txt",append=TRUE,col.names=FALSE,
row.names=FALSE)
# Section 2: Create functions to predict species
# set the random number seed so that example results are consistent
# normally, leave out this command
set.seed(12345)
# sample the data
refs <- sample(rownames(iris),50)
y <- data.frame(Species=iris[refs,5],row.names=rownames(iris[refs,]))
# build a yai imputation for the reference data.
rfNN <- yai(x=iris[refs,1:4],y=y,method="randomForest")
# make lists of input and output map files.
xfiles <- list(Sepal.Length="slen.txt",Sepal.Width="swid.txt",
Petal.Length="plen.txt",Petal.Width="pwid.txt")
outfiles1 <- list(distance="dist.txt",Species="spOutrfNN.txt",
useid="useindx.txt")
# map the imputation-based predictions for the input maps
AsciiGridImpute(rfNN,xfiles,outfiles1,ancillaryData=iris)
# read the asciigrids and get them ready to plot
spOrig <- t(as.matrix(read.table("spcd.txt",skip=6)))
sprfNN <- t(as.matrix(read.table("spOutrfNN.txt",skip=6)))
dist <- t(as.matrix(read.table("dist.txt",skip=6)))
# demonstrate the use of useid:
spViaUse <- read.table("useindx.txt",skip=6)
for (col in colnames(spViaUse)) spViaUse[,col]=as.character(y$Species[spViaUse[,col]])
# demonstrate how to use factors:
spViaLevels <- read.table("spOutrfNN.txt",skip=6)
for (col in colnames(spViaLevels)) spViaLevels[,col]=levels(y$Species)[spViaLevels[,col]]
identical(spViaLevels,spViaUse)
if (require(randomForest))
{
# build a randomForest predictor
rf <- randomForest(x=iris[refs,1:4],y=iris[refs,5])
AsciiGridPredict(rf,xfiles,list(predict="spOutrf.txt"))
sprf <- t(as.matrix(read.table("spOutrf.txt",skip=6)))
} else sprf <- NULL
# reset the directory to that where the example was started.
setwd(curdir)
par(mfcol=c(2,2),mar=c(1,1,2,1))
image(spOrig,main="Original",col=c("red","green","blue"),
axes=FALSE,useRaster=TRUE)
image(sprfNN,main="Using Impute",col=c("red","green","blue"),
axes=FALSE,useRaster=TRUE)
if (!is.null(sprf))
image(sprf,main="Using Predict",col=c("red","green","blue"),
axes=FALSE,useRaster=TRUE)
image(dist,main="Neighbor Distances",col=terrain.colors(15),
axes=FALSE,useRaster=TRUE)