pickFields {SEERaBomb}    R Documentation
Pick SEER fields of interest
Description
Reduces the full set of SEER data fields to a smaller set of interest. SEER fields are rows of the input and output data frames of this function. The output data frame differs from the input data frame not only in having fewer rows but also in having an additional column needed by mkSEER() downstream.
Usage
pickFields(sas,picks=c("casenum","reg","race","sex","agedx",
"yrbrth","seqnum","modx","yrdx","histo3",
"ICD9","COD","surv","radiatn","chemo"))
Arguments
sas      A data frame created by getFields().
picks    Vector of names of variables of interest. This set should not be smaller than the default.
Details
R binaries become too large if all of the fields are selected. SEERaBomb is faster than SEER*Stat because it tailors/streamlines the database to your interests. The default picks are a reasonable place to start; if you determine later that you need more fields, you can always rebuild the binaries. Grabbing all fields is discouraged, but if you want this anyway, note that you still need pickFields() to create a data type column, i.e. you cannot bypass pickFields() by sending the output of getFields() straight to mkSEER().
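The requirement that picks not be smaller than the default can be checked in base R before calling pickFields(). The following is only a sketch: the default names are copied from the Usage section, and "extraField" is a made-up placeholder, not a real SEER field name.

```r
# Default picks, copied from the Usage section.
defaults <- c("casenum", "reg", "race", "sex", "agedx",
              "yrbrth", "seqnum", "modx", "yrdx", "histo3",
              "ICD9", "COD", "surv", "radiatn", "chemo")

# Extend the defaults rather than replacing them.
# "extraField" is a hypothetical placeholder, not a real SEER field name.
myPicks <- union(defaults, "extraField")

# Default names missing from myPicks; this should be empty.
missingDefaults <- setdiff(defaults, myPicks)
stopifnot(length(missingDefaults) == 0)
```

Building the vector with union() keeps the defaults first and in their original order, so a vector built this way cannot be smaller than the default set.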
Value
The SAS-based input data frame sas, shortened to just the rows of picks, and expanded to include spacer rows of fields of no interest pooled into single strings: the width of such a spacer row is equal to the distance in bytes between the fields of interest above and below it. This data frame is then used by laf_open_fwf() of LaF in mkSEER() to read the SEER files. Proper use of this function, and of the SEER data in general, requires an understanding of the contents of ‘seerdic.pdf’ in the ‘incidence’ directory of seerHome.
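The spacer idea described above can be illustrated with a small base R sketch. This is not the package's implementation: the helper addSpacers(), the column names (name, start, width, type), the toy field layout, and the "integer" type given to every picked field are all assumptions made for illustration.

```r
# Toy fixed-width layout; names, starts and widths are invented.
fields <- data.frame(
  name  = c("f1", "f2", "f3", "f4", "f5", "f6"),
  start = c(1, 9, 19, 20, 24, 25),
  width = c(8, 10, 1, 2, 1, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of SEERaBomb): keep the picked rows and
# pool every run of unpicked bytes between them into one "spacer" row.
addSpacers <- function(fields, picks) {
  keep <- fields[fields$name %in% picks, ]
  keep <- keep[order(keep$start), ]
  out <- list()
  pos <- 1  # next byte not yet covered by an output row
  for (i in seq_len(nrow(keep))) {
    gap <- keep$start[i] - pos  # bytes between previous pick and this one
    if (gap > 0) {
      out[[length(out) + 1]] <- data.frame(
        name = "spacer", start = pos, width = gap,
        type = "string", stringsAsFactors = FALSE)
    }
    out[[length(out) + 1]] <- data.frame(
      name = keep$name[i], start = keep$start[i], width = keep$width[i],
      type = "integer",  # placeholder type, assumed for this sketch
      stringsAsFactors = FALSE)
    pos <- keep$start[i] + keep$width[i]
  }
  do.call(rbind, out)
}

layout <- addSpacers(fields, c("f1", "f4", "f6"))
print(layout)
```

In this toy case the widths of the output rows add up to the last byte of the final picked field, which is what a fixed-width reader needs in order to step through each record without reading the unwanted fields individually. Bytes after the last picked field are not handled in this sketch.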
Author(s)
Tom Radivoyevitch (radivot@ccf.org)
See Also
SEERaBomb-package, getFields, pickFields, mkSEER
Examples
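## Not run:
library(SEERaBomb)
# Get the full set of SEER fields; the outer parentheses print the result
(df=getFields())
# Reduce to the default picks and add the column needed by mkSEER()
(df=pickFields(df))
## End(Not run)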