pickFields {SEERaBomb}    R Documentation

Pick SEER fields of interest

Description

Reduces the full set of SEER data fields to a smaller set of interest. Each SEER field is a row of the input and output data frames of this function. The output data frame has fewer rows than the input and one additional column, which mkSEER() needs downstream.

Usage

pickFields(sas,picks=c("casenum","reg","race","sex","agedx",
        "yrbrth","seqnum","modx","yrdx","histo3",
        "ICD9","COD","surv","radiatn","chemo"))

Arguments

sas

A data frame created by getFields() using the SAS file found in the ‘incidence’ directory of seerHome, the root of the SEER ASCII data installation.

picks

Character vector of names of the fields of interest. This set should contain at least the default fields.
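
One way to extend the default set without dropping any of it is to build picks as a union, sketched here in base R. The extra field name "grade" is a hypothetical placeholder, not a confirmed SEER field name; substitute names that actually appear in the output of getFields().

```r
# Default picks, copied from the Usage section.
default_picks <- c("casenum", "reg", "race", "sex", "agedx",
                   "yrbrth", "seqnum", "modx", "yrdx", "histo3",
                   "ICD9", "COD", "surv", "radiatn", "chemo")

# Hypothetical extra field of interest (assumed name, for illustration only).
extra <- c("grade")

# union() keeps every default field and appends any new ones without duplicates.
picks <- union(default_picks, extra)

# Guard: the defaults must all still be present.
stopifnot(all(default_picks %in% picks))
```

The resulting vector could then be passed as pickFields(sas, picks = picks).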

Details

R binaries become too large if all of the fields are selected. SEERaBomb is faster than SEER*Stat because it tailors/streamlines the database to your interests. The default picks are a reasonable place to start; if you determine later that you need more fields, you can always rebuild the binaries. Grabbing all fields is discouraged, but if you want this anyway, note that you still need pickFields() to create a data type column, i.e. you cannot bypass pickFields() by sending the output of getFields() straight to mkSEER().

Value

The SAS-based input data frame sas, reduced to the rows named in picks and expanded with spacer rows. Each spacer row pools the fields of no interest that lie between two fields of interest into a single string, whose width equals the distance in bytes between the field of interest above it and the one below it. This data frame is then used by laf_open_fwf() of LaF in mkSEER() to read the SEER files. Proper use of this function, and of the SEER data in general, requires an understanding of the contents of ‘seerdic.pdf’ in the ‘incidence’ directory of seerHome.
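
The spacer idea can be illustrated with a small base R sketch. This is not the package's implementation: the column names (start, width, names, type), the toy byte offsets, the helper add_spacers(), and the type labels are all assumptions made for illustration.

```r
# Toy fixed-width layout: one row per field (offsets are made up, not real SEER offsets).
sas <- data.frame(
  start = c(1, 9, 19, 20, 24),
  width = c(8, 10, 1, 4, 3),
  names = c("casenum", "reg", "sex", "yrdx", "other"),
  stringsAsFactors = FALSE
)
picks <- c("casenum", "sex", "yrdx")

# Hypothetical helper: keep the picked rows and fill byte gaps with spacer rows.
add_spacers <- function(sas, picks) {
  keep <- sas[sas$names %in% picks, ]
  keep <- keep[order(keep$start), ]
  rownames(keep) <- NULL
  keep$type <- "integer"                    # placeholder type for picked fields

  ends <- keep$start + keep$width           # first byte after each picked field
  prev_end <- c(1, head(ends, -1))          # where reading resumes before each field
  gap <- keep$start - prev_end              # unread bytes in front of each field
  n <- sum(gap > 0)

  spacers <- data.frame(
    start = prev_end[gap > 0],
    width = gap[gap > 0],
    names = sprintf("spacer%d", seq_len(n)),
    type  = rep("string", n),               # skipped bytes pooled into one string
    stringsAsFactors = FALSE
  )

  out <- rbind(keep, spacers)
  out <- out[order(out$start), ]
  rownames(out) <- NULL
  out
}

res <- add_spacers(sas, picks)
print(res)
```

In this toy layout the unwanted field reg occupies the 10 bytes between casenum and sex, so a single spacer row of width 10 replaces it, while sex and yrdx are adjacent and need no spacer between them.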

Author(s)

Tom Radivoyevitch (radivot@ccf.org)

See Also

SEERaBomb-package, getFields, mkSEER

Examples

## Not run: 
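library(SEERaBomb)
# Read the SEER field definitions from the SAS file under seerHome
(df <- getFields())
# Keep the default picks; the result is the input that mkSEER() expects
(df <- pickFields(df))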
## End(Not run)

[Package SEERaBomb version 2019.2 Index]