USSeerBG {micromapST} | R Documentation |
USSeerBG border group datasets to support use with the 20 U.S. Seer areas/registries
Description
The micromapST function can generate linked micromaps for any geographical area. The bordGrp call argument specifies the border group dataset for the geographical area. The USSeerBG border group dataset supports creating linked micromaps for the 20 Seer registries in the U.S. When the bordGrp call argument is set to USSeerBG, the appropriate name table (registry names and abbreviations) and the boundary data for the 20 sub-areas (Seer registries) are loaded in micromapST. The user's data is then linked to the boundary data via the Seer registry's name, abbreviation, alias match or ID based on the table below.
The 20 U.S. Seer registries are the accepted, NCI-funded registries as of January 2010.
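As a minimal sketch of the call described above, the following builds a small data frame keyed by registry abbreviation and passes it to micromapST with bordGrp set to USSeerBG. The rate values are invented for illustration, the panelDesc layout (map, id and dot columns) is just one plausible choice, and rowNames = "ab" is assumed to select matching on the abbreviation column of the name table. The plotting call is guarded so it only runs in an interactive session where the micromapST package is installed.

```r
# Made-up rates for three registries; row names are registry abbreviations.
seerRates <- data.frame(
  Rate = c(52.1, 47.3, 60.8),
  row.names = c("CT", "IA", "UT")
)

# Panel layout: a linked map, the registry labels, and a dot plot of Rate.
panelDesc <- data.frame(
  type = c("map", "id", "dot"),
  lab1 = c("", "", "Rate (illustrative)"),
  col1 = c(NA, NA, "Rate"),
  stringsAsFactors = FALSE
)

# Only draw when run interactively with micromapST available.
if (interactive() && requireNamespace("micromapST", quietly = TRUE)) {
  library(micromapST)
  micromapST(
    seerRates, panelDesc,
    rowNames = "ab",          # assumed: match on registry abbreviations
    sortVar  = "Rate",
    bordGrp  = "USSeerBG",
    title    = c("Illustrative rates", "U.S. Seer registries")
  )
}
```

Registries without a row in seerRates would be expected to be drawn but left uncolored.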
Details
The USSeerBG border group dataset contains the following data.frames:
- areaParms
- contains specific parameters for the border group
- areaNamesAbbrsIDs
- contains the names, abbreviations, numerical identifier and alias matching string for each of the 20 Seer registries.
- areaVisBorders
- the boundary point lists for each area.
- L2VisBorders
- the boundaries for an intermediate level. For the Seer registry border group, L2VisBorders contains the boundaries for the 50 states and DC in the U.S. to provide a geographical reference of the registries to the states.
- RegVisBorders
- the boundaries for the 4 U.S. Census regions, in support of the regions feature.
- L3VisBorders
- the boundary of the U.S.
The Seer Registries border group contains 20 Seer Registry sub-areas. Each registry has a row in the areaNamesAbbrsIDs data.frame and a set of polygons in the areaVisBorders data.frame datasets.
Regions are defined in this border group as the 4 census regions in the U.S. The regions feature is enabled. The four census regions are: NorthEast, South, MidWest, and West. The states and Seer registries in each region are:
state | Seer Registries | region |
Alabama | <none> | South |
Alaska | Alaska Natives | West |
Arizona | Arizona Natives | West |
Arkansas | <none> | South |
California | California-LA, California-Other, California-SF, California-SJ | West |
Colorado | <none> | West |
Connecticut | Connecticut | NorthEast |
Delaware | <none> | South |
District of Columbia | <none> | South |
Florida | <none> | South |
Georgia | Georgia-Atlanta, Georgia-Other, Georgia-Rural | South |
Hawaii | Hawaii | West |
Idaho | <none> | West |
Illinois | <none> | MidWest |
Indiana | <none> | MidWest |
Iowa | Iowa | MidWest |
Kansas | <none> | MidWest |
Kentucky | Kentucky | South |
Louisiana | Louisiana | South |
Maine | <none> | NorthEast |
Maryland | <none> | South |
Massachusetts | <none> | NorthEast |
Michigan | Michigan-Detroit | MidWest |
Minnesota | <none> | MidWest |
Mississippi | <none> | South |
Missouri | <none> | MidWest |
Montana | <none> | West |
Nebraska | <none> | MidWest |
Nevada | <none> | West |
New Hampshire | <none> | NorthEast |
New Jersey | New Jersey | NorthEast |
New Mexico | New Mexico | West |
New York | <none> | NorthEast |
North Carolina | <none> | South |
North Dakota | <none> | MidWest |
Ohio | <none> | MidWest |
Oklahoma | Oklahoma-Cherokee | South |
Oregon | <none> | West |
Pennsylvania | <none> | NorthEast |
Rhode Island | <none> | NorthEast |
South Carolina | <none> | South |
South Dakota | <none> | MidWest |
Tennessee | <none> | South |
Texas | <none> | South |
Utah | Utah | West |
Vermont | <none> | NorthEast |
Virginia | <none> | South |
Washington | Washington-Seattle | West |
West Virginia | <none> | South |
Wisconsin | <none> | MidWest |
Wyoming | <none> | West |
The L3VisBorders dataset contains the outline of the United States.
The details on each of these data.frame structures can be found in the "bordGrp" section of this document. The areaNamesAbbrsIDs data.frame links the <statsDFrame> data to the boundary data for each sub-area (registry) using the full name, abbreviation, or numerical identifier of the registry, as selected by the rowNames call argument.
A column or the data.frame row.names must match one of the types of names in the areaNamesAbbrsIDs data.frame name table. If a data row does not match a value in the name table, a warning is issued and the data is ignored. If no data is present for a sub-area (registry) in the name table, the sub-area (registry) is mapped but not colored.
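The matching rules above can be checked before calling micromapST. The following is a rough sketch in base R, not the package's internal logic: knownAb holds only a handful of abbreviations copied from the name table in this section, and userData (including the deliberately invalid "XX" row) is hypothetical.

```r
# A few registry abbreviations copied from the "ab" column of the name table.
knownAb <- c("CA-LA", "CT", "GA-ATL", "IA", "NM", "UT", "WA-SEA")

# Hypothetical user data; "XX" is deliberately not a registry abbreviation.
userData <- data.frame(
  Rate = c(52.1, 47.3, 60.8, 39.9),
  row.names = c("CT", "IA", "XX", "UT")
)

matched   <- rownames(userData) %in% knownAb
unmatched <- rownames(userData)[!matched]          # rows that would be ignored
noData    <- setdiff(knownAb, rownames(userData))  # mapped but not colored

if (length(unmatched) > 0) {
  message("No match in name table for: ", paste(unmatched, collapse = ", "))
}

# Keep only the rows that can be linked to a registry boundary.
cleanData <- userData[matched, , drop = FALSE]
```

Running a check like this first makes it easier to see which rows would be dropped and which registries would appear uncolored.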
The following is a list of the names, abbreviations, aliases and IDs for each registry in the USSeerBG border group.
Name | ab | alias string | id | counties | region |
Alaska Natives | AK-NAT | ALASKA NATIVES | 18 | all | West |
Arizona Natives | AZ-NAT | ARIZONA NATIVES | 20 | all | West |
California-LA | CA-LA | LOS ANGELES | 4 | Los Angeles | West |
California-SF | CA-SF | SAN FRANCISCO | 2 | Alameda, Contra Costa, Marin, San Francisco, San Mateo | West |
California-SJ | CA-SJ | SAN JOSE | 3 | Monterey, San Benito, Santa Clara, Santa Cruz | West |
California-Other | CA-OTH | CALIFORNIA EXCLUDING | 5 | all other counties | West |
Connecticut | CT | CONNECTICUT | 1 | all | NorthEast |
Georgia-Atlanta | GA-ATL | ATLANTA | 6 | Clayton, Cobb, DeKalb, Fulton, Gwinnett | South |
Georgia-Rural | GA-RUR | RURAL GEORGIA | 8 | Glascock, Greene, Hancock, Jasper, Jefferson, Morgan, Putnam, Taliaferro, Warren, Washington | South |
Georgia-Other | GA-OTH | GREATER GEORGIA | 7 | all other counties | South |
Hawaii | HI | HAWAII | 9 | all | West |
Iowa | IA | IOWA | 10 | all | MidWest |
Kentucky | KY | KENTUCKY | 14 | all | South |
Michigan-Detroit | MI-DET | DETROIT | 15 | Macomb, Oakland, Wayne | MidWest |
New Jersey | NJ | NEW JERSEY | 11 | all | NorthEast |
New Mexico | NM | NEW MEXICO | 12 | all | West |
Oklahoma-Cherokee | OK-CHE | OKLAHOMA | 19 | Adair, Cherokee, Craig, Delaware, Mayes, McIntosh, Muskogee, Nowata, Ottawa, Rogers, Sequoyah, Tulsa, Wagoner, Washington | South |
Utah | UT | UTAH | 16 | all | West |
Washington-Seattle | WA-SEA | SEATTLE | 17 | Clallam, Grays Harbor, Island, Jefferson, King, Kitsap, Mason, Pierce, San Juan, Skagit, Snohomish, Thurston, Whatcom | West |
The rowNames = alias and the regions = TRUE features are enabled in the USSeerBG border group.
The alias option is designed to allow the package to match the registry labels created by the SEER*Stat website when exporting Seer data for analysis. The alias match is a "contains" match, so the registry field in the user data must contain one of the alias values listed in the above table. To help generalize the match, punctuation and control characters are stripped from the user's registry value, runs of white space (blanks, tabs, CR, LF) are reduced to a single blank, and the string is converted to upper case. The wild card match is then performed.
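The clean-up and "contains" match described above might look roughly like the following in base R. This is a sketch of the idea only, not the package's own code: normalizeLabel and matchAlias are hypothetical helper names, the registry label is an illustrative string rather than a verified SEER*Stat export, and the sketch assumes punctuation is removed outright rather than replaced by a blank.

```r
# Hypothetical helper: clean a registry label the way the text describes.
normalizeLabel <- function(x) {
  x <- gsub("[[:punct:]]", "", x)     # drop punctuation
  x <- gsub("[[:space:]]+", " ", x)   # blanks, tabs, CR, LF -> one blank
  x <- gsub("[[:cntrl:]]", "", x)     # drop any remaining control characters
  toupper(trimws(x))
}

# Three alias strings copied from the name table, keyed by abbreviation.
aliasTable <- c("CA-SF" = "SAN FRANCISCO",
                "GA-ATL" = "ATLANTA",
                "WA-SEA" = "SEATTLE")

# Hypothetical helper: return the abbreviation(s) whose alias is contained
# in the cleaned label.
matchAlias <- function(label, aliases) {
  clean <- normalizeLabel(label)
  hit <- vapply(aliases, function(a) grepl(a, clean, fixed = TRUE), logical(1))
  names(aliases)[hit]
}

matchAlias("San Francisco-Oakland SMSA - 1973+", aliasTable)
```

Using fixed = TRUE keeps the alias from being interpreted as a regular expression, which suits a plain substring test.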
The dataRegionOnly call parameter (when set to TRUE) instructs the package to map only the regions containing Seer registries with data. The regions used are the four census regions: NorthEast, South, MidWest and West. The RegVisBorders data.frame contains the outline of each of these regions. For example, if Seer registry data is provided for only the New Mexico, Utah and California registries in the West region, then only the states and regional boundary for the West region are drawn.
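To illustrate the selection logic only (the package performs this internally), the following sketch derives the set of regions that would be drawn from the registries that have data. The registryRegion lookup is a partial, hand-copied subset of the name table, and registriesWithData is a hypothetical input.

```r
# Partial registry -> region lookup copied from the name table.
registryRegion <- c("CA-LA" = "West", "NM" = "West", "UT" = "West",
                    "CT" = "NorthEast", "IA" = "MidWest", "KY" = "South")

# Hypothetical: registries for which the user supplied data.
registriesWithData <- c("NM", "UT", "CA-LA")

# Regions that contain at least one registry with data.
regionsToDraw <- unique(unname(registryRegion[registriesWithData]))
regionsToDraw
```

With data for New Mexico, Utah and California-LA only, the single region selected is West, matching the example in the text.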
The USSeerBG border group does not contain or support an alternate set of abbreviations. If rowNames is set to alt_ab, a warning is generated and the standard Seer registry abbreviations are used.
The following steps should be used to export data for micromapST's use from the SEER*Stat Website:
Log on to the SEER*Stat website.
Create the matrix of results you want in SEER*Stat.
Click on Matrix, Export, Results as Text File (if you created multiple matrices of results, make sure that the one you want to export is highlighted)
In the Matrix Export Options window, click on:
Output variables as Labels without quotes
Remove all thousands separators
Output variable names before data
Preserve matrix columns & rename fields
Leave defaults clicked for Line delimiter, Missing Character, and Field delimiter
Change names and locations of text and dictionary files from defaults to the appropriate name and directory location.
To read the resulting text file into R, use the read.delim function with header = TRUE. Follow the read.delim call with a call to the str function to verify the data was read correctly.
dataT <- read.delim("c:/datadir/seerstat.txt", header = TRUE)
str(dataT)
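As a self-contained sketch of the read-and-verify step, the following writes a tiny stand-in for an export file to a temporary location and reads it back. The column names and values are invented for illustration; a real SEER*Stat export will have whatever fields were selected in the matrix.

```r
# Write a tiny stand-in for a SEER*Stat export (tab-delimited, names in row 1).
exportFile <- tempfile(fileext = ".txt")
writeLines(c("Registry\tRate\tCount",
             "Connecticut\t52.1\t1200",
             "Iowa\t47.3\t980"),
           exportFile)

# header = TRUE because variable names were written before the data.
dataT <- read.delim(exportFile, header = TRUE, stringsAsFactors = FALSE)
str(dataT)

# Basic sanity checks before passing the data on to micromapST.
stopifnot(is.numeric(dataT$Rate), !anyNA(dataT$Registry))
```

If a numeric column shows up as character in the str output, thousands separators or quoted labels were probably left in the export.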
Source
NCI
References
United States National Cancer Institute Seer Website at www.seer.cancer.gov; Seer Software at seer.cancer.gov/seerstat; United States Census Bureau, Geography Division, "Census Regions and Divisions of the United States" (PDF), retrieved 2013-01-10.