XML2Obs {XML2R} | R Documentation |
Parse XML files into a list of "observations"
Description
This function takes a collection of URLs that point to XML files and coerces the relevant information into a list of observations. An "observation" is defined as a matrix with one row. An observation can also be thought of as a single instance of XML attributes (and value) for a particular level in the XML hierarchy. The names of the list reflect the XML node ancestry from which each observation was extracted.
Usage
XML2Obs(
urls,
xpath,
append.value = TRUE,
as.equiv = TRUE,
url.map = FALSE,
local = FALSE,
quiet = FALSE,
...
)
Arguments
urls |
character vector. Either URLs that point to XML files online or names of local XML files. |
xpath |
XML XPath expression that is passed to getNodeSet. If missing, the entire root and all descendants are captured and returned (i.e., xpath = "/"). |
append.value |
logical. Should the XML value be appended for relevant observations? |
as.equiv |
logical. Should observations from two different files (but with the same ancestry) be returned under the same name? |
url.map |
logical. If TRUE, the 'url_key' column will contain a condensed url identifier (for each observation) and full urls will be stored in the "url_map" element. If FALSE, the full urls are included (for each observation) as a 'url' column and no "url_map" is included. |
local |
logical. Should urls be treated as paths to local files? |
quiet |
logical. If FALSE, the name of the file currently being parsed is printed; if TRUE, this output is suppressed. |
... |
arguments passed along to 'httr::GET' |
Details
When url.map = TRUE, a "url_key" column is appended to each observation to help track its origin. The value of the "url_key" column is not the actual file name but a simplified identifier, which avoids repeating long file names for every observation. For this reason, an additional element (named "url_map") is added to the list of observations in case the actual file names are needed.
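The key-to-URL lookup described here can be sketched in base R. The list below is a hand-built mock-up, not actual XML2Obs output: the list names, the "url1"/"url2" keys, the example.com URLs, and the assumption that "url_map" is a named character vector are all illustrative.

```r
# Mock-up only: a hand-built list shaped like the result described above
# (url.map = TRUE). Names, keys, and URLs are illustrative assumptions.
obs <- list(
  "game//team" = matrix(c("NYY", "url1"), nrow = 1,
                        dimnames = list(NULL, c("id", "url_key"))),
  "game//team" = matrix(c("BOS", "url2"), nrow = 1,
                        dimnames = list(NULL, c("id", "url_key"))),
  url_map = c(url1 = "http://example.com/a.xml",
              url2 = "http://example.com/b.xml")
)

# Separate the lookup table from the observations
url_map <- obs[["url_map"]]
observations <- obs[names(obs) != "url_map"]

# Recover the full URL of each observation from its condensed key
keys <- vapply(observations, function(m) m[1, "url_key"], character(1))
full_urls <- unname(url_map[keys])
print(full_urls)
```

Keeping the full URLs in one place means each observation carries only a short key, at the cost of one extra lookup when the original file name is needed.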
Value
A list of "observations" and, when url.map = TRUE, the "url_map" element.
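As a rough check of this structure, every element other than "url_map" should be a one-row matrix. The sketch below uses a hypothetical helper, is_observation, which is not part of XML2R, and a mock-up list with invented names and values standing in for a real result.

```r
# Hypothetical helper (not part of XML2R): TRUE if x is a one-row matrix
is_observation <- function(x) is.matrix(x) && nrow(x) == 1

# Mock-up of a returned list with the default url.map = FALSE;
# the names and values are invented for illustration.
obs <- list(
  "team//player" = matrix(c("12", "SS", "players.xml"), nrow = 1,
                          dimnames = list(NULL, c("num", "pos", "url"))),
  "team//coach"  = matrix(c("manager", "players.xml"), nrow = 1,
                          dimnames = list(NULL, c("position", "url")))
)

# Skip the "url_map" element, if present, before checking
is_obs <- vapply(obs[names(obs) != "url_map"], is_observation, logical(1))
print(all(is_obs))
```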
See Also
urlsToDocs, docsToNodes, nodesToList, listsToObs
Examples
## Not run:
urls <- c("http://gd2.mlb.com/components/game/mlb/year_2013/mobile/346180.xml",
"http://gd2.mlb.com/components/game/mlb/year_2013/mobile/346188.xml")
obs <- XML2Obs(urls)
table(names(obs))
# parses local files as well
players <- system.file("extdata", "players.xml", package = "XML2R")
obs2 <- XML2Obs(players, local = TRUE)
table(names(obs2))
## End(Not run)
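Because observations that share an ancestry can share a name, a natural next step after table(names(obs)) is to stack same-named observations into one matrix per name. The following is a minimal base R sketch on a mock-up list, with invented names and values, and it assumes that same-named observations have identical columns, which real data may not satisfy.

```r
# Mock-up of a list of observations; names and values are invented
obs <- list(
  "team//player" = matrix(c("12", "SS"), nrow = 1,
                          dimnames = list(NULL, c("num", "pos"))),
  "team//coach"  = matrix(c("54", "manager"), nrow = 1,
                          dimnames = list(NULL, c("num", "position"))),
  "team//player" = matrix(c("2", "C"), nrow = 1,
                          dimnames = list(NULL, c("num", "pos")))
)

# For each distinct name, row-bind all observations carrying that name.
# Assumes identical columns within a name; rbind() fails if column counts differ.
nms <- unique(names(obs))
tables <- lapply(setNames(nms, nms), function(nm) {
  do.call(rbind, unname(obs[names(obs) == nm]))
})

print(tables[["team//player"]])
```

Each element of tables is then an ordinary matrix with one row per observation, which is easier to inspect or convert to a data frame than a long list of one-row matrices.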