fxml_importXMLFlat {flatxml}    R Documentation

Handling flat XML files

Description

Reads an XML document into a flat dataframe structure.

Usage

fxml_importXMLFlat(path)

Arguments

path

Path to the XML document. Can be either a local path or a URL.

Details

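The XML document is parsed and stored in a dataframe structure (flat XML). The first four columns of a flat XML dataframe are standard columns. Their names all end with a dot. These columns are elem. (the name of the XML element), elemid. (the ID of the element, shared by all rows that belong to it), attr. (the name of an attribute, filled in attribute rows only) and value. (the value of the element or attribute).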

The columns after these four standard columns represent the 'path' to the current element, starting from the root element of the XML document in column 5 all the way down to the current element. The number of columns of the dataframe is therefore determined by the depth of the hierarchical structure of the XML document. In this dataframe representation, the hierarchical structure of the XML document becomes very easy to understand. All flatxml functions work with this flat XML dataframe.

If an XML element has N attributes, it is represented by (N+1) rows in the flat XML dataframe: one row for the value (with dataframe$value. being NA if the element has no value) and one row for each attribute. In the attribute rows, the names of the attributes are stored in the attr. field and their respective values in the value. field. Even if there are multiple rows for one XML element, the elem. and elemid. fields have the same value in all of these rows (because the rows belong to the same XML element).
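
The layout described above can be illustrated with a small hand-built dataframe. This is only a sketch in base R and not actual output of fxml_importXMLFlat(): the XML snippet, the ID numbering and the path column names level1 and level2 are assumptions made for illustration.

```r
# Hypothetical XML document used for this illustration:
#   <library>
#     <book id="1" lang="en">R Basics</book>
#     <shelf/>
#   </library>

# Hand-built approximation of the flat layout (not produced by flatxml).
# The path column names level1/level2 and the ID numbering are assumptions.
flat <- data.frame(
  elem.   = c("library", "book", "book", "book", "shelf"),
  elemid. = c(1, 2, 2, 2, 3),
  attr.   = c(NA, NA, "id", "lang", NA),
  value.  = c(NA, "R Basics", "1", "en", NA),
  level1  = c("library", "library", "library", "library", "library"),
  level2  = c(NA, "book", "book", "book", "shelf"),
  stringsAsFactors = FALSE
)

# All rows of the <book> element share the same elem. and elemid. values
book_rows <- flat[flat$elemid. == 2, ]

# Attribute rows are the ones with a non-missing attr. field
n_attributes <- sum(!is.na(book_rows$attr.))

# An element with N attributes occupies N + 1 rows
rows_match <- nrow(book_rows) == n_attributes + 1

# Columns after the four standard columns hold the path, so their
# count reflects the depth of the document hierarchy
depth <- ncol(flat) - 4
```

Here the book element has two attributes and therefore occupies three rows, while the empty shelf element occupies a single row whose value. field is NA.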

Value

A dataframe containing the XML document in a flat structure. See the Details section for more information on its structure.

Author(s)

Joachim Zuckarelli joachim@zuckarelli.de

Examples

# Load example file with population data from United Nations Statistics Division
example <- system.file("worldpopulation.xml", package="flatxml")
# Create flat dataframe from XML
xml.dataframe <- fxml_importXMLFlat(example)
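
A possible follow-up to the example above is to inspect the imported dataframe. The sketch below assumes that the flatxml package is installed and skips the import otherwise; the variable names std_cols and path_depth are introduced here for illustration only.

```r
# Run the import only if flatxml is available (assumption: it may not be installed)
if (requireNamespace("flatxml", quietly = TRUE)) {
  example <- system.file("worldpopulation.xml", package = "flatxml")
  xml.dataframe <- flatxml::fxml_importXMLFlat(example)

  # Names of the first four (standard) columns
  std_cols <- names(xml.dataframe)[1:4]
  print(std_cols)

  # Number of path columns, i.e. the depth of the XML hierarchy
  path_depth <- ncol(xml.dataframe) - 4
  print(path_depth)

  # First rows of the flat representation
  print(head(xml.dataframe))
}
```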


[Package flatxml version 0.1.1 Index]