vn_samples {vietnameseConverter} | R Documentation |
Sample data frames in legacy Vietnamese encodings
Description
A list with four data frames. The data frames list the provinces of Viet Nam.
The first list item ($Unicode) shows the correct entries. The other three data frames show what loading data encoded in three different Vietnamese encodings would look like when loaded in R.
The list items are:
-
Unicode - Data frame with correct Unicode characters (reference)
-
TCVN3 - Data frame with TCVN3-encoded characters
-
VISCII - Data frame with VISCII-encoded characters
-
VPS - Data frame with VPS-encoded characters
Note that the last 3 are not actually encoded in their respective Vietnamese encodings. Instead, they show what a table in those encodings would look like when loaded into R (or more generally, a system that is not aware of the encodings).
Usage
data(vn_samples)
Format
A list with 4 data frames
Details
Each data frame contains 5 colums and 63 rows. The first two are character, the last three numeric.
-
Province_city - Name of province
-
Administrative_center - Administrative center of the province
-
Area_km2 - Area in km^2
-
Density_perkm2 - Population density (km^-2)
-
HDI_2012 - Human development index in 2012
The first two columns are character, the last three numeric. Only the character columns will be modified by calling decodeVN
, while the numeric columns will not be changed.
Factors are not converted. If your data frame contains factors, convert these to character first.
Note
The data frame is based on the table of provinces of Viet nam on Wikipedia https://en.wikipedia.org/wiki/Provinces_of_Vietnam with minor edits. The legacy Vietnamese encodings were simulated using the decodeVN
function and checked with this online conversion tool: http://www.enderminh.com/minh/vnconversions.aspx.