make_ames {AmesHousing}    R Documentation
Create a Processed Version of the Ames Housing Data
Description
Create a Processed Version of the Ames Housing Data
Usage
make_ames()
make_ames_new()
make_ordinal_ames()
Details
For the processed version, the exact details can be found in the code of make_ames, but a summary of the differences between these data sets and ames_raw is:
- All factors are unordered.
- PID and Order are removed.
- Spaces and special characters in column names were changed to snake case. To be consistent, SalePrice was changed to Sale_Price.
- Many factor levels were changed to be more understandable (e.g. Split_or_Multilevel instead of 080).
- Many missing values were reset. For example, if the variable Bsmt_Qual was missing, this implies that there is no basement on the property. Instead of a missing value, the value of Bsmt_Qual was changed to No_Basement. Similarly, numeric data pertaining to basements were set to zero where appropriate, such as the variables Bsmt_Full_Bath and Total_Bsmt_SF.
- Garage_Yr_Blt contained many missing values and was removed.
- Approximate longitude and latitude are included for the properties. Also, note that there are 6 properties with identical geotags. These are units within the same building. For some properties, updated versions of the PID identifiers were found and are replaced with new values.
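The kind of processing summarized above can be sketched on a toy data frame. This is only an illustration under stated assumptions, not the code of make_ames: the data frame raw and its values are invented, and only a few of the steps (dropping identifiers, renaming columns, resetting missing basement values) are shown.

```r
# Toy stand-in for a few raw columns; all values are invented for illustration.
raw <- data.frame(
  PID = c("A1", "A2", "A3"),
  Order = 1:3,
  `Bsmt Qual` = c("TA", NA, "Gd"),
  `Total Bsmt SF` = c(850, NA, 1200),
  SalePrice = c(150000, 98000, 210000),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Drop the identifier columns.
processed <- raw[, setdiff(names(raw), c("PID", "Order")), drop = FALSE]

# Replace spaces with underscores and rename SalePrice for consistency.
names(processed) <- gsub(" ", "_", names(processed), fixed = TRUE)
names(processed)[names(processed) == "SalePrice"] <- "Sale_Price"

# A missing basement quality is taken to mean that there is no basement.
no_bsmt <- is.na(processed$Bsmt_Qual)
processed$Bsmt_Qual[no_bsmt] <- "No_Basement"

# Numeric basement data are set to zero for those same rows.
processed$Total_Bsmt_SF[no_bsmt & is.na(processed$Total_Bsmt_SF)] <- 0

# Store the result as an unordered factor.
processed$Bsmt_Qual <- factor(processed$Bsmt_Qual)

processed
```

The second row ends up with Bsmt_Qual equal to No_Basement and Total_Bsmt_SF equal to zero instead of missing values.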
make_ordinal_ames is the same as make_ames, but many factor variables were changed to class ordered (see below).
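The practical difference between an unordered factor and one of class ordered can be illustrated with base R. The level names below are hypothetical and are not taken from the package; the sketch only shows what the ordered class adds.

```r
# Hypothetical quality levels, listed from worst to best.
quality_levels <- c("Poor", "Fair", "Typical", "Good", "Excellent")
quality <- c("Good", "Poor", "Typical")

unordered <- factor(quality, levels = quality_levels)
ordered_q <- factor(quality, levels = quality_levels, ordered = TRUE)

is.ordered(unordered)   # FALSE
is.ordered(ordered_q)   # TRUE

# Comparisons and max() respect the level order for ordered factors.
ordered_q >= "Typical"
max(ordered_q)
```

With an unordered factor, the same comparison returns NA with a warning, which is one reason to prefer make_ordinal_ames when the ranking of levels matters.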
The documentation for ames_raw contains descriptions of the columns although, as noted above, the column names in ames_raw are slightly different from those in the processed versions.
make_ames_new() creates a data set of new properties. These were populated using fewer data sources than the original and lack a number of the condition and quality variables. Both properties were unsold at the time of this writing.
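One way to see which columns the new properties lack is to compare column names between the two data sets. The helper missing_columns below is hypothetical and not part of the package; the comparison only runs if AmesHousing is installed.

```r
# Hypothetical helper: columns present in `full` but absent from `reduced`.
missing_columns <- function(full, reduced) {
  setdiff(names(full), names(reduced))
}

if (requireNamespace("AmesHousing", quietly = TRUE)) {
  ames <- AmesHousing::make_ames()
  ames_new <- AmesHousing::make_ames_new()

  # Number of new properties.
  print(nrow(ames_new))

  # Columns available in the processed data but not for the new properties.
  print(missing_columns(ames, ames_new))
}
```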
Value
A tibble with the data.
Examples
ames <- make_ames()
nrow(ames)
summary(ames$Sale_Price)
ames_ord <- make_ordinal_ames()
ord_vars <- vapply(ames_ord, is.ordered, logical(1))
names(ord_vars)[ord_vars]