dataset-to-R {crunch}    R Documentation

as.data.frame method for CrunchDataset

Description

This method is defined principally so that you can use a CrunchDataset as a data argument to other R functions (such as stats::lm()) without needing to download the whole dataset. You can, however, choose to download a true data.frame.

Usage

## S3 method for class 'CrunchDataset'
as.data.frame(
  x,
  row.names = NULL,
  optional = FALSE,
  force = FALSE,
  categorical.mode = "factor",
  row.order = NULL,
  include.hidden = TRUE,
  ...
)

## S3 method for class 'CrunchDataFrame'
as.data.frame(
  x,
  row.names = NULL,
  optional = FALSE,
  include.hidden = attr(x, "include.hidden"),
  ...
)

Arguments

x

a CrunchDataset or CrunchDataFrame

row.names

part of as.data.frame signature. Ignored.

optional

part of as.data.frame signature. Ignored.

force

logical: should the dataset actually be coerced to a data.frame (TRUE), or should the columns be left as unevaluated promises (FALSE)? Default is FALSE.

categorical.mode

the mode in which categorical variables should be pulled. One of "factor", "numeric", or "id" (default: "factor").

row.order

vector of indices: which rows of the dataset should be returned, and in what order (default: NULL). If NULL, the Crunch Dataset order is used.

include.hidden

logical: should hidden variables be included? (default: TRUE)

...

additional arguments passed to as.data.frame (default method).

Details

By default, the as.data.frame method for CrunchDataset does not return a data.frame but instead a CrunchDataFrame, which behaves like a data.frame without bringing the whole dataset into memory. When you access the variables of a CrunchDataFrame, you get an R vector rather than a CrunchVariable. This allows modeling functions that need only selected columns of a dataset to retrieve just those variables from the remote server, rather than pulling the entire dataset into local memory.
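The lazy behavior described above might be used roughly as follows. This is a minimal sketch, not taken from the package's own examples: it assumes an authenticated Crunch session, and the function name fit_remote_model and the variable aliases income and age are hypothetical placeholders.

```r
# Sketch only: fit_remote_model, `income`, and `age` are hypothetical names.
# Assumes the crunch package is installed and you are already logged in.
fit_remote_model <- function(dataset_name) {
  # Load a reference to the remote dataset; no variable data is downloaded.
  ds <- crunch::loadDataset(dataset_name)

  # With the default force = FALSE, this yields a CrunchDataFrame,
  # not a true data.frame.
  lazy_df <- as.data.frame(ds)

  # The modeling function looks up only the columns named in the formula,
  # so only those variables should need to be retrieved from the server.
  stats::lm(income ~ age, data = lazy_df)
}
```

Nothing is fetched when the function is defined; a call such as fit_remote_model("my survey") would contact the server, where "my survey" stands in for a dataset you can access.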

If you call as.data.frame() on a CrunchDataset with force = TRUE, you will instead get a true data.frame. You can also get this data.frame by calling as.data.frame() on a CrunchDataFrame (effectively calling as.data.frame() on the dataset twice).
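The two routes to a true data.frame could be sketched as below. The helper names download_dataset and download_in_two_steps are hypothetical, and ds is assumed to be a CrunchDataset that has already been loaded in an authenticated session.

```r
# Sketch only: both helper names are hypothetical, and `ds` is assumed to be
# a CrunchDataset obtained elsewhere (for example via crunch::loadDataset()).

# Route 1: ask for a true data.frame directly with force = TRUE.
download_dataset <- function(ds, categorical.mode = "factor") {
  # categorical.mode is one of "factor", "numeric", or "id".
  as.data.frame(ds, force = TRUE, categorical.mode = categorical.mode)
}

# Route 2: coerce twice. The first call gives a CrunchDataFrame,
# and the second call turns that into a true data.frame.
download_in_two_steps <- function(ds) {
  lazy_df <- as.data.frame(ds)
  as.data.frame(lazy_df)
}
```

Either route downloads the data into local memory, so the lazy CrunchDataFrame is likely the better choice when only a few variables are needed.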

When a data.frame is returned, the function coerces Crunch Variable values into their R equivalents using the following rules:

Column names in the data.frame are the variable/subvariable aliases.

Value

When called on a CrunchDataset, the method returns an object of class CrunchDataFrame unless force = TRUE, in which case the return is a data.frame. For CrunchDataFrame, the method returns a data.frame.

See Also

as.vector()


[Package crunch version 1.30.4 Index]