| df.subset {misty} | R Documentation |
Subsetting Data Frames
Description
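This function returns a subset of a data frame, keeping the variables selected in the argument ... and the rows that meet the condition specified in the argument subset.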
Usage
df.subset(..., data, subset = NULL, drop = TRUE, check = TRUE)
Arguments
... |
an expression indicating variables to select from the data frame
specified in data |
data |
a data frame that contains the variables specified in the
argument ... |
subset |
character string with a logical expression indicating rows to
keep, e.g., |
drop |
logical: if |
check |
logical: if |
Details
The argument ... is used to specify an expression indicating the
variables to select from the data frame specified in data, e.g.,
df.subset(x1, x2, x3, data = dat). There are seven operators which
can be used in the expression ...:
- Dot (.) Operator: The dot operator is used to select all variables from the data frame specified in data. For example, df.subset(., data = dat) selects all variables in dat. Note that this operator is similar to the function everything() from the tidyselect package.
- Plus (+) Operator: The plus operator is used to select variables matching a prefix from the data frame specified in data. For example, df.subset(+x, data = dat) selects all variables with the prefix x. Note that this operator is equivalent to the function starts_with() from the tidyselect package.
- Minus (-) Operator: The minus operator is used to select variables matching a suffix from the data frame specified in data. For example, df.subset(-y, data = dat) selects all variables with the suffix y. Note that this operator is equivalent to the function ends_with() from the tidyselect package.
- Tilde (~) Operator: The tilde operator is used to select variables containing a word from the data frame specified in data. For example, df.subset(~al, data = dat) selects all variables containing the word al. Note that this operator is equivalent to the function contains() from the tidyselect package.
- Colon (:) Operator: The colon operator is used to select a range of consecutive variables from the data frame specified in data. For example, df.subset(x:z, data = dat) selects all variables from x to z. Note that this operator is equivalent to the : operator from the select function in the dplyr package.
- Double Colon (::) Operator: The double colon operator is used to select numbered variables from the data frame specified in data. For example, df.subset(x1::x3, data = dat) selects the variables x1, x2, and x3. Note that this operator is similar to the function num_range() from the tidyselect package.
- Exclamation Point (!) Operator: The exclamation point operator is used to drop variables from the data frame specified in data or to take the complement of a set of variables. For example, df.subset(., !x, data = dat) selects all variables but x in dat, df.subset(., !~x, data = dat) selects all variables except those containing the word x, and df.subset(x:z, !x1:x3, data = dat) selects all variables from x to z but excludes all variables from x1 to x3. Note that this operator is equivalent to the ! operator from the select function in the dplyr package.
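To make the operators above concrete, the following is a minimal base R sketch of what each selection would pick from the built-in iris and anscombe data. It only approximates the described behaviour with standard functions (startsWith(), endsWith(), grepl(), match(), setdiff()); it is not the implementation used by df.subset, and the object names are illustrative.

```r
# Base R approximations of the selection operators (illustrative only,
# not the internals of df.subset()).
vars <- names(iris)

# Dot (.): all variables
all_vars <- vars

# Plus (+Petal): variables with the prefix "Petal"
prefix_vars <- vars[startsWith(vars, "Petal")]

# Minus (-Width): variables with the suffix "Width"
suffix_vars <- vars[endsWith(vars, "Width")]

# Tilde (~al): variables containing "al"
contain_vars <- vars[grepl("al", vars, fixed = TRUE)]

# Colon (Sepal.Width:Petal.Width): consecutive variables by column position
range_vars <- vars[match("Sepal.Width", vars):match("Petal.Width", vars)]

# Double colon (x1::x3): numbered variables, shown with the anscombe data
num_vars <- intersect(paste0("x", 1:3), names(anscombe))

# Exclamation point (., !Sepal.Width): all variables except the named one
drop_vars <- setdiff(vars, "Sepal.Width")

# Use any of the name vectors to subset columns
head(iris[, prefix_vars, drop = FALSE])
```

Working with vectors of column names keeps each operator's effect easy to inspect before the data frame is actually subset.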
Note that operators can be combined within the same function call. For example,
df.subset(+x, -y, !x2:x4, z, data = dat) selects all variables with
the prefix x and all variables with the suffix y, excludes the variables
from x2 to x4, and also selects the variable z.
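A combined selection can be sketched in base R as well. The sketch below assumes that the individual selections are merged as a union and that exclusions are applied afterwards; the text implies this but does not state it, and the column order returned by df.subset may differ. It mirrors the call df.subset(+x, -3, !x2:x3, data = anscombe).

```r
# Assumed semantics: union of the selections, then removal of exclusions.
# This is a base R illustration, not the df.subset() implementation.
vars <- names(anscombe)

# +x : prefix "x";  -3 : suffix "3"
keep <- union(vars[startsWith(vars, "x")], vars[endsWith(vars, "3")])

# !x2:x3 : drop the consecutive variables from x2 to x3
excl <- vars[match("x2", vars):match("x3", vars)]

sel <- setdiff(keep, excl)
anscombe[, sel, drop = FALSE]
```

Under these assumptions the remaining columns are x1, x4, and y3.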
Value
Returns a data frame containing the variables selected in the argument
... and the rows selected in the argument subset.
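The way a character string with a logical expression can select rows may be sketched as follows. The helper subset_rows() is hypothetical and written only for illustration; it evaluates the string inside the data frame with parse() and eval(), which is one plausible approach and not necessarily how df.subset handles the subset argument.

```r
# Hypothetical helper: keep the chosen columns and the rows for which a
# logical expression, given as a character string, evaluates to TRUE.
subset_rows <- function(data, cols, expr = NULL) {
  if (is.null(expr)) {
    return(data[, cols, drop = FALSE])
  }
  # Evaluate the expression with the data frame columns as variables
  keep <- eval(parse(text = expr), envir = data, enclos = parent.frame())
  # which() drops rows where the condition is NA
  data[which(keep), cols, drop = FALSE]
}

res <- subset_rows(iris, c("Sepal.Length", "Petal.Width"),
                   expr = "Species == 'setosa'")
dim(res)
```

Passing the condition as a string is why values such as 'setosa' need single quotation marks inside the double-quoted expression.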
Author(s)
Takuya Yanagida takuya.yanagida@univie.ac.at
References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
See Also
df.duplicated, df.merge,
df.move, df.rbind,
df.rename, df.sort
Examples
## Not run:
#-------------------------------------------------------------------------------
# Select single variables
# Example 1: Select 'Sepal.Length' and 'Petal.Width'
df.subset(Sepal.Length, Petal.Width, data = iris)
#-------------------------------------------------------------------------------
# Select all variables using the . operator
# Example 2a: Select all variables, select rows with 'Species' equal to 'setosa'
# Note that single quotation marks ('') are needed to specify 'setosa'
df.subset(., data = iris, subset = "Species == 'setosa'")
# Example 2b: Select all variables, select rows with 'Petal.Length' smaller than 1.2
df.subset(., data = iris, subset = "Petal.Length < 1.2")
#-------------------------------------------------------------------------------
# Select variables matching a prefix using the + operator
# Example 3: Select variables with prefix 'Petal'
df.subset(+Petal, data = iris)
#-------------------------------------------------------------------------------
# Select variables matching a suffix using the - operator
# Example 4: Select variables with suffix 'Width'
df.subset(-Width, data = iris)
#-------------------------------------------------------------------------------
# Select variables containing a word using the ~ operator
# Example 5: Select variables containing 'al'
df.subset(~al, data = iris)
#-------------------------------------------------------------------------------
# Select consecutive variables using the : operator
# Example 6: Select all variables from 'Sepal.Width' to 'Petal.Width'
df.subset(Sepal.Width:Petal.Width, data = iris)
#-------------------------------------------------------------------------------
# Select numbered variables using the :: operator
# Example 7: Select all variables from 'x1' to 'x3' and 'y1' to 'y3'
df.subset(x1::x3, y1::y3, data = anscombe)
#-------------------------------------------------------------------------------
# Drop variables using the ! operator
# Example 8a: Select all variables but 'Sepal.Width'
df.subset(., !Sepal.Width, data = iris)
# Example 8b: Select all variables but 'Sepal.Width' to 'Petal.Width'
df.subset(., !Sepal.Width:Petal.Width, data = iris)
#-------------------------------------------------------------------------------
# Combine +, -, !, and : operators
# Example 9: Select variables with prefix 'x' and suffix '3', but exclude
# variables from 'x2' to 'x3'
df.subset(+x, -3, !x2:x3, data = anscombe)
## End(Not run)