df.subset {misty} | R Documentation |
Subsetting Data Frames
Description
This function returns subsets of data frames which meet conditions.
Usage
df.subset(..., data, subset = NULL, drop = TRUE, check = TRUE)
Arguments
... |
an expression indicating variables to select from the data frame
specified in data |
data |
a data frame that contains the variables specified in the
argument ... |
subset |
character string with a logical expression indicating rows to
keep, e.g., |
drop |
logical: if |
check |
logical: if |
Details
The argument ... is used to specify an expression indicating the
variables to select from the data frame specified in data, e.g.,
df.subset(x1, x2, x3, data = dat). There are seven operators which
can be used in the expression ...:
- Dot (.) Operator: The dot operator is used to select all variables from the data frame specified in data. For example, df.subset(., data = dat) selects all variables in dat. Note that this operator is similar to the function everything() from the tidyselect package.
- Plus (+) Operator: The plus operator is used to select variables matching a prefix from the data frame specified in data. For example, df.subset(+x, data = dat) selects all variables with the prefix x. Note that this operator is equivalent to the function starts_with() from the tidyselect package.
- Minus (-) Operator: The minus operator is used to select variables matching a suffix from the data frame specified in data. For example, df.subset(-y, data = dat) selects all variables with the suffix y. Note that this operator is equivalent to the function ends_with() from the tidyselect package.
- Tilde (~) Operator: The tilde operator is used to select variables containing a word from the data frame specified in data. For example, df.subset(~al, data = dat) selects all variables containing the word al. Note that this operator is equivalent to the function contains() from the tidyselect package.
- Colon (:) Operator: The colon operator is used to select a range of consecutive variables from the data frame specified in data. For example, df.subset(x:z, data = dat) selects all variables from x to z. Note that this operator is equivalent to the : operator from the select function in the dplyr package.
- Double Colon (::) Operator: The double colon operator is used to select numbered variables from the data frame specified in data. For example, df.subset(x1::x3, data = dat) selects the variables x1, x2, and x3. Note that this operator is similar to the function num_range() from the tidyselect package.
- Exclamation Point (!) Operator: The exclamation point operator is used to drop variables from the data frame specified in data or to take the complement of a set of variables. For example, df.subset(., !x, data = dat) selects all variables but x in dat, df.subset(., !~x, data = dat) selects all variables but those containing the word x, and df.subset(x:z, !x1:x3, data = dat) selects all variables from x to z but excludes all variables from x1 to x3. Note that this operator is equivalent to the ! operator from the select function in the dplyr package.
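The name-matching rules behind these operators can be approximated in base R. The following is a minimal sketch for illustration only: it is not the implementation used by df.subset, and the object names (vars, prefix, suffix, and so on) are hypothetical.

```r
# Base R approximations of the selection rules described in the list above.
# Illustration only; this is not how df.subset() is implemented.
vars <- names(iris)

# Plus operator (+Petal): names starting with the prefix "Petal"
prefix <- vars[startsWith(vars, "Petal")]

# Minus operator (-Width): names ending with the suffix "Width"
suffix <- vars[endsWith(vars, "Width")]

# Tilde operator (~al): names containing the literal string "al"
contain <- vars[grepl("al", vars, fixed = TRUE)]

# Colon operator (Sepal.Width:Petal.Width): consecutive columns by position
consecutive <- vars[match("Sepal.Width", vars):match("Petal.Width", vars)]

# Double colon operator (x1::x3): numbered names built from a stem and indices
numbered <- paste0("x", 1:3)

# Exclamation point operator (., !Sepal.Width): complement of a selection
complement <- setdiff(vars, "Sepal.Width")
```

Each resulting character vector can be used to index the data frame, e.g., iris[, prefix, drop = FALSE].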
Note that operators can be combined within the same function call. For example,
df.subset(+x, -y, !x2:x4, z, data = dat) selects all variables with
the prefix x and all variables with the suffix y, excludes the variables from
x2 to x4, and selects the variable z.
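One way to read a combined call is as a union of the selected sets followed by removal of the excluded set. The sketch below mimics df.subset(+x, -3, !x2:x3, data = anscombe) in base R under that assumption. The order in which df.subset applies the operators and arranges the resulting columns is not stated here, so the result may differ from what the function returns.

```r
# Assumption: selections are combined as a union, then exclusions are removed.
# Illustration only; this is not how df.subset() is implemented.
vars <- names(anscombe)

# +x and -3: variables with the prefix "x" or the suffix "3"
selected <- union(vars[startsWith(vars, "x")], vars[endsWith(vars, "3")])

# !x2:x3: consecutive variables from "x2" to "x3" to be dropped
excluded <- vars[match("x2", vars):match("x3", vars)]

# Remove the excluded variables and extract the remaining columns
keep <- setdiff(selected, excluded)
result <- anscombe[, keep, drop = FALSE]
```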
Value
Returns a data frame containing the variables selected in the argument
... and the rows selected in the argument subset.
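Because subset takes the logical expression as a character string, the row selection can be pictured as parsing that string and evaluating it within the data frame. The following base R sketch shows this idea; it is an assumption about the general mechanism, not a description of how df.subset works internally, and the object names are hypothetical.

```r
# Illustration only: evaluate a character-string condition within a data frame.
cond <- "Species == 'setosa'"

# Parse the string into an expression and evaluate it using the columns of iris
rows <- eval(parse(text = cond), envir = iris)

# Keep the rows for which the condition is TRUE
setosa <- iris[rows, , drop = FALSE]
```

The single quotation marks around setosa are what allow the value to sit inside the double-quoted condition string.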
Author(s)
Takuya Yanagida takuya.yanagida@univie.ac.at
References
Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
See Also
df.duplicated, df.merge, df.move, df.rbind, df.rename, df.sort
Examples
## Not run:
#-------------------------------------------------------------------------------
# Select single variables
# Example 1: Select 'Sepal.Length' and 'Petal.Width'
df.subset(Sepal.Length, Petal.Width, data = iris)
#-------------------------------------------------------------------------------
# Select all variables using the . operator
# Example 2a: Select all variables, select rows with 'Species' equal to 'setosa'
# Note that single quotation marks ('') are needed to specify 'setosa'
# within the double-quoted character string
df.subset(., data = iris, subset = "Species == 'setosa'")
# Example 2b: Select all variables, select rows with 'Petal.Length' smaller than 1.2
df.subset(., data = iris, subset = "Petal.Length < 1.2")
#-------------------------------------------------------------------------------
# Select variables matching a prefix using the + operator
# Example 3: Select variables with prefix 'Petal'
df.subset(+Petal, data = iris)
#-------------------------------------------------------------------------------
# Select variables matching a suffix using the - operator
# Example 4: Select variables with suffix 'Width'
df.subset(-Width, data = iris)
#-------------------------------------------------------------------------------
# Select variables containing a word using the ~ operator
# Example 5: Select variables containing 'al'
df.subset(~al, data = iris)
#-------------------------------------------------------------------------------
# Select consecutive variables using the : operator
# Example 6: Select all variables from 'Sepal.Width' to 'Petal.Width'
df.subset(Sepal.Width:Petal.Width, data = iris)
#-------------------------------------------------------------------------------
# Select numbered variables using the :: operator
# Example 7: Select all variables from 'x1' to 'x3' and 'y1' to 'y3'
df.subset(x1::x3, y1::y3, data = anscombe)
#-------------------------------------------------------------------------------
# Drop variables using the ! operator
# Example 8a: Select all variables but 'Sepal.Width'
df.subset(., !Sepal.Width, data = iris)
# Example 8b: Select all variables but 'Sepal.Width' to 'Petal.Width'
df.subset(., !Sepal.Width:Petal.Width, data = iris)
#-------------------------------------------------------------------------------
# Combine +, - , !, and : operators
# Example 9: Select variables with prefix 'x' and suffix '3', but exclude
# variables from 'x2' to 'x3'
df.subset(+x, -3, !x2:x3, data = anscombe)
## End(Not run)