nodal_attributes {ergm} | R Documentation
Specifying nodal attributes and their levels
Description
This document describes the ways to specify nodal attributes or
functions of nodal attributes, and which levels of categorical
factors to include. For the helper functions that facilitate this,
see nodal_attributes-API.
Usage
LARGEST(l, a)
SMALLEST(l, a)
COLLAPSE_SMALLEST(object, n, into)
Arguments
object, l, a, n, into
Specifying nodal attributes
Term nodal attribute arguments, typically called attr, attrs, by, or on, are interpreted as follows:
- a character string
Extract the vertex attribute with this name.
- a character vector of length > 1
Extract the vertex attributes and paste them together, separated by dots if the term expects categorical attributes and (typically) combine into a covariate matrix if it expects quantitative attributes.
- a function
The function is called on the LHS network and any additional arguments passed to ergm_get_vattr(), and is expected to return a vector or matrix of appropriate dimension. (Shorter vectors and matrix columns will be recycled as needed.)
- a formula
The expression on the RHS of the formula is evaluated in an environment of the vertex attributes of the network, and is expected to return a vector or matrix of appropriate dimension. (Shorter vectors and matrix columns will be recycled as needed.) Within this expression, the network itself is accessible as either . or .nw. For example, nodecov(~abs(Grade-mean(Grade))/network.size(.)) would return the absolute difference of each actor's "Grade" attribute from its network-wide mean, divided by the network size.
- an AsIs object created by I()
Use as is, checking only for correct length and type.
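To make the formula form described above more concrete, the following base-R sketch mimics the evaluation outside of ergm. The list vattrs is a hypothetical stand-in for a network's vertex attributes, and the use of eval() is an assumption made for illustration only; it is not necessarily how ergm_get_vattr() is implemented.

```r
# Hypothetical vertex attributes; in ergm these would come from the network.
vattrs <- list(Grade = c(7, 8, 9, 10))

# A one-sided formula, as it would be passed to a term such as nodecov().
f <- ~abs(Grade - mean(Grade))

# Take the RHS expression and evaluate it with the attributes visible
# as ordinary variables; anything else is looked up in the formula's
# own environment.
rhs <- f[[length(f)]]
centred <- eval(rhs, envir = vattrs, enclos = environment(f))
print(centred)
```

Because unresolved names fall back to the formula's environment, variables defined where the formula was written remain usable inside the expression in this sketch.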
Any of these arguments may also be wrapped in
COLLAPSE_SMALLEST(attr, n, into) or piped through it as
attr %>% COLLAPSE_SMALLEST(n, into), a convenience function that
transforms the attribute by collapsing the smallest n categories
into one, naming it into. Note that into must be of the same
type (numeric, character, etc.) as the vertex attribute in
question.
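The effect of collapsing the smallest categories can be sketched in base R as follows. This is only an illustration of the idea on a hypothetical attribute vector (race, n, and into are made-up values); it is not the implementation of COLLAPSE_SMALLEST(), whose handling of ties in particular may differ.

```r
# Hypothetical categorical attribute.
race <- c("White", "White", "Hisp", "Hisp", "Hisp",
          "NatAm", "NatAm", "NatAm", "NatAm", "Black", "Other")
n <- 2
into <- "BO"  # same type (character) as the attribute

# Find the n least frequent categories.
freq <- sort(table(race))
smallest <- names(freq)[seq_len(n)]

# Replace every value in those categories with the new label.
collapsed <- ifelse(race %in% smallest, into, race)
print(table(collapsed))
```

Here "Black" and "Other" each occur once, so they are the two smallest categories and are merged into the single category "BO".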
The name the nodal attribute receives in the statistic can be
overridden by setting an attr()-style attribute "name".
Specifying categorical attribute levels and their ordering
For categorical attributes, to select which levels are of interest
and their ordering, use the argument levels. Selection of nodes (from
the appropriate vector of nodal indices) is likewise handled as the
selection of levels, using the argument nodes. These arguments are
interpreted as follows:
- an expression wrapped in I()
Use the given list of levels as is.
- a numeric or logical vector
Used for indexing of a list of all possible levels (typically, unique values of the attribute) in default order (typically lexicographic), i.e., sort(unique(attr))[levels]. In particular, levels=TRUE will retain all levels. Negative values exclude. Another special value is LARGEST, which will refer to the most frequent category, so, say, to set such a category as the baseline, pass levels=-LARGEST. In addition, LARGEST(n) will refer to the n largest categories. SMALLEST works analogously. Note that if there are ties in frequencies, they will be broken arbitrarily. To specify numeric or logical levels literally, wrap in I().
- NULL
Retain all possible levels; usually equivalent to passing TRUE.
- a character vector
Use as is.
- a function
The function is called on the list of unique values of the attribute, the values of the attribute themselves, and the network itself, depending on its arity. Its return value is interpreted as above.
- a formula
The expression on the RHS of the formula is evaluated in an environment in which the network itself is accessible as .nw, the list of unique values of the attribute as . or as .levels, and the attribute vector itself as .attr. Its return value is interpreted as above.
- a matrix
For mixing effects (i.e., levels2= arguments), a matrix can be used to select elements of the mixing matrix, either by specifying a logical (TRUE and FALSE) matrix of the same dimension as the mixing matrix to select the corresponding cells or a two-column numeric matrix giving the coordinates of cells to be used.
Note that levels, nodes, and others often have a default that is
sensible for the term in question.
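The indexing rule sort(unique(attr))[levels] described above can be tried out in plain R, without ergm. The vector grade below is a hypothetical attribute, and the which.max(table(...)) line is only a rough stand-in for what LARGEST refers to, not the package's own code.

```r
# Hypothetical attribute values for ten nodes.
grade <- c(7, 7, 8, 9, 9, 9, 10, 11, 12, 12)

# All possible levels in default order.
all_levels <- sort(unique(grade))

keep_first_two <- all_levels[1:2]  # analogous to levels=1:2
drop_first <- all_levels[-1]       # analogous to levels=-1
keep_all <- all_levels[TRUE]       # analogous to levels=TRUE

# Rough stand-in for LARGEST: position of the most frequent level.
largest <- unname(which.max(table(grade)))
drop_largest <- all_levels[-largest]  # analogous to levels=-LARGEST

print(keep_first_two)
print(drop_largest)
```

Note that the numeric values index positions in the sorted list of levels rather than naming the levels themselves, which is why literal numeric levels need to be wrapped in I().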
Examples
library(magrittr) # for %>%
data(faux.mesa.high)
# Activity by grade with a baseline grade excluded:
summary(faux.mesa.high~nodefactor(~Grade))
# Name overrides:
summary(faux.mesa.high~nodefactor("Form"~Grade)) # Only for terms that don't use the LHS.
summary(faux.mesa.high~nodefactor(~structure(Grade,name="Form")))
# Retain all levels:
summary(faux.mesa.high~nodefactor(~Grade, levels=TRUE)) # or levels=NULL
# Use the largest grade as baseline (also Grade 7):
summary(faux.mesa.high~nodefactor(~Grade, levels=-LARGEST))
# Activity by grade with no baseline, with the smallest two grades
# (11 and 12) collapsed into a new category, labelled 0:
table(faux.mesa.high %v% "Grade")
summary(faux.mesa.high~nodefactor((~Grade) %>% COLLAPSE_SMALLEST(2, 0),
                                  levels=TRUE))
# Mixing between lower and upper grades:
summary(faux.mesa.high~mm(~Grade>=10))
# Mixing between grades 7 and 8 only:
summary(faux.mesa.high~mm("Grade", levels=I(c(7,8))))
# or
summary(faux.mesa.high~mm("Grade", levels=1:2))
# or using levels2 (see ? mm) to filter the combinations of levels,
summary(faux.mesa.high~mm("Grade",
                          levels2=~sapply(.levels,
                                          function(l)
                                            l[[1]]%in%c(7,8) && l[[2]]%in%c(7,8))))
# Here are some less complex ways to specify levels2. This is the
# full list of combinations of sexes in an undirected network:
summary(faux.mesa.high~mm("Sex", levels2=TRUE))
# Select only the second combination:
summary(faux.mesa.high~mm("Sex", levels2=2))
# Equivalently,
summary(faux.mesa.high~mm("Sex", levels2=-c(1,3)))
# or
summary(faux.mesa.high~mm("Sex", levels2=c(FALSE,TRUE,FALSE)))
# Select all *but* the second one:
summary(faux.mesa.high~mm("Sex", levels2=-2))
# Select via a mixing matrix: (Network is undirected and
# attributes are the same on both sides, so we can use either M or
# its transpose.)
(M <- matrix(c(FALSE,TRUE,FALSE,FALSE),2,2))
summary(faux.mesa.high~mm("Sex", levels2=M)+mm("Sex", levels2=t(M)))
# Select via an index of a cell:
idx <- cbind(1,2)
summary(faux.mesa.high~mm("Sex", levels2=idx))
# mm() term allows two-sided attribute formulas with different attributes:
summary(faux.mesa.high~mm(Grade~Race, levels2=TRUE))
# It is possible to have collapsing functions in the formula; note
# the parentheses around "~Race": this is because the formula
# operator (~) has lower precedence than the pipe (%>%):
summary(faux.mesa.high~mm(Grade~(~Race) %>% COLLAPSE_SMALLEST(3,"BWO"), levels2=TRUE))
# Some terms, such as nodecov(), accept matrices of nodal
# covariates. A certain R quirk means that columns whose
# expressions are not typical variable names have their names
# dropped and need to be adjusted. Consider, for example, the
# linear and quadratic effects of grade:
Grade <- faux.mesa.high %v% "Grade"
colnames(cbind(Grade, Grade^2)) # Second column name missing.
colnames(cbind(Grade, Grade2=Grade^2)) # Can be set manually,
colnames(cbind(Grade, `Grade^2`=Grade^2)) # even to non-variable-names.
colnames(cbind(Grade, Grade^2, deparse.level=2)) # Alternatively, deparse.level=2 forces naming.
rm(Grade)
# Therefore, the nodal attribute names are set as follows:
summary(faux.mesa.high~nodecov(~cbind(Grade, Grade^2))) # column names dropped with a warning
summary(faux.mesa.high~nodecov(~cbind(Grade, Grade2=Grade^2))) # column names set manually
summary(faux.mesa.high~nodecov(~cbind(Grade, Grade^2, deparse.level=2))) # using deparse.level=2
# Activity by grade with a random covariate. Note that setting an attribute "name" gives it a name:
randomcov <- structure(I(rbinom(network.size(faux.mesa.high),1,0.5)), name="random")
summary(faux.mesa.high~nodefactor(I(randomcov)))