R: Longitudinal classroom friendship network and behavior...

knecht {xergm.common}

R Documentation

Longitudinal classroom friendship network and behavior (Andrea Knecht)

Description

The Knecht dataset contains the friendship network of 26 pupils in a Dutch school class measured at four time points along with several demographic and behavioral covariates like age, sex, ethnicity, religion, delinquency, alcohol consumption, primary school co-attendance, and school advice. Some of these covariates are constant while others vary over time.

The full dataset (see Knecht 2006 and 2008) contains a large number of classrooms while the dataset presented here is an excerpt based on one single classroom. This excerpt was first used in a tutorial for the software Siena and the corresponding R package RSiena (Snijders, Steglich and van de Bunt 2010). The following description was largely copied from the original data description provided on the homepage of the Siena project (see below for the URL).

The data were collected between September 2003 and June 2004 by Andrea Knecht, supervised by Chris Baerveldt, at the Department of Sociology of the University of Utrecht (NL). The entire study is reported in Knecht (2008). The project was funded by the Netherlands Organisation for Scientific Research NWO, grant 401-01-554. The 26 students were followed over their first year at secondary school during which friendship networks as well as other data were assessed at four time points at intervals of three months. There were 17 girls and 9 boys in the class, aged 11–13 at the beginning of the school year. Network data were assessed by asking students to indicate up to twelve classmates which they considered good friends. Delinquency is defined as a rounded average over four types of minor delinquency (stealing, vandalism, graffiti, and fighting), measured in each of the four waves of data collection. The five-point scale ranged from ‘never’ to 'more than 10 times', and the distribution is highly skewed. In a range of 1–5, the mode was 1 at all four waves, the average rose over time from 1.4 to 2.0, and the value 5 was never observed.

Usage

data(knecht)

Format

Note: the data have to be transformed before they can be used with btergm and related packages (see examples below).

friendship: is a list of adjacency matrices at four time points, containing friendship nominations of the column node by the row node. The following values are used: 0 = no, 1 = yes, NA = missing, 10 = not a member of the classroom (structural zero).
demographics: is a data frame with 26 rows (the pupils) and four demographic variables about the pupils:

sex (1 = girl, 2 = boy)
age (in years)
ethnicity (1 = Dutch, 2 = other, 0 = missing)
religion (1 = Christian, 2 = non-religious, 3 = non-Christian religion, 0 = missing)

primary: is a 26 x 26 matrix indicating whether two pupils attended the same primary school. 0 = no, 1 = yes.
delinquency: is a data frame with 26 rows (the pupils) and four columns (the four time steps). It contains the rounded average of four items (stealing, vandalizing, fighting, graffiti). Categories: frequency over last three months, 1 = never, 2 = once, 3 = 2–4 times, 4 = 5–10 times, 5 = more than 10 times; 0 = missing.
alcohol: is a data frame with 26 rows (the pupils) and 3 columns (waves 2, 3, and 4). It contains data on alcohol use (“How often did you drink alcohol with friends in the last three months?”). Categories: 1 = never, 2 = once, 3 = 2–4 times, 4 = 5–10 times, 5 = more than 10 times; 0 = missing.
advice: is a data frame with one variable, “school advice”, the assessment given at the end of primary school about the school capabilities of the pupil (4 = low, 8 = high, 0 = missing)

Source

The data were gathered by Andrea Knecht, as part of her PhD research, building on methods developed by Chris Baerveldt, initiator and supervisor of the project. The project is funded by the Netherlands Organisation for Scientific Research NWO, grant 401-01-554, and is part of the research program “Dynamics of Networks and Behavior” with principle investigator Tom A. B. Snijders.

Complete original data: https://easy.dans.knaw.nl/ui/datasets/id/easy-dataset:48665
This excerpt in Siena format: http://www.stats.ox.ac.uk/~snijders/siena/klas12b.zip
Siena dataset description: http://www.stats.ox.ac.uk/~snijders/siena/tutorial2010_data.htm

Permission to redistribute this dataset along with this package was granted by Andrea Knecht on April 17, 2014. Questions about the data or the original study should be directed to her.

References

Knecht, Andrea (2006): Networks and Actor Attributes in Early Adolescence [2003/04]. Utrecht, The Netherlands Research School ICS, Department of Sociology, Utrecht University. (ICS-Codebook no. 61).

Knecht, Andrea (2008): Friendship Selection and Friends' Influence. Dynamics of Networks and Actor Attributes in Early Adolescence. PhD Dissertation, University of Utrecht. http://dspace.library.uu.nl/bitstream/handle/1874/25950/full.pdf.

Knecht, Andrea, Tom A. B. Snijders, Chris Baerveldt, Christian E. G. Steglich, and Werner Raub (2010): Friendship and Delinquency: Selection and Influence Processes in Early Adolescence. Social Development 19(3): 494–514. http://dx.doi.org/10.1111/j.1467-9507.2009.00564.x.

Leifeld, Philip and Skyler J. Cranmer (2014): A Theoretical and Empirical Comparison of the Temporal Exponential Random Graph Model and the Stochastic Actor-Oriented Model. Paper presented at the 7th Political Networks Conference, McGill University, Montreal, Canada, May 30. http://arxiv.org/abs/1506.06696.

Leifeld, Philip, Skyler J. Cranmer and Bruce A. Desmarais (2017): Temporal Exponential Random Graph Models with btergm: Estimation and Bootstrap Confidence Intervals. Journal of Statistical Software.

Snijders, Tom A. B., Christian E. G. Steglich, and Gerhard G. van de Bunt (2010): Introduction to Actor-Based Models for Network Dynamics. Social Networks 32: 44–60. http://dx.doi.org/10.1016/j.socnet.2009.02.004.

Steglich, Christian E. G. and Andrea Knecht (2009): Die statistische Analyse dynamischer Netzwerkdaten. In: Stegbauer, Christian and Roger Haeussling (editors), Handbuch der Netzwerkforschung, Wiesbaden: Verlag fuer Sozialwissenschaften.

Examples

## Not run: 
# ====================================================================
# The following example was taken from the JSS article about btergm 
# that is referenced above (Leifeld, Cranmer and Desmarais 2017).
# ====================================================================

require("texreg")
require("sna")
require("xergm")
require("RSiena")
data("knecht")

# step 1: make sure the network matrices have node labels
for (i in 1:length(friendship)) {
  rownames(friendship[[i]]) <- 1:nrow(friendship[[i]])
  colnames(friendship[[i]]) <- 1:ncol(friendship[[i]])
}
rownames(primary) <- rownames(friendship[[1]])
colnames(primary) <- colnames(friendship[[1]])
sex <- demographics$sex
names(sex) <- 1:length(sex)

# step 2: imputation of NAs and removal of absent nodes:
friendship <- handleMissings(friendship, na = 10, method = "remove")
friendship <- handleMissings(friendship, na = NA, method = "fillmode")

# step 3: add nodal covariates to the networks
for (i in 1:length(friendship)) {
  s <- adjust(sex, friendship[[i]])
  friendship[[i]] <- network(friendship[[i]])
  friendship[[i]] <- set.vertex.attribute(friendship[[i]], "sex", s)
  idegsqrt <- sqrt(degree(friendship[[i]], cmode = "indegree"))
  friendship[[i]] <- set.vertex.attribute(friendship[[i]],
      "idegsqrt", idegsqrt)
  odegsqrt <- sqrt(degree(friendship[[i]], cmode = "outdegree"))
  friendship[[i]] <- set.vertex.attribute(friendship[[i]],
      "odegsqrt", odegsqrt)
}
sapply(friendship, network.size)

# step 4: plot the networks
pdf("knecht.pdf")
par(mfrow = c(2, 2), mar = c(0, 0, 1, 0))
for (i in 1:length(friendship)) {
  plot(network(friendship[[i]]), main = paste("t =", i),
  usearrows = TRUE, edge.col = "grey50")
}
dev.off()

# step 5: estimate TERGMS without and with temporal dependencies
model.2a <- btergm(friendship ~ edges + mutual + ttriple +
    transitiveties + ctriple + nodeicov("idegsqrt") +
    nodeicov("odegsqrt") + nodeocov("odegsqrt") +
    nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") +
    edgecov(primary), R = 100)

model.2b <- btergm(friendship ~ edges + mutual + ttriple +
    transitiveties + ctriple + nodeicov("idegsqrt") +
    nodeicov("odegsqrt") + nodeocov("odegsqrt") +
    nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") +
    edgecov(primary) + delrecip + memory(type = "stability"),
    R = 100)

# step 6: alternatively, estimate via MCMC-MLE:
model.2d <- mtergm(friendship ~ edges + mutual + ttriple +
    transitiveties + ctriple + nodeicov("idegsqrt") +
    nodeicov("odegsqrt") + nodeocov("odegsqrt") +
    nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") +
    edgecov(primary) + delrecip + memory(type = "stability"),
    control = control.ergm(MCMC.samplesize = 5000, MCMC.interval = 2000))

# step 7: GOF assessment with out-of-sample prediction
model.2e <- btergm(friendship[1:3] ~ edges + mutual + ttriple +
    transitiveties + ctriple + nodeicov("idegsqrt") +
    nodeicov("odegsqrt") + nodeocov("odegsqrt") +
    nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") +
    edgecov(primary) + delrecip + memory(type = "stability"),
    R = 100)

gof.2e <- gof(model.2e, nsim = 100, target = friendship[[4]],
    formula = friendship[3:4] ~ edges + mutual + ttriple +
    transitiveties + ctriple + nodeicov("idegsqrt") +
    nodeicov("odegsqrt") + nodeocov("odegsqrt") +
    nodeofactor("sex") + nodeifactor("sex") + nodematch("sex") +
    edgecov(primary) + delrecip + memory(type = "stability"),
    coef = coef(model.2b), statistics = c(esp, dsp, geodesic,
    deg, triad.undirected, rocpr))
pdf("gof-2e.pdf", width = 8, height = 6)
plot(gof.2e)
dev.off()

## End(Not run)

[Package xergm.common version 1.7.8 Index]