assoc_graph {netropy} | R Documentation |
Association Graphs
Description
Draws association graphs (graphical models) based on joint entropy values to detect and visualize different dependence structures among the variables in the dataframe.
Usage
assoc_graph(dat, cutoff = 0)
Arguments
dat |
data frame with rows as observations and columns as variables. All variables must be categorical with finite range spaces, either as observed or after transformation. |
cutoff |
the cutoff point on the joint entropy values that determines which edges are drawn. The default is 0, which draws all edges. |
Details
Draws association graphs based on given thresholds of joint entropy values between pairs of variables represented as nodes. Thickness of edges between pairs of nodes/variables indicates the strength of dependence between them. Isolated nodes are completely independent and paths through certain nodes/variables indicate conditional dependencies.
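The thresholding idea described above can be illustrated with a minimal base R sketch. It is not the package's implementation. It assumes that the pairwise "joint entropy" is the quantity J(X,Y) = H(X) + H(Y) - H(X,Y) measured in bits, and that an edge is kept when its value is at least the cutoff. The helper names `entropy_bits` and `pairwise_assoc` and the `toy` data are hypothetical and exist only for this illustration.

```r
# Shannon entropy (in bits) of the empirical distribution of a categorical
# vector, or of the joint distribution of the columns of a data frame.
entropy_bits <- function(x) {
  p <- prop.table(table(x))
  p <- p[p > 0]               # drop empty cells so log2() is defined
  -sum(p * log2(p))
}

# Hypothetical helper: symmetric matrix of J(X,Y) = H(X) + H(Y) - H(X,Y)
# for every pair of columns in dat (assumed definition, see lead-in).
pairwise_assoc <- function(dat) {
  k <- ncol(dat)
  J <- matrix(0, k, k, dimnames = list(names(dat), names(dat)))
  for (i in seq_len(k - 1)) {
    for (j in (i + 1):k) {
      J[i, j] <- J[j, i] <-
        entropy_bits(dat[[i]]) + entropy_bits(dat[[j]]) -
        entropy_bits(dat[, c(i, j)])
    }
  }
  J
}

# Toy data: b duplicates a, while c is independent of both in this sample.
toy <- data.frame(
  a = c(0, 0, 1, 1, 0, 0, 1, 1),
  b = c(0, 0, 1, 1, 0, 0, 1, 1),
  c = c(0, 1, 0, 1, 0, 1, 0, 1)
)

J <- pairwise_assoc(toy)

# Keep only pairs whose value reaches the cutoff (assumed rule); no self-loops.
cutoff <- 0.15
adj <- J >= cutoff
diag(adj) <- FALSE
print(round(J, 3))
print(adj)
```

In this toy example only the pair a-b survives the cutoff, so c would appear as an isolated node, which matches the reading of isolated nodes as independent variables.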
Value
A ggraph object with nodes representing all variables in dat and edges representing (the strength of) associations between them based on joint entropies.
Author(s)
Termeh Shafie
References
Frank, O., & Shafie, T. (2016). Multivariate entropy analysis of network data. Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique, 129(1), 45-63.
Nowicki, K., Shafie, T., & Frank, O. (Forthcoming 2022). Statistical Entropy Analysis of Network Data.
See Also
Examples
library(netropy)
library(ggraph)
# use internal data set
data(lawdata)
df.att <- lawdata[[4]]
# three steps of data editing:
# 1. categorize variables 'years' and 'age' into three groups of
# approximately equal size (cut points based on the cdf)
# 2. make sure all outcomes start from the value 0 (optional)
# 3. remove variable 'senior' as it consists of only unique values (thus redundant)
df.att.ed <- data.frame(
  status    = df.att$status,
  gender    = df.att$gender,
  office    = df.att$office - 1,
  years     = ifelse(df.att$years <= 3, 0,
                     ifelse(df.att$years <= 13, 1, 2)),
  age       = ifelse(df.att$age <= 35, 0,
                     ifelse(df.att$age <= 45, 1, 2)),
  practice  = df.att$practice,
  lawschool = df.att$lawschool - 1)
# association graph based on cutoff 0.15
assoc_graph(df.att.ed, 0.15)