netinf {NetworkInference} | R Documentation |
Infer latent diffusion network
Description
Infer a network of diffusion ties from a set of cascades. Each cascade is defined by pairs of node ids and infection times.
Usage
netinf(cascades, trans_mod = "exponential", n_edges = NULL,
p_value_cutoff = NULL, params = NULL, quiet = FALSE,
trees = FALSE)
Arguments
cascades |
an object of class cascade containing node and cascade
information. See |
trans_mod |
character, indicating the choice of model:
|
n_edges |
integer, number of edges to infer. Leave unspecified if using
|
p_value_cutoff |
numeric, in the interval (0, 1). If
specified, edges are inferred in each iteration until the Vuong test for
edge addition reaches the p-value cutoff or until the maximum
possible number of edges is reached. Leave unspecified if using
|
params |
numeric, parameters for the diffusion model. If left unspecified, reasonable parameters are inferred from the data. See Details for how to specify parameters for the different distributions. |
quiet |
logical, should output on progress be suppressed? |
trees |
logical, should the inferred cascade trees be returned? Note that this changes the structure of the function output. See section Value for details. |
Details
The algorithm is described in detail in Gomez-Rodriguez et al. (2010). Additional information can be found on the netinf website (http://snap.stanford.edu/netinf/).
- Exponential distribution: trans_mod = "exponential", params = c(lambda). Parametrization: \lambda e^{-\lambda x}.
- Rayleigh distribution: trans_mod = "rayleigh", params = c(alpha). Parametrization: \frac{x}{\alpha^2} e^{-\frac{x^2}{2\alpha^2}}.
- Log-normal distribution: trans_mod = "log-normal", params = c(mu, sigma). Parametrization: \frac{1}{x\sigma\sqrt{2\pi}} e^{-\frac{(\ln x - \mu)^2}{2\sigma^2}}.
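The three parametrizations can be checked numerically in base R. The following is a minimal sketch, not code from the package: the helper names d_exponential, d_rayleigh and d_lognormal are hypothetical, and the Rayleigh helper assumes the standard Rayleigh density with scale alpha.

```r
# Illustrative helpers only; these names are not part of NetworkInference.
d_exponential <- function(x, lambda) lambda * exp(-lambda * x)

# Assumes the standard Rayleigh density with scale parameter alpha.
d_rayleigh <- function(x, alpha) (x / alpha^2) * exp(-x^2 / (2 * alpha^2))

d_lognormal <- function(x, mu, sigma) {
  1 / (x * sigma * sqrt(2 * pi)) * exp(-(log(x) - mu)^2 / (2 * sigma^2))
}

x <- c(0.5, 1, 2)

# Cross-check against the densities built into base R.
exp_ok <- isTRUE(all.equal(d_exponential(x, lambda = 1.5), dexp(x, rate = 1.5)))
lnorm_ok <- isTRUE(all.equal(d_lognormal(x, mu = 0, sigma = 1),
                             dlnorm(x, meanlog = 0, sdlog = 1)))

# A proper density integrates to 1 over its support.
rayleigh_mass <- integrate(d_rayleigh, lower = 0, upper = Inf, alpha = 2)$value
```

Under these assumptions, a value such as params = 1.5 for the exponential model corresponds to the rate argument of dexp, and params = c(0, 1) for the log-normal model corresponds to meanlog and sdlog in dlnorm.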
For very large data sets, or when higher performance is required, a faster pure C++ implementation is available in the Stanford Network Analysis Project (SNAP). The software can be downloaded at http://snap.stanford.edu/netinf/.
Value
Returns the inferred diffusion network as an edgelist in an object of class diffnet and data.frame. The first column contains the sender, the second column the receiver node. The third column contains the improvement in fit from adding the edge that is represented by the row. The output additionally has the following attributes:
- "diffusion_model": The diffusion model used to infer the diffusion network.
- "diffusion_model_parameters": The parameters for the model that have been inferred by the approximate profile MLE procedure.
If the argument trees is set to TRUE, the output is a list with the first element being the data.frame described above, and the second element being the trees in edge-list form in a single data.frame.
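As a sketch of how the return value described above might be inspected, assuming the NetworkInference package is installed and using the example data and call from the Examples section (the variable names res, network and cascade_trees are illustrative):

```r
library(NetworkInference)

# Example data shipped with the package, already in cascades format.
data(cascades)

# Default output: an edgelist with attributes describing the model.
out <- netinf(cascades, trans_mod = "exponential", n_edges = 5, params = 1)
model <- attr(out, "diffusion_model")
model_params <- attr(out, "diffusion_model_parameters")

# With trees = TRUE the output is a list instead:
# first element is the edgelist, second element holds the cascade trees.
res <- netinf(cascades, trans_mod = "exponential", n_edges = 5, params = 1,
              trees = TRUE)
network <- res[[1]]
cascade_trees <- res[[2]]
```

Code that consumes the result therefore needs to branch on whether trees was requested, since the edgelist sits at the top level in one case and in the first list element in the other.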
References
M. Gomez-Rodriguez, J. Leskovec, A. Krause. Inferring Networks of Diffusion and Influence. The 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), 2010.
Examples
library(NetworkInference)

# Data already in cascades format:
data(cascades)
out <- netinf(cascades, trans_mod = "exponential", n_edges = 5, params = 1)

# Starting with a data frame:
df <- simulate_rnd_cascades(10, n_nodes = 20)
cascades2 <- as_cascade_long(df, node_names = unique(df$node_name))
out <- netinf(cascades2, trans_mod = "exponential", n_edges = 5, params = 1)