R: Likelihood Distance.

Likedist {influence.SEM}

R Documentation

Likelihood Distance.

Description

A general model-based measure of case influence on model fit is likelihood distance (Cook, 1977, 1986; Cook & Weisberg, 1982) defined as

LD_i=2[L(\hat{\mathbf{\theta}})-L(\hat{\mathbf{\theta}}_{(i)})]

where \hat{\mathbf{\theta}} and \hat{\mathbf{\theta}}_{(i)} are the k \times 1 vectors of model parameters estimated on the original sample and on the sample with case i deleted, respectively, for i = 1, \ldots, N. The subscript (i) indicates that the estimate was computed on the sample excluding case i. L(\hat{\mathbf{\theta}}) and L(\hat{\mathbf{\theta}}_{(i)}) are the log-likelihoods evaluated at the original-sample and the deleted-case estimates, respectively.

Usage

Likedist(model, data, ...)

Arguments

`model`	A description of the user-specified model using the lavaan model syntax. See `lavaan` for more information.
`data`	A data frame containing the observed variables used in the model. If any variables are declared as ordered factors, this function will treat them as ordinal variables.
`...`	Additional arguments passed to the `sem` function.

Details

The log-likelihoods L(\hat{\mathbf{\theta}}) and L(\hat{\mathbf{\theta}}_{(i)}) are computed by the function bollen.loglik using formula 4B2 in Bollen (1989, p. 135).

The likelihood distance gives the amount by which the log-likelihood of the full data changes when it is evaluated at the reduced-data estimates instead of the full-data estimates. The important point is that L(\hat{\mathbf{\theta}}_{(i)}) is not the log-likelihood obtained by fitting the model to the reduced data set. It is obtained by evaluating the likelihood function based on the full data set (containing all N observations) at the reduced-data estimates (Schabenberger, 2005).
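The distinction can be illustrated outside of SEM with a minimal base-R sketch. It assumes a univariate normal model with ML estimates of the mean and variance, not the lavaan model that Likedist fits, and the helper names `ml_est` and `full_loglik` as well as the simulated data are hypothetical. Both log-likelihoods are evaluated on the full sample; only the parameter estimates differ.

```r
# Toy illustration (assumption: univariate normal model, not an SEM).
set.seed(1)
y <- c(rnorm(29), 6)   # 29 regular cases plus one outlying case (index 30)
N <- length(y)

# ML estimates of mean and variance (variance uses divisor n, not n - 1)
ml_est <- function(x) {
  m <- mean(x)
  c(mu = m, sigma2 = mean((x - m)^2))
}

# Log-likelihood of the FULL sample y evaluated at a parameter vector theta
full_loglik <- function(theta) {
  sum(dnorm(y, mean = theta[["mu"]], sd = sqrt(theta[["sigma2"]]), log = TRUE))
}

theta_hat <- ml_est(y)

# LD_i = 2 * [L(theta_hat) - L(theta_hat_(i))], both evaluated on the full data
LD <- vapply(seq_len(N), function(i) {
  theta_i <- ml_est(y[-i])   # estimates computed without case i
  2 * (full_loglik(theta_hat) - full_loglik(theta_i))
}, numeric(1))

which.max(LD)   # index of the most influential case under this toy model
```

Because the full-data estimates maximize the full-data log-likelihood, every LD_i in this sketch is non-negative up to rounding error, which would not be guaranteed if the reduced-data log-likelihood were used instead.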

Value

Returns a numeric vector of LD_i values, one for each case i = 1, \ldots, N.

Note

If, for observation i, the model does not converge or yields a solution with negative estimated variances, the associated value of LD_i is set to NA.
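In practice the NA entries need to be set aside before ranking or plotting the distances. A minimal sketch, assuming `LD` is a numeric vector shaped like the return value of Likedist; the values below are made up for illustration only.

```r
# Hypothetical vector standing in for the output of Likedist();
# NA marks cases whose deletion gave a non-converged or improper solution.
LD <- c(0.12, 0.05, NA, 1.87, 0.30, NA, 0.02)

# Cases for which no likelihood distance is available
failed <- which(is.na(LD))

# Case indices ordered from largest to smallest LD; na.last = NA drops the NAs
ranked <- order(LD, decreasing = TRUE, na.last = NA)

failed
head(ranked, 3)
```

Keeping the indices of the failed cases separate, rather than silently dropping them, makes it possible to inspect those cases on their own.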

Author(s)

Massimiliano Pastore, Gianmarco Altoe'

References

Bollen, K.A. (1989). Structural Equations with Latent Variables. New York, NY: Wiley.

Cook, R.D. (1977). Detection of influential observations in linear regression. Technometrics, 19, 15-18.

Cook, R.D. (1986). Assessment of local influence. Journal of the Royal Statistical Society B, 48, 133-169.

Cook, R.D., Weisberg, S. (1982). Residuals and influence in regression. New York, NY: Chapman & Hall.

Pek, J., MacCallum, R.C. (2011). Sensitivity Analysis in Structural Equation Models: Cases and Their Influence. Multivariate Behavioral Research, 46, 202-228.

Schabenberger, O. (2005). Mixed model influence diagnostics. In SUGI, 29, 189-29. SAS Institute Inc., Cary, NC.

Examples

## not run: this example takes several minutes
data("PDII")
model <- "
  F1 =~ y1+y2+y3+y4
"
# fit0 <- sem(model, data = PDII)
# LD <- Likedist(model, data = PDII)
# plot(LD, pch = 19, xlab = "observations", ylab = "Likelihood distances")

## not run: this example takes several minutes
## an example in which the deletion of a case yields a solution
## with negative estimated variances
model <- "
  F1 =~ x1+x2+x3
  F2 =~ y1+y2+y3+y4
  F3 =~ y5+y6+y7+y8
"

# fit0 <- sem(model, data = PDII)
# LD <- Likedist(model, data = PDII)
# plot(LD, pch = 19, xlab = "observations", ylab = "Likelihood distances")

[Package influence.SEM version 2.3 Index]