penmodel {FamEvent}    R Documentation

Fit a penetrance model

Description

Fits a penetrance model for family data based on a prospective likelihood with ascertainment correction and provides model parameter estimates.

Usage

penmodel(formula, cluster = "famID", gvar = "mgene", parms, cuts = NULL, data, 
design = "pop", base.dist = "Weibull", frailty.dist = "none",
agemin = NULL, robust = FALSE)

Arguments

formula

A formula expression as for other regression models. The response should be a survival object as returned by the Surv function. See the documentation for Surv, lm and formula for details.

cluster

Name of cluster variable. Default is "famID".

gvar

Name of genetic variable. Default is "mgene".

parms

Vector of initial values for the parameters in the model, including baseline parameters and regression coefficients: parms = c(baseparm, coef), where baseparm contains the initial values for the baseline parameters used by base.dist, and coef contains the initial values for the regression coefficients of the variables specified in formula. If frailty.dist is specified, the initial value of the frailty parameter should be appended: parms = c(baseparm, coef, k), where k is the initial value for the frailty parameter. See Details for the baseline parameters.

cuts

Vector of cut points that define the intervals where the hazard function is constant. The cuts should be specified when base.dist="piecewise" and must be strictly positive and finite. Default is NULL.

data

Data frame generated from simfam or data frame containing variables named in the formula and specific variables: famID, indID, gender, currentage, mgene, time, status and weight with attr(data,"agemin") specified.

design

Study design of the family data. Possible choices are: "pop", "pop+", "cli", "cli+" or "twostage", where "pop" is for the population-based design with affected probands whose mutation status can be either carrier or non-carrier, "pop+" is similar to "pop" but with mutation carrier probands, "cli" is for the clinic-based design that includes affected probands with at least one parent and one sib affected, "cli+" is similar to "cli" but with mutation carrier probands, and "twostage" is for the two-stage design with oversampling of high-risk families. Default is "pop".

base.dist

Choice of baseline hazard distributions to fit. Possible choices are: "Weibull", "loglogistic", "Gompertz", "lognormal", "gamma", "logBurr", or "piecewise". Default is "Weibull".

frailty.dist

Choice of frailty distribution. Possible choices are: "gamma", "lognormal", or "none". Default is "none".

agemin

Minimum age of disease onset or minimum age. Default is NULL.

robust

Logical; if TRUE, the robust ‘sandwich’ standard errors and variance-covariance matrix are provided, otherwise the conventional standard errors and variance-covariance matrix are provided. Default is FALSE.
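
As a minimal sketch of how the parms argument described above is assembled, the following builds the vector with and without a frailty parameter. The numeric values are illustrative starting values only, and the names baseparm, coef_init and k are local variables chosen here, not arguments of penmodel.

```r
# Illustrative starting values only; pick values suited to your own data.
baseparm  <- c(0.01, 3)      # baseline parameters for base.dist = "Weibull"
coef_init <- c(-1.13, 2.35)  # one value per term in the formula: gender, mgene
k         <- 1               # frailty parameter, used only with a frailty distribution

# frailty.dist = "none": baseline parameters followed by regression coefficients
parms_no_frailty <- c(baseparm, coef_init)

# frailty.dist = "gamma" or "lognormal": append the frailty parameter last
parms_frailty <- c(baseparm, coef_init, k)

length(parms_no_frailty)
length(parms_frailty)
```

The order matters: baseline parameters first, then regression coefficients in the order of the formula terms, then the frailty parameter if one is used.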

Details

When frailty.dist = "none", the following penetrance model is fitted to family data with a specified baseline hazard distribution: h(t|xs, xg) = h0(t - t0) exp(βs * xs + βg * xg), where h0(t) is the baseline hazard function specified by base.dist, which depends on the scale and shape parameters, λ and ρ; t0 is a minimum age of disease onset; xs indicates male (1) or female (0); and xg indicates carrier (1) or non-carrier (0) of a gene of interest (major gene). Additional covariates can be added to the model through formula.
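
To make the model without frailty concrete, here is a rough sketch that evaluates the hazard and the implied penetrance, 1 - exp(-H0(t - t0) exp(βs * xs + βg * xg)), in plain R. It assumes a Weibull baseline with cumulative hazard H0(t) = (λt)^ρ; this parameterization is an assumption made for illustration and is not stated on this page. The functions hazard and penetrance_curve are hypothetical helpers, not part of FamEvent, and t0 denotes the minimum age of disease onset.

```r
# Assumed Weibull baseline (an assumption, not taken from this page):
#   H0(t) = (lambda * t)^rho,  h0(t) = lambda * rho * (lambda * t)^(rho - 1)
weibull_hazard <- function(t, lambda, rho) lambda * rho * (lambda * t)^(rho - 1)
weibull_cumhaz <- function(t, lambda, rho) (lambda * t)^rho

# Hazard at a given age for sex indicator xs and carrier indicator xg
hazard <- function(age, xs, xg, lambda, rho, beta_s, beta_g, t0) {
  weibull_hazard(age - t0, lambda, rho) * exp(beta_s * xs + beta_g * xg)
}

# Penetrance by a given age: 1 - S(age), with S(age) = exp(-cumulative hazard)
penetrance_curve <- function(age, xs, xg, lambda, rho, beta_s, beta_g, t0) {
  1 - exp(-weibull_cumhaz(age - t0, lambda, rho) * exp(beta_s * xs + beta_g * xg))
}

# Illustrative values: female carriers versus female non-carriers at ages 30 to 70
ages <- seq(30, 70, by = 10)
carrier    <- penetrance_curve(ages, xs = 0, xg = 1, lambda = 0.01, rho = 3,
                               beta_s = -1.13, beta_g = 2.35, t0 = 20)
noncarrier <- penetrance_curve(ages, xs = 0, xg = 0, lambda = 0.01, rho = 3,
                               beta_s = -1.13, beta_g = 2.35, t0 = 20)
data.frame(age = ages, carrier = carrier, noncarrier = noncarrier)
```

Under this proportional hazards form, the ratio of carrier to non-carrier hazards at any age is exp(βg), whatever baseline is chosen.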

When frailty.dist is specified as either "gamma" or "lognormal", the following shared frailty model is fitted to family data: h(t|X,Z) = h0(t - t0) Z exp(βs * xs + βg * xg), where h0(t) is the baseline hazard function, t0 is a minimum age of disease onset, and Z represents a frailty shared within families whose distribution is specified by frailty.dist.

Choice of frailty distributions

frailty.dist = "gamma" assumes Z follows Gamma(k, 1/k).

frailty.dist = "lognormal" assumes Z follows a log-normal distribution whose logarithm has mean 0 and variance 1/k.

frailty.dist = "none" shares no frailties within families and assumes independence among family members.
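
The role of k in the two frailty distributions can be checked by simulation. The sketch below reads Gamma(k, 1/k) as shape k and scale 1/k, and reads the log-normal case as log(Z) being normal with mean 0 and variance 1/k; both readings are assumptions made here for illustration. It uses only base R random generators, not FamEvent code.

```r
set.seed(1)
k <- 2      # illustrative frailty parameter
n <- 1e5    # number of simulated families

# Gamma frailty, assuming shape = k and scale = 1/k: mean 1, variance 1/k
z_gamma <- rgamma(n, shape = k, scale = 1 / k)

# Log-normal frailty, assuming log(Z) ~ Normal(mean = 0, variance = 1/k)
z_lnorm <- rlnorm(n, meanlog = 0, sdlog = sqrt(1 / k))

# Empirical moments to compare against 1 and 1/k (gamma) and 0 and 1/k (log scale)
c(mean = mean(z_gamma), var = var(z_gamma))
c(mean_log = mean(log(z_lnorm)), var_log = var(log(z_lnorm)))
```

Under these readings, a larger k means less variation in frailty between families, so the model approaches the independence case of frailty.dist = "none".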

For family data arising from population- or clinic-based study designs (design="pop", "pop+", "cli", or "cli+"), the parameters of the penetrance model are estimated using the ascertainment-corrected prospective likelihood approach (Choi, Kopciuk and Briollais, 2008).

For family data arising from a two-stage study design (design="twostage"), model parameters are estimated using the composite likelihood approach (Choi and Briollais, 2011).

Note that the baseline parameters include lambda and rho, which represent the scale and shape parameters, respectively, and eta, an additional parameter to specify for the "logBurr" distribution. For the "lognormal" baseline distribution, lambda and rho represent the location and scale parameters for the normally distributed logarithm, where lambda can take any real value and rho > 0. For the other baseline distributions, lambda > 0, rho > 0, and eta > 0. When a piecewise constant distribution is specified for the baseline hazard, base.dist="piecewise", baseparm should specify the initial interval-constant values, one more than the number of cut points specified by cuts.
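
For the piecewise case, the bookkeeping between cuts and baseparm can be sketched as follows. The cut points and hazard values are made up for illustration, and n_intervals, baseparm and coef_init are local names chosen here rather than anything defined by the package.

```r
# Illustrative cut points: strictly positive and finite, as required for cuts
cuts <- c(30, 40, 50, 60)
stopifnot(all(is.finite(cuts)), all(cuts > 0))

# One constant hazard per interval: one more value than there are cut points
n_intervals <- length(cuts) + 1
baseparm <- rep(0.005, n_intervals)  # illustrative positive starting values

# Regression coefficients follow the baseline values, as for other baselines
coef_init <- c(-1.13, 2.35)
parms <- c(baseparm, coef_init)

length(baseparm)
length(parms)
```

With four cut points this gives five interval-constant starting values, followed by the two regression coefficients.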

Transformed baseline parameters are used for estimation; log transformation is applied to both scale and shape parameters (λ, ρ) for "Weibull", "loglogistic", "Gompertz" and "gamma" baselines, to (λ, ρ, η) for "logBurr" and to the piecewise constant parameters for a piecewise baseline hazard. For "lognormal" baseline distribution, the log transformation is applied only to ρ, not to λ, which represents the location parameter for the normally distributed logarithm.
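
Because the baseline parameters are estimated on a transformed scale, reading them on the original scale requires back-transformation. The sketch below illustrates this with made-up numbers in base R; the helper to_original and the standard error value are hypothetical, and the delta-method step is a standard approximation rather than something this page documents.

```r
# Made-up baseline values on the original scale
lambda <- 0.01
rho    <- 3

# Weibull, loglogistic, Gompertz, gamma: both parameters are log-transformed
theta_weibull <- c(log(lambda), log(rho))

# Hypothetical helper: map transformed baseline parameters back to the original scale
to_original <- function(theta, base.dist) {
  if (base.dist == "lognormal") {
    c(theta[1], exp(theta[2]))  # lambda is left untransformed, only rho is logged
  } else {
    exp(theta)                  # all baseline parameters are logged
  }
}

to_original(theta_weibull, "Weibull")

# lognormal baseline: the location lambda may be any real value
to_original(c(4.2, log(0.5)), "lognormal")

# Delta-method approximation: se(lambda) is roughly lambda * se(log(lambda))
se_log_lambda <- 0.1  # hypothetical standard error on the log scale
se_lambda <- exp(theta_weibull[1]) * se_log_lambda
se_lambda
```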

Penetrance estimates at given ages, together with their standard errors and 95% confidence intervals, can be obtained with the penetrance function, which uses Monte Carlo simulations of the estimated penetrance model.

Value

Returns an object of class 'penmodel', including the following elements:

estimates

Parameter estimates of transformed baseline parameters and regression coefficients.

varcov

Variance-covariance matrix of parameter estimates obtained from the inverse of the Hessian matrix.

varcov.robust

Robust ‘sandwich’ variance-covariance matrix of parameter estimates when robust=TRUE.

se

Standard errors of parameter estimates obtained from the inverse of the Hessian matrix.

se.robust

Robust ‘sandwich’ standard errors of parameter estimates when robust=TRUE.

logLik

Loglikelihood value for the fitted penetrance model.

AIC

Akaike information criterion (AIC) value of the model; AIC = 2*k - 2*logLik, where k is the number of parameters used in the model.
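
The relationships among the returned elements above can be sketched in base R. The Hessian and log-likelihood below are made-up numbers, the Hessian is assumed to be that of the negative log-likelihood, and none of the variable names are FamEvent objects.

```r
# Made-up Hessian of the negative log-likelihood for a two-parameter model
hess <- matrix(c(40, 5,
                  5, 25), nrow = 2, byrow = TRUE)

# Conventional variance-covariance matrix: inverse of the Hessian
varcov <- solve(hess)

# Standard errors: square roots of the diagonal of the variance-covariance matrix
se <- sqrt(diag(varcov))

# AIC = 2*k - 2*logLik, with k the number of parameters in the model
loglik <- -1234.5     # made-up log-likelihood value
n_par  <- nrow(hess)  # number of parameters
aic    <- 2 * n_par - 2 * loglik

se
aic
```

When comparing fits of the same data under different base.dist choices, the model with the smaller AIC is usually preferred.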

Author(s)

Yun-Hee Choi

References

Choi, Y.-H., Briollais, L., He, W. and Kopciuk, K. (2021) FamEvent: An R Package for Generating and Modeling Time-to-Event Data in Family Designs, Journal of Statistical Software 97 (7), 1-30. doi:10.18637/jss.v097.i07

Choi, Y.-H., Kopciuk, K. and Briollais, L. (2008) Estimating Disease Risk Associated with Mutated Genes in Family-Based Designs, Human Heredity 66, 238-251.

Choi, Y.-H. and Briollais, L. (2011) An EM Composite Likelihood Approach for Multistage Sampling of Family Data with Missing Genetic Covariates, Statistica Sinica 21, 231-253.

See Also

penmodelEM, simfam, penplot, print.penmodel, summary.penmodel, print.summary.penmodel, plot.penmodel

Examples

# Family data simulated from population-based design using a Weibull baseline hazard 

library(FamEvent)  # provides simfam and penmodel
set.seed(4321)
fam <- simfam(N.fam = 200, design = "pop+", variation = "none", base.dist = "Weibull", 
       base.parms = c(0.01, 3), vbeta = c(-1.13, 2.35), agemin = 20, allelefreq = 0.02)
 
# Penetrance model fit for simulated family data

fit <- penmodel(Surv(time, status) ~ gender + mgene, cluster = "famID", design = "pop+",
       parms = c(0.01, 3, -1.13, 2.35), data = fam, base.dist = "Weibull")

# Summary of the model parameter estimates from the model fit

summary(fit)

# Plot the lifetime penetrance curves with 95% CIs from the model fit for specific  
# gender and mutation status groups along with their nonparametric penetrance curves  
# based on data excluding probands. 

plot(fit, add.KM = TRUE, conf.int = TRUE, MC = 100)

[Package FamEvent version 3.1 Index]