R: Maximum likelihood estimation for settings of error-prone...

icmis {icensmis}

R Documentation

Maximum likelihood estimation for settings of error-prone diagnostic tests and self-reported outcomes

Description

This function estimates the baseline survival function evaluated at each test time in the presence of error-prone diagnostic tests and self-reported outcomes. If there are covariates included in the dataset, it also estimates their coefficients assuming proportional hazards. The covariate values can be either time independent or time varying The function can also be used to incorporate misclassification of disease status at baseline (due to an error-prone diagnostic procedure).

Usage

icmis(
  subject,
  testtime,
  result,
  data,
  sensitivity,
  specificity,
  formula = NULL,
  negpred = 1,
  time.varying = F,
  betai = NULL,
  initsurv = 0.5,
  param = 1,
  ...
)

Arguments

`subject`	variable in data for subject id.
`testtime`	variable in data for test time. Assume all test times are non-negative. testtime = 0 refers to baseline visit (only used/needed if the model is time varying covarites)
`result`	variable in data for test result.
`data`	the data to analyze.
`sensitivity`	the sensitivity of test.
`specificity`	the specificity of test.
`formula`	a formula to specify what covariates to be included in the model. If there is no covariate or one sample setting, set it to NULL. Otherwise, input like ~x1 + x2 + factor(x3).
`negpred`	baseline negative predictive value, i.e. the probability of being truely disease free for those who were tested (reported) as disease free at baseline. If baseline screening test is perfect, then negpred = 1.
`time.varying`	indicator whether fitting a time varying covariate model or not.
`betai`	a vector of initial values for the regression coefficients corresponding to the vector of covariates. If betai=NULL, then 0s are used for the initial values. Otherwise, the length of betai must equal the number of covariates.
`initsurv`	initial value for survival function of baseline group in the last visit time. It is used to compute initival values for survival function at all visit times.
`param`	parameterization for survival function used for optimization, taking values 1, 2, or 3. There are 3 parameterizations available. param = 1: this parameterization uses the change in cumulative incidence in time period j for baseline group as parameters, i.e. log(S[j]) - log(S[j+1]). param = 2: simply use log of the parameters in param = 1 so that those parameters are unbounded. param = 3: the first element is log(-log(S[tau_1])) corresponding to log-log transformation of survival function at first visit, while other parameters are corresponding to the change in log-log of surival function, log(-log(S[j])) - log(-log(S[j-1])). In most cases, all parameters yield same results , while in some situations especially when two visit times are estimated to have same survival functions, they may differ. Choose the one that works best (check likelihood function)
`...`	other arguments passed to `optim` function. For example, if the optimization does not converge, we can increase maxit in the optim function's control argument.

Details

The input data should be in longitudinal form with one row per test time. Use datasim to simulate a dataset to see the sample data structure. If time varying model is to be fitted, the baseline visit must be provided so that the baseline covariate information can be extracted. If an error is generated due to the optimization procedure, then we recommend trying different initial values.

This likelihood-based approach is a function of the survival function evaluated at each unique test time in the dataset and the vector of regression coefficients as model parameters. Therefore, it works best for situations where there is a limited number of unique test times in the dataset. If there are a large number of unique test times, one solution is to group several test times together.

Value

A list of fitting results is returned with log-likelihood, estimated coefficiets, estimated survival function, and estimated covariance matrix for covariates.

Examples

## One sample setting
simdata1 <- datasim(N = 1000, blambda = 0.05, testtimes = 1:8,
 sensitivity = 0.7, specificity = 0.98, betas = NULL, 
 twogroup = NULL, pmiss = 0.3, design = "MCAR")
fit1 <- icmis(subject = ID, testtime = testtime, result = result,
 data = simdata1, sensitivity = 0.7, specificity= 0.98, 
 formula = NULL, negpred = 1)	
             			  
## Two group setting, and the two groups have same sample sizes
simdata2 <- datasim(N = 1000, blambda = 0.05, testtimes = 1:8,
 sensitivity = 0.7, specificity = 0.98, betas = 0.7, 
 twogroup = 0.5, pmiss = 0.3, design = "MCAR")					  
fit2 <- icmis(subject = ID, testtime = testtime, result = result,
 data = simdata2, sensitivity = 0.7, specificity= 0.98,
 formula = ~group)
 
## Three covariates with coefficients 0.5, 0.8, and 1.0
simdata3 <- datasim(N = 1000, blambda = 0.05, testtimes = 1:8, 
sensitivity = 0.7, specificity = 0.98, betas = c(0.5, 0.8, 1.0),
 twogroup = NULL, pmiss = 0.3, design = "MCAR", negpred = 1)					  
fit3 <- icmis(subject = ID, testtime = testtime, result = result,
data = simdata3, sensitivity = 0.7, specificity= 0.98, 
formula = ~cov1+cov2+cov3, negpred = 1)
 
## Fit data with NTFP missing mechanism (the fitting is same as MCAR data)
simdata4 <- datasim(N = 1000, blambda = 0.05, testtimes = 1:8,
 sensitivity = 0.7, specificity = 0.98, betas = c(0.5, 0.8, 1.0),
  twogroup = NULL, pmiss = 0.3, design = "NTFP", negpred = 1)
fit4 <- icmis(subject = ID, testtime = testtime, result = result,
 data = simdata4, sensitivity = 0.7, specificity= 0.98,
 formula = ~cov1+cov2+cov3, negpred = 1)
              					  
## Fit data with baseline misclassification
simdata5 <- datasim(N = 2000, blambda = 0.05, testtimes = 1:8,
 sensitivity = 0.7, specificity = 0.98, betas = c(0.5, 0.8, 1.0),
  twogroup = NULL, pmiss = 0.3, design = "MCAR", negpred = 0.97)
fit5 <- icmis(subject = ID, testtime = testtime, result = result,
 data = simdata5, sensitivity = 0.7, specificity= 0.98,
 formula = ~cov1+cov2+cov3, negpred = 0.97)
  
## Fit data with time varying covariates
simdata6 <- datasim(N = 1000, blambda = 0.05, testtimes = 1:8,
 sensitivity = 0.7, specificity = 0.98, betas = c(0.5, 0.8, 1.0),
 twogroup = NULL, pmiss = 0.3, design = "MCAR", negpred = 1,
 time.varying = TRUE)
fit6 <- icmis(subject = ID, testtime = testtime, result = result,
 data = simdata6, sensitivity = 0.7, specificity= 0.98,
 formula = ~cov1+cov2+cov3, negpred = 1, time.varying = TRUE)

[Package icensmis version 1.5.0 Index]