| cv_linear2ph {sleev} | R Documentation | 
Description
Performs cross-validation to calculate the average predicted log likelihood for the linear2ph function. This function can be used to select the B-spline basis that yields the largest average predicted log likelihood.
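To make the quantity being averaged concrete, here is a minimal conceptual sketch of a K-fold average predicted log likelihood. It uses an ordinary `lm()` fit as a stand-in and is not the sieve maximum likelihood procedure of linear2ph. The random fold assignment and the simple mean across folds are assumptions for illustration only, not a description of the package internals.

```r
# Conceptual sketch only: ordinary least squares stands in for linear2ph.
set.seed(1)
n <- 100
x <- rnorm(n)
y <- 0.3 + 0.4 * x + rnorm(n)
dat <- data.frame(x = x, y = y)

nfolds <- 5
# Randomly assign each observation to one of nfolds folds (assumed scheme)
fold <- sample(rep(seq_len(nfolds), length.out = n))

pred_loglike <- numeric(nfolds)
for (k in seq_len(nfolds)) {
  train <- dat[fold != k, ]
  test  <- dat[fold == k, ]

  # Fit on the training folds only
  fit <- lm(y ~ x, data = train)
  # Maximum likelihood estimate of the residual standard deviation
  sigma_hat <- sqrt(mean(residuals(fit)^2))

  # Gaussian log likelihood of the held-out fold under the training fit
  mu_test <- predict(fit, newdata = test)
  pred_loglike[k] <- sum(dnorm(test$y, mean = mu_test, sd = sigma_hat, log = TRUE))
}

# Average over folds (a simple mean is assumed here)
avg_pred_loglike <- mean(pred_loglike)
```

A larger average indicates that the fitted model assigns higher likelihood to data it has not seen, which is the sense in which competing B-spline bases can be compared.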
Usage
cv_linear2ph(
  Y_unval = NULL,
  Y = NULL,
  X_unval = NULL,
  X = NULL,
  Z = NULL,
  Bspline = NULL,
  data = NULL,
  nfolds = 5,
  MAX_ITER = 2000,
  TOL = 1e-04,
  verbose = FALSE
)
Arguments
| Y_unval | Specifies the column of the error-prone outcome that is continuous. Subjects with missing values of  | 
| Y | Specifies the column that stores the validated value of Y_unval. | 
| X_unval | Specifies the columns of the error-prone covariates. Subjects with missing values of  | 
| X | Specifies the columns that store the validated values of X_unval. | 
| Z | Specifies the columns of the accurately measured covariates. Subjects with missing values of  | 
| Bspline | Specifies the columns of the B-spline basis. Subjects with missing values of  | 
| data | Specifies the name of the dataset. This argument is required. | 
| nfolds | Specifies the number of cross-validation folds. The default value is 5. | 
| MAX_ITER | Specifies the maximum number of iterations in the EM algorithm. The default number is 2000. | 
| TOL | Specifies the convergence criterion in the EM algorithm. The default value is 1e-04. | 
| verbose | If  | 
Value
| avg_pred_loglike | Stores the average predicted log likelihood. | 
| pred_loglike | Stores the predicted log likelihood in each fold. | 
| converge | Stores the convergence status of the EM algorithm in each run. | 
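The returned components can be inspected before the average is trusted. The sketch below uses a hypothetical stand-in list with made-up numbers rather than a real cv_linear2ph result, and it assumes that converge is a logical vector with one entry per fold; the table above does not state its type.

```r
# Hypothetical stand-in for the list returned by cv_linear2ph();
# the numbers are invented purely to illustrate the structure.
res <- list(
  avg_pred_loglike = -152.4,
  pred_loglike = c(-150.1, -155.3, -149.8, -153.9, -152.9),
  converge = c(TRUE, TRUE, TRUE, FALSE, TRUE)  # assumed to be logical
)

# `$` partially matches list names, so a misspelled name such as
# res$avg_pred_loglik can silently succeed. `[[` requires the exact name.
avg <- res[["avg_pred_loglike"]]

# Flag folds in which the EM algorithm did not converge
failed_folds <- which(!res$converge)
if (length(failed_folds) > 0) {
  message("EM did not converge in fold(s): ",
          paste(failed_folds, collapse = ", "))
}
```

If any fold is flagged, one option is to rerun with a larger MAX_ITER before comparing bases.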
Examples
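  library(sleev)

  rho = 0.3     # correlation between the errors in Y and X
  p = 0.3       # probability that a subject's measurements contain error
  n = 100       # total (phase-one) sample size
  n2 = 40       # phase-two (validation) sample size
  alpha = 0.3   # true intercept
  beta = 0.4    # true slope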
   
  ### generate data
  simX = rnorm(n)
  epsilon = rnorm(n)
  simY = alpha+beta*simX+epsilon
  error = MASS::mvrnorm(n, mu=c(0,0), Sigma=matrix(c(1, rho, rho, 1), nrow=2))
   
  simS = rbinom(n, 1, p)
  simU = simS*error[,2]
  simW = simS*error[,1]
  simY_tilde = simY+simW
  simX_tilde = simX+simU
   
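  # randomly select the phase-two (validation) subsample
  id_phase2 = sample(n, n2)
   
  # validated values are observed only for phase-two subjects
  simY[-id_phase2] = NA
  simX[-id_phase2] = NA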
   
  # compare cubic B-spline bases with different numbers of basis functions
  nsieves = c(5, 10)
  pred_loglike = rep(NA, length(nsieves))
  for (i in seq_along(nsieves)) {
      nsieve = nsieves[i]
      # cubic B-spline basis of the error-prone covariate
      Bspline = splines::bs(simX_tilde, df=nsieve, degree=3, 
        Boundary.knots=range(simX_tilde), intercept=TRUE)
      colnames(Bspline) = paste("bs", 1:nsieve, sep="")
     
      # assemble the analysis dataset
      data = data.frame(Y_tilde=simY_tilde, X_tilde=simX_tilde, Y=simY, X=simX, Bspline)
     
      # 5-fold cross-validation for this basis
      res = cv_linear2ph(Y="Y", X="X", Y_unval="Y_tilde", X_unval="X_tilde", 
        Bspline=colnames(Bspline), data=data, nfolds = 5)
      pred_loglike[i] = res$avg_pred_loglike
  }
   
  data.frame(nsieves, pred_loglike)
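Once the table of averages is available, the basis can be chosen by taking the row with the largest value. The helper below, `select_nsieve`, is a hypothetical convenience function and not part of sleev, and the values passed to it are placeholders rather than output from the example above.

```r
# Hypothetical helper: return the basis size with the largest
# average predicted log likelihood, ignoring missing values.
select_nsieve <- function(nsieves, pred_loglike) {
  stopifnot(length(nsieves) == length(pred_loglike))
  if (all(is.na(pred_loglike))) {
    stop("no usable cross-validation results")
  }
  # which.max() skips NA entries and returns the first maximum
  nsieves[which.max(pred_loglike)]
}

# Placeholder values for illustration only
best_nsieve <- select_nsieve(c(5, 10), c(-60.2, -63.7))
```

Ties are resolved in favor of the first entry, so listing nsieves in increasing order favors the smaller basis.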