PASS_Proj_Test_ufDA {fPASS} | R Documentation |
Power and Sample size (PASS) calculation of Two-Sample Projection-based test for sparsely observed univariate functional data.
Description
The function PASS_Proj_Test_ufDA()
computes the power or the sample size (PASS) required to conduct
the projection-based test for a difference in mean function between two groups of longitudinal data
or sparsely observed functional data under a random irregular design, assuming a
common covariance structure between the groups. See Wang (2021) for more details
on the testing procedure.
Usage
PASS_Proj_Test_ufDA(
sample_size,
target.power,
sig.level = 0.05,
nobs_per_subj,
obs.design,
mean_diff_fnm,
cov.type = c("ST", "NS"),
cov.par,
sigma2.e,
missing_type = c("nomiss", "constant"),
missing_percent = 0,
eval_SS = 5000,
alloc.ratio = c(1, 1),
fpca_method = c("fpca.sc", "face"),
mean_diff_add_args = list(),
fpca_optns = list(pve = 0.95),
nWgrid = 201,
npc_to_use = NULL,
return.eigencomp = FALSE,
nsim = 10000
)
Arguments
sample_size |
Total sample size combining both the groups, must be a positive integer. |
target.power |
Target power to achieve, must be a number between 0 and 1.
Only one of |
sig.level |
Significance level of the test, default set at 0.05, must be less than 0.2. |
nobs_per_subj |
The number of observations per subject. Each element must be greater than 3. It can be a vector, in which case the number of observations for each subject varies randomly among the elements of the vector, or a scalar, in which case the number of observations is the same for every subject. See examples. |
obs.design |
The sampling design of the observations. Must be provided as
a list with the following elements. If the design is longitudinal (e.g. a clinical trial
where there is pre-specified schedule of visit for the participants) it must be
a named list with elements |
mean_diff_fnm |
The name of the function that returns the difference between the mean functions of the
two groups at any given time. It must be supplied as a character string, so that |
cov.type |
The type of the covariance structure of the data, must be either of 'ST' (stationary) or
'NS' (non-stationary). This argument along with the |
cov.par |
The covariance structure of the latent response trajectory.
If |
sigma2.e |
Measurement error variance, should be set as zero or a very small number if the measurement error is not significant. |
missing_type |
The type of missingness in the observations of the subjects. Can be one of
|
missing_percent |
The proportion of observations missing at each observation point for each subject.
Must be a number in [0, 0.8], since a missing rate above 80% is not practical.
If |
eval_SS |
The sample size based on which the eigencomponents will be estimated from data. To compute the theoretical power of the test we must make sure that we use a large enough sample size to generate the data such that the estimated eigenfunctions are very close to the true eigenfunctions and that the sampling design will not have much effect on the loss of precision. Default value 5000. |
alloc.ratio |
The allocation ratio of samples in each group. Note that the eigenfunctions
will still be estimated based on the total sample_size, however, the variance
of the |
fpca_method |
The method by which the FPCA is computed. Must be one of
'fpca.sc' and 'face'. If |
mean_diff_add_args |
Additional arguments to be passed to group difference
function specified in the argument |
fpca_optns |
Additional options to be passed onto either of |
nWgrid |
The length of the working grid over the domain of the function on which
the eigenfunctions will be estimated. The actual working grid will be calculated using
the |
npc_to_use |
Number of eigenfunctions to use to compute the power. Default is NULL, in which case all the eigenfunctions estimated from the data will be used. |
return.eigencomp |
Indicates whether to return the eigencomponents obtained from the fPCA
on the large data with sample size equal to |
nsim |
The number of samples to be generated from the alternative distribution of the Hotelling T statistic. Default value is 10000. |
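The nsim and alloc.ratio arguments refer to a Hotelling-type statistic and to how the total sample size is split between groups. As a rough illustration only, the sketch below computes the classical two-sample Hotelling T^2 power from the noncentral F distribution. It is not the simulation used by the package: the helper name hotelling_power, the score mean shift delta and the score covariance Sigma are hypothetical inputs, and normally distributed scores with a common covariance are assumed.

```r
# Classical two-sample Hotelling T^2 power via the noncentral F distribution.
# Assumption: K-dimensional normal scores with common covariance `Sigma`
# and mean shift `delta`. All inputs are illustrative, not package output.
hotelling_power <- function(total_n, alloc.ratio, delta, Sigma, sig.level = 0.05) {
  K   <- length(delta)
  n1  <- round(total_n * alloc.ratio[1] / sum(alloc.ratio))  # group 1 size
  n2  <- total_n - n1                                        # group 2 size
  df2 <- n1 + n2 - K - 1
  stopifnot(n1 > 0, n2 > 0, df2 > 0)
  # Noncentrality: (n1 * n2 / (n1 + n2)) * delta' Sigma^{-1} delta
  ncp  <- (n1 * n2 / (n1 + n2)) * drop(t(delta) %*% solve(Sigma, delta))
  crit <- qf(1 - sig.level, df1 = K, df2 = df2)
  1 - pf(crit, df1 = K, df2 = df2, ncp = ncp)
}

Sigma <- diag(c(1, 0.5))
p_null     <- hotelling_power(100, c(1, 1), delta = c(0, 0),     Sigma = Sigma)
p_balanced <- hotelling_power(100, c(1, 1), delta = c(0.5, 0.3), Sigma = Sigma)
p_unequal  <- hotelling_power(100, c(1, 3), delta = c(0.5, 0.3), Sigma = Sigma)
```

With a zero mean shift the power reduces to the significance level, and for a fixed total sample size a balanced allocation gives a larger noncentrality parameter than a 1:3 split.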
Details
The function performs power and sample size analysis for functional data under a dense or sparse (random) design, and for longitudinal data. It can handle data with a wide variety of covariance structures, parametric or non-parametric. In addition to the traditional stationary structures assumed for longitudinal data (see nlme::corClasses), the user can specify any non-stationary covariance, either as a covariance function or in terms of eigenfunctions and eigenvalues. The user has considerable flexibility in setting the arguments to assess the power of the test under different sampling designs and covariance processes of the response trajectory, and for any arbitrary mean difference function. Overall, the functionality of the module is comprehensive and includes all the different cases considered in the 'NCSS PASS (2023)' software. We believe that this software can be an effective clinical trial design tool when the projection-based test is the primary decision-making method.
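To make the two ways of describing a non-stationary covariance concrete, the sketch below builds a covariance function from eigenfunctions and eigenvalues through the expansion C(s, u) = sum_k lambda_k phi_k(s) phi_k(u). It reuses the two eigenfunctions and eigenvalues from Example 2; the helper name cov_from_eigen is hypothetical and is not part of the package.

```r
# Eigenfunctions and eigenvalues as in Example 2.
eig.fun.vec <- function(t) cbind(sqrt(2) * sin(2 * pi * t),
                                 sqrt(2) * cos(2 * pi * t))
eig.val <- c(1, 0.5)

# Hypothetical helper: covariance between time points s and u,
# C(s, u) = sum_k eig.val[k] * phi_k(s) * phi_k(u).
cov_from_eigen <- function(s, u) {
  Phi_s <- eig.fun.vec(s)   # length(s) x 2 matrix of eigenfunction values
  Phi_u <- eig.fun.vec(u)   # length(u) x 2 matrix of eigenfunction values
  Phi_s %*% diag(eig.val) %*% t(Phi_u)
}

grid <- seq(0, 1, length.out = 5)
C <- cov_from_eigen(grid, grid)
print(round(C, 3))
```

The diagonal of C is not constant, which is one way to see that this covariance is non-stationary.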
Value
A list with the following elements. If is.null(target.power), the list contains power_value,
the power of the test when the total sample size equals sample_size; otherwise it contains required_SS,
the sample size required to achieve the power target.power.
If return.eigencomp == TRUE, then est_eigencomp is also returned, containing
the entire output of the internal call to Extract_Eigencomp_fDA().
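Because the returned list holds either power_value or required_SS, calling code may want to branch on which one is present. The sketch below is one way to do that; report_pass is a hypothetical helper, and the two lists are mock placeholders standing in for real output, not results produced by the package.

```r
# Hypothetical helper: format whichever element the result list contains.
report_pass <- function(res) {
  if (!is.null(res$power_value)) {
    sprintf("Estimated power: %.3f", res$power_value)
  } else {
    sprintf("Required total sample size: %d", as.integer(res$required_SS))
  }
}

# Mock placeholder lists (made-up values, for illustration only).
mock_power <- list(power_value = 0.812)
mock_ss    <- list(required_SS = 134)

msg_power <- report_pass(mock_power)
msg_ss    <- report_pass(mock_ss)
```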
Specification of key arguments
If obs.design$design == 'functional',
a dense grid of length ngrid (typically 101 or 201) is created internally, and
the observation points are chosen randomly from it.
The time points could instead be drawn at random from
anywhere in the interval, but then, for a large number of subjects, the
fpca_sc()
function would take a very long
time to estimate the eigenfunctions. For a dense design, the user must set
a large value of the argument nobs_per_subj,
and for a sparse (random) design,
nobs_per_subj
should be set small (and varying).
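The sparse random design described above can be pictured with a short sketch: each subject gets a number of observations drawn from nobs_per_subj, and that many time points drawn without replacement from a dense grid. This is an illustration under those assumptions, not the package's internal code; draw_times is a hypothetical helper.

```r
set.seed(1)
fun.domain    <- c(0, 1)
ngrid         <- 201
dense_grid    <- seq(fun.domain[1], fun.domain[2], length.out = ngrid)
nobs_per_subj <- 4:7   # sparse design: few, varying observations per subject

# Hypothetical helper: observation times for one subject.
draw_times <- function(nobs_choices, grid) {
  # Index-based sampling avoids the sample(n, 1) pitfall for scalar input.
  m <- nobs_choices[sample.int(length(nobs_choices), 1)]
  sort(grid[sample.int(length(grid), m)])
}

times <- lapply(1:3, function(i) draw_times(nobs_per_subj, dense_grid))
str(times)
```

Passing a single large value for nobs_per_subj in the same sketch gives a dense design with the same number of points for every subject.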
On the other hand, as is typical for longitudinal data, if the measurements are
taken at fixed time points (from baseline)
for each subject, the user must set obs.design$design = 'longitudinal'
and specify the time points
in the argument obs.design$visit.schedule.
The length of obs.design$visit.schedule
must equal nobs_per_subj - 1
(in Example 1, nobs_per_subj is 8 and the schedule has 7 visits). Internally, when
obs.design$design == 'longitudinal',
the function scales the visit times
so that they lie in [0, 1], so the user should not
include an element named fun.domain
in the list for a longitudinal design.
Make sure that
the mean function and the covariance function specified
in mean_diff_fnm
and cov.par
are also scaled to
take arguments in [0, 1]. Finally, nobs_per_subj
must be a scalar positive integer when design == 'longitudinal'.
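The rescaling requirement can be illustrated with a small sketch. It assumes a study running from month 0 to month 12 and a linear map of that interval onto [0, 1]; the exact scaling used internally is not stated here, so treat this as an assumption. The visit times, the helpers to_unit and from_unit, and mean_diff_months are all hypothetical.

```r
# Hypothetical study on the month scale.
study_start  <- 0
study_end    <- 12
visit_months <- c(1, 3, 6, 9, 12)

# Linear maps between the month scale and [0, 1] (assumed scaling).
to_unit   <- function(x, start, end) (x - start) / (end - start)
from_unit <- function(t, start, end) start + t * (end - start)

visit_scaled <- to_unit(visit_months, study_start, study_end)

# Hypothetical mean difference defined on the month scale ...
mean_diff_months <- function(m) 0.05 * m
# ... and the same function re-expressed to take arguments in [0, 1].
mean.diff <- function(t) mean_diff_months(from_unit(t, study_start, study_end))

print(visit_scaled)
print(mean.diff(visit_scaled))
```

Defining the mean difference on the natural time scale first and wrapping it afterwards keeps the clinical interpretation separate from the scaling convention.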
Author(s)
Salil Koner
Maintainer: Salil Koner
salil.koner@duke.edu
References
Wang, Qiyao (2021). Two-sample inference for sparse functional data. Electronic Journal of Statistics, Vol. 15, 1395-1423. doi:10.1214/21-EJS1802.
PASS 2023 Power Analysis and Sample Size Software (2023). NCSS, LLC. Kaysville, Utah, USA, ncss.com/software/pass.
See Also
See Power_Proj_Test_ufDA() and Extract_Eigencomp_fDA().
Examples
# Example 1: Power analysis for stationary exponential covariance.
# Should return a power equal to the size (significance level) of the test
# because the true mean difference is zero.
library(fPASS)
set.seed(12345)
mean.diff <- function(t) {0 * t}
obs.design <- list("design" = "longitudinal",
                   "visit.schedule" = seq(0.1, 0.9, length.out = 7),
                   "visit.window" = 0.05)
cor.str <- nlme::corExp(1, form = ~ time | Subject)
sigma2 <- 1; sigma2.e <- 0.25; nobs_per_subj <- 8
missing_type <- "constant"; missing_percent <- 0.01
# Increase the `eval_SS` argument from 1000 to 5000 for more
# precise estimates of the eigenfunctions.
pow <- PASS_Proj_Test_ufDA(sample_size = 100, target.power = NULL, sig.level = 0.05,
                           obs.design = obs.design,
                           mean_diff_fnm = "mean.diff", cov.type = "ST",
                           cov.par = list("var" = sigma2, "cor" = cor.str),
                           sigma2.e = sigma2.e, nobs_per_subj = nobs_per_subj,
                           missing_type = missing_type,
                           missing_percent = missing_percent, eval_SS = 1000,
                           alloc.ratio = c(1, 1), nWgrid = 201,
                           fpca_method = "fpca.sc",
                           mean_diff_add_args = list(), fpca_optns = list("pve" = 0.95),
                           nsim = 1e3)
print(pow$power_value)
# Example 2: Sample size calculation for a non-stationary covariance.
alloc.ratio <- c(1, 1)
mean.diff <- function(t) {3 * (t^3)}
# k-th eigenfunction (k = 1 or 2) evaluated at time t.
eig.fun <- function(t, k) {
  if (k == 1) {
    ef <- sqrt(2) * sin(2 * pi * t)
  } else if (k == 2) {
    ef <- sqrt(2) * cos(2 * pi * t)
  } else {
    stop("k must be 1 or 2")
  }
  return(ef)
}
# Matrix with one column per eigenfunction.
eig.fun.vec <- function(t) {cbind(eig.fun(t, 1), eig.fun(t, 2))}
eigen.comp <- list("eig.val" = c(1, 0.5), "eig.obj" = eig.fun.vec)
obs.design <- list(design = "functional", fun.domain = c(0, 1))
cov.par <- list("cov.obj" = NULL, "eigen.comp" = eigen.comp)
sigma2.e <- 0.001; nobs_per_subj <- 4:7
missing_type <- "nomiss"; missing_percent <- 0
fpca_method <- "fpca.sc"
# Increase the `eval_SS` argument from 1000 to 5000 for more
# precise estimates of the eigenfunctions.
pow <- PASS_Proj_Test_ufDA(sample_size = NULL, target.power = 0.8,
                           sig.level = 0.05, obs.design = obs.design,
                           mean_diff_fnm = "mean.diff", cov.type = "NS",
                           cov.par = cov.par, sigma2.e = sigma2.e,
                           nobs_per_subj = nobs_per_subj, missing_type = missing_type,
                           missing_percent = missing_percent, eval_SS = 1000,
                           alloc.ratio = alloc.ratio, fpca_method = fpca_method,
                           mean_diff_add_args = list(), fpca_optns = list(pve = 0.95),
                           nsim = 1e3, nWgrid = 201)
print(pow$required_SS)