simEEG {bootSVD} | R Documentation |
Simulate functional EEG data
Description
Our data from (Fisher et al. 2014) consists of EEG measurements from the Sleep Heart Health Study (SHHS) (Quan et al. 1997). Since we cannot publish the EEG recordings from the individuals in the SHHS, we instead include the summary statistics of the PCs from our subsample of the processed SHHS EEG data. The simEEG function uses these summary statistics to simulate functional data that is similar to the data used in our work. The simulated vectors always have length 900 and are generated from 5 basis vectors (see EEG_leadingV).
Usage
simEEG(n = 100, centered = TRUE, propVarNoise = 0.45, wide = TRUE)
Arguments
n |
the desired sample size |
centered |
if TRUE, the sample will be centered to have mean zero for each dimension. If FALSE, measurements will be simulated from a population where the mean is equal to that observed in the sample used in (Fisher et al. 2014) (see |
propVarNoise |
the approximate proportion of total sample variance attributable to random noise. |
wide |
if TRUE, the resulting data is outputted as a |
Value
A matrix containing n simulated measurement vectors of Normalized Delta Power, for the first 7.5 hours of sleep. These vectors are generated according to the equation:
y = \sum_{j=1}^{5} B_j * s_j + e
where y is the simulated measurement for a subject, B_j is the j^{th} basis vector, s_j is a random normal variable with mean zero, and e is a vector of random normal noise. The specific values for B_j and var(s_j) are determined from the EEG data sample studied in (Fisher et al. 2014), and are respectively equal to the j^{th} empirical principal component vector (see EEG_leadingV) and the empirical variance of the j^{th} score variable (see EEG_score_var).
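The generating equation can be illustrated with a minimal base R sketch. This is not the package's implementation: the basis B and the score variances below are hypothetical stand-ins for EEG_leadingV and EEG_score_var, and the rule used to set the noise variance (noise contributes a fraction propVarNoise of the total variance) is an assumption made for illustration.

```r
set.seed(1)
p <- 900   # length of each simulated vector
K <- 5     # number of basis vectors
n <- 200   # sample size

# Hypothetical basis: a random p x K matrix with orthonormal columns
# (the real function uses EEG_leadingV instead).
B <- qr.Q(qr(matrix(rnorm(p * K), p, K)))

# Hypothetical score variances (not the SHHS estimates in EEG_score_var).
score_var <- c(50, 20, 10, 5, 2)

# Assumption: pick the per-coordinate noise variance so that noise accounts
# for a fraction propVarNoise of the total variance.
propVarNoise <- 0.45
signal_var <- sum(score_var)
noise_var <- signal_var * propVarNoise / ((1 - propVarNoise) * p)

# y = sum_j B_j * s_j + e, stacked over n subjects (one vector per row)
S <- matrix(rnorm(n * K), n, K) %*% diag(sqrt(score_var))  # n x K scores
E <- matrix(rnorm(n * p, sd = sqrt(noise_var)), n, p)      # n x p noise
Y <- S %*% t(B) + E                                        # n x p data

# Empirical share of the total sum of squares due to noise
noise_share <- sum(E^2) / sum(Y^2)
```

Because the columns of B are orthonormal, the total population variance is sum(score_var) + p * noise_var, so noise_share should land close to propVarNoise for moderately large n.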
References
Aaron Fisher, Brian Caffo, and Vadim Zipunnikov. Fast, Exact Bootstrap Principal Component Analysis for p>1 million. 2014. http://arxiv.org/abs/1405.0922
Stuart F Quan, Barbara V Howard, Conrad Iber, James P Kiley, F Javier Nieto, George T O'Connor, David M Rapoport, Susan Redline, John Robbins, JM Samet, et al. The sleep heart health study: design, rationale, and methods. Sleep, 20(12):1077-1085, 1997.
Examples
library(bootSVD) #provides simEEG, fastSVD, and EEG_leadingV
set.seed(0)
#Low noise example, for an illustration of smoother functions
Y<-simEEG(n=20,centered=FALSE,propVarNoise=.02,wide=FALSE)
matplot(Y,type='l',lty=1)
#Higher noise example, for PCA
Y<-simEEG(n=100,centered=TRUE,propVarNoise=.5,wide=TRUE)
svdY<-fastSVD(Y)
V<-svdY$v #since Y is wide, the PCs are the right singular vectors (svd(Y)$v).
d<-svdY$d
head(cumsum(d^2)/sum(d^2),5) #first 5 PCs explain about half the variation
# Compare fitted PCs to true, generating basis vectors
# Since PCs have arbitrary sign, we match the sign of
# the fitted sample PCs to the population PCs first
V_sign_adj<- array(NA,dim=dim(V))
for(i in 1:5){
	V_sign_adj[,i]<-V[,i] * sign(crossprod(V[,i],EEG_leadingV[,i]))
}
par(mfrow=c(1,2))
matplot(V_sign_adj[,1:5],type='l',lty=1,
	main='PCs from simulated data,\n sign adjusted')
matplot(EEG_leadingV,type='l',lty=1,main='Population PCs')