g2l.proc {LPRelevance} | R Documentation |
Procedures for global and local inference.
Description
This function performs customized fdr analyses tailored to each individual case.
Usage
g2l.proc(X, z, X.target = NULL, z.target = NULL, m = c(4, 6), alpha = 0.1,
nbag = NULL, nsample = length(z), lp.reg.method = "lm",
null.scale = "QQ", approx.method = "direct", ngrid = 2000,
centering = TRUE, coef.smooth = "BIC", fdr.method = "locfdr",
plot = TRUE, rel.null = "custom", locfdr.df = 10,
fdr.th.fixed = NULL, parallel = FALSE, ...)
Arguments
X |
A |
z |
A length |
X.target |
A |
z.target |
A vector of length |
m |
An ordered pair. The first number indicates how many LP-nonparametric bases to construct for each |
alpha |
Confidence level for determining signals. |
nbag |
Number of bags of parametric bootstrap samples to use for each target case. For each bag, a new set of relevance samples is generated for analysis, and the resulting fdr curves are aggregated by taking their mean. Set to |
nsample |
Number of relevance samples generated for each case. The default is the size of the input z-statistic. |
lp.reg.method |
Method for estimating the relevance function and its conditional LP-Fourier coefficients. We currently support three options: lm (inbuilt with subset selection), glmnet, and knn. |
null.scale |
Method for estimating the null standard deviation from the LASER samples. Available options: "IQR", "QQ", and "locfdr". |
approx.method |
Method used to approximate the customized fdr curve; the default is "direct". When set to "indirect", the customized fdr is computed by modifying the pooled fdr using the relevant density function. |
ngrid |
Number of gridpoints to use for computing customized fdr curve. |
centering |
Whether to perform regression-adjustment to center the data, default is TRUE. |
coef.smooth |
Specifies the method to use for LP coefficient smoothing (AIC or BIC). Uses BIC by default. |
fdr.method |
Method for controlling false discoveries (either "locfdr" or "BH"), default choice is "locfdr". |
plot |
Whether to include plots in the results, default is |
rel.null |
How the relevant null changes with x: "custom" denotes we allow it to vary with x, and "th" denotes fixed. |
locfdr.df |
Degrees of freedom to use for |
fdr.th.fixed |
Use fixed fdr threshold for finding signals. Default set to |
parallel |
Use parallel computing for obtaining the relevance samples, mainly used for very large |
... |
Extra parameters to pass to other functions. Currently only supports the arguments for |
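As a rough illustration of what an IQR-based null scale estimate looks like (the "IQR" option of null.scale), the sketch below estimates a normal standard deviation from the interquartile range in base R. It is not taken from the package source: whether g2l.proc uses exactly this formula is an assumption, and the simulated data and variable names are hypothetical.

```r
# Hypothetical illustration: robust sd estimate from the interquartile range.
set.seed(1)
z_null <- rnorm(10000, mean = 0, sd = 2)   # simulated null z-values, true sd = 2

# For a normal distribution, IQR = sd * (qnorm(0.75) - qnorm(0.25)), about 1.349 * sd.
iqr_scale <- IQR(z_null) / (qnorm(0.75) - qnorm(0.25))
iqr_scale
```

An estimate of this kind is less sensitive to a small fraction of large z-values than the plain sample standard deviation, which is presumably why a robust option is offered.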
Value
A list containing the following items:
macro |
Available when |
$result |
A list of global inference results: |
$X |
Matrix of covariates, same as input |
$z |
Vector of observations, same as input |
$probnull |
A vector of length |
$signal |
A binary vector of length |
plots |
A list of plots for global inference: |
$signal_x |
A plot of signals discovered, marked in red |
$dps_xz |
A scatterplot of z on x, colored based on the discovery propensity scores, only available when |
$dps_x |
A scatterplot of discovery propensity scores on x, only available when |
micro |
Available when |
$result |
Customized estimates for null probabilities for target |
$result$signal |
A binary vector of length |
$global |
Pooled global estimates for null probabilities for target |
$plots |
Customized fdr plots for the target cases. |
m.lp |
Same as input |
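To make the relationship between the null-probability estimates and the binary signal vector concrete, here is a minimal base-R sketch that flags cases whose estimated null probability is at or below a threshold. The rule probnull <= alpha is an assumption made for illustration and is not confirmed by this page; the values and names below are hypothetical.

```r
# Hypothetical null-probability estimates for five cases.
probnull <- c(0.92, 0.04, 0.35, 0.09, 0.10)
alpha <- 0.1                                # same value as the default in Usage

# Assumed rule: flag a case as a signal when its null probability is at most alpha.
signal <- as.integer(probnull <= alpha)
signal
```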
Author(s)
Subhadeep Mukhopadhyay, Kaijun Wang
Maintainer: Kaijun Wang <kaijunwang.19@gmail.com>
References
Mukhopadhyay, S., and Wang, K. (2021) "On The Problem of Relevance in Statistical Inference". <arXiv:2004.09588>
Examples
library(LPRelevance)
data(funnel)
X <- funnel$x
z <- funnel$z
## Macro-inference using locfdr and LASER:
g2l_macro <- g2l.proc(X, z)
g2l_macro$macro$plots
## Micro-inference for the DTI data: case A with x = (18, 55) and z = 3.95
data(data.dti)
X <- cbind(data.dti$coordx, data.dti$coordy)
z <- data.dti$z
g2l_x <- g2l.proc(X, z, X.target = c(18, 55), z.target = 3.95, nsample = 3000)
g2l_x$micro$plots$fdr.1 + ggplot2::coord_cartesian(xlim = c(0, 4))
g2l_x$micro$result[4]
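For intuition about the "BH" option of fdr.method, the following base-R sketch applies the Benjamini-Hochberg adjustment to two-sided p-values computed from simulated z-values. It uses stats::p.adjust rather than g2l.proc, so it shows only the generic procedure, not the package's customized analysis; the simulated data and variable names are hypothetical.

```r
# Hypothetical data: 950 null z-values plus 50 cases shifted to mean 4.
set.seed(2)
z_sim <- c(rnorm(950), rnorm(50, mean = 4))

p_val <- 2 * pnorm(-abs(z_sim))             # two-sided p-values under N(0, 1)
p_bh  <- p.adjust(p_val, method = "BH")     # Benjamini-Hochberg adjusted p-values
signal_bh <- which(p_bh <= 0.1)             # indices declared signals at level 0.1
length(signal_bh)
```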