Estimate the unknown parameters of the admixture model(s) under study

Description

Estimate the component weights, the location shift parameter (in case of a symmetric unknown component density), and the unknown component distribution using different estimation techniques. We remind that the i-th admixture model has probability density function (pdf) l_i such that: l_i = p_i * f_i + (1-p_i) * g_i, where g_i is the known component density. The unknown quantities p_i and f_i then have to be estimated.

Usage

samples = NULL,
sym.f = FALSE,
est.method = c("PS", "BVdk", "IBM"),
comp.dist = NULL,
comp.param = NULL
)

Arguments

 samples A list of the K samples to be studied, all following admixture distributions. sym.f A boolean indicating whether the unknown component densities are assumed to be symmetric or not. est.method The estimation method to be applied. Can be one of 'BVdk' (Bordes and Vandekerkhove estimator), 'PS' (Patra and Sen estimator), or 'IBM' (Inversion Best-Matching approach). The same estimation method is performed on each sample. Important note: estimation by 'IBM' is unbiased only under H0, meaning that choosing this method requires to perform previously the test hypothesis between the pairs of samples. For further details, see section 'Details' below. comp.dist A list with 2*K elements corresponding to the component distributions (specified with R native names for these distributions) involved in the K admixture models. Elements, grouped by 2, refer to the unknown and known components of each admixture model, If there are unknown elements, they must be specified as 'NULL' objects. For instance, 'comp.dist' could be specified as follows with K = 3: list(f1 = NULL, g1 = 'norm', f2 = NULL, g2 = 'norm', f3 = NULL, g3 = 'rnorm'). comp.param A list with 2*K elements corresponding to the parameters of the component distributions, each element being a list itself. The names used in this list must correspond to the native R argument names for these distributions. Elements, grouped by 2, refer to the parameters of unknown and known components of each admixture model. If there are unknown elements, they must be specified as 'NULL' objects. For instance, 'comp.param' could be specified as follows (with K = 3): list(f1 = NULL, g1 = list(mean=0,sd=1), f2 = NULL, g2 = list(mean=3,sd=1.1), f3 = NULL, g3 = list(mean=-2,sd=0.6)).

Details

For further details on the different estimation techniques, see i) IBM approach at https://hal.archives-ouvertes.fr/hal-03201760 ; ii) Patra and Sen estimator: Patra, R.K. and Sen, B. (2016); Estimation of a Two-component Mixture Model with Applications to Multiple Testing; JRSS Series B, 78, pp. 869–893. ; iii) BVdk estimator: Bordes, L. and Vandekerkhove, P. (2010); Semiparametric two-component mixture model when a component is known: an asymptotically normal estimator; Math. Meth. Stat.; 19, pp. 22–41.

Value

A list containing the estimated weight of every unknown component distribution among admixture samples.

Author(s)

Xavier Milhaud xavier.milhaud.research@gmail.com

Examples

##### On a simulated example to see whether the true parameters are well estimated.
list.comp <- list(f1 = "norm", g1 = "norm",
f2 = "norm", g2 = "norm")
list.param <- list(f1 = list(mean = 0, sd = 1), g1 = list(mean = 2, sd = 0.7),
f2 = list(mean = 0, sd = 1), g2 = list(mean = -3, sd = 1.1))
## Simulate data:
sim1 <- rsimmix(n = 2100, unknownComp_weight = 0.8, comp.dist = list(list.comp\$f1,list.comp\$g1),
comp.param = list(list.param\$f1, list.param\$g1))\$mixt.data
sim2 <- rsimmix(n= 2000, unknownComp_weight = 0.85, comp.dist = list(list.comp\$f2,list.comp\$g2),
comp.param = list(list.param\$f2, list.param\$g2))\$mixt.data
## Estimate the mixture weights of the admixture models:
list.comp <- list(f1 = NULL, g1 = "norm",
f2 = NULL, g2 = "norm")
list.param <- list(f1 = NULL, g1 = list(mean = 2, sd = 0.7),
f2 = NULL, g2 = list(mean = -3, sd = 1.1))
estim <- admix_estim(samples = list(sim1,sim2), sym.f = TRUE, est.method = 'IBM',
comp.dist = list.comp, comp.param = list.param)
estim