| simulPU {PUlasso} | R Documentation |
Simulated PU data
Description
A simulated dataset for illustration.
Covariates x_i are drawn from N(\mu, I_{5\times 5}) or N(-\mu, I_{5\times 5}), each with probability 0.5.
To make only the first two variables active, \mu = [\mu_1,\mu_2,0,0,0]^T and \theta = [\theta_0,\theta_1,\theta_2,0,0,0]^T,
where we set \mu_i = 1.5 and \theta_i \sim Unif[0.5,1].
Responses y_i are simulated via the logistic model P_\theta(y=1|x) = 1/(1+\exp(-\theta^T x)).
1000 observations are sampled from the sub-population of positives (y=1) and labeled, and another 1000 observations are sampled from the original population and left unlabeled.
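The generating process described above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code used to build the packaged dataset: the seed, the population size N, the object name sim, sampling without replacement, and the treatment of \theta_0 as an intercept are all hypothetical choices, and the response uses the standard logistic link.

```r
set.seed(1)                        # arbitrary seed (assumption), for reproducibility
p  <- 5                            # number of covariates
N  <- 20000                        # hypothetical population size to sample from
mu <- c(1.5, 1.5, 0, 0, 0)         # only the first two means are non-zero
# theta_0 (treated here as an intercept), theta_1, theta_2, then three zeros
theta <- c(runif(3, 0.5, 1), 0, 0, 0)

# Mixture: each row is centred at +mu or -mu with probability 0.5
sgn <- sample(c(-1, 1), N, replace = TRUE)
X   <- matrix(rnorm(N * p), N, p) + outer(sgn, mu)

# True responses from a logistic model
prob <- plogis(drop(theta[1] + X %*% theta[-1]))
y    <- rbinom(N, 1, prob)

# 1000 labeled positives, plus 1000 unlabeled draws from the whole population
n_l   <- 1000
n_u   <- 1000
idx_l <- sample(which(y == 1), n_l)
idx_u <- sample(N, n_u)
idx   <- c(idx_l, idx_u)

sim <- list(
  X = X[idx, ],                    # 2000 x 5 model matrix
  y = y[idx],                      # true responses
  z = rep(c(1, 0), c(n_l, n_u)),   # 1 = labeled, 0 = unlabeled
  truePY1 = mean(y)                # empirical stand-in for P(y = 1) (assumption)
)
```

The first 1000 rows of sim$X are the labeled positives, so sim$y equals 1 on those rows, while the unlabeled rows contain a mix of positives and negatives.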
Usage
data('simulPU')
Format
A list containing the model matrix X, the true response y, the labeled/unlabeled response vector z, and the true positive probability truePY1.
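A quick way to inspect the components listed above is sketched below. It assumes the PUlasso package is installed and is skipped otherwise; the component names are those given in the Format description, and nothing is assumed here about their exact dimensions or storage types.

```r
# Run only if PUlasso is available (guard added for illustration)
if (requireNamespace("PUlasso", quietly = TRUE)) {
  data("simulPU", package = "PUlasso")

  str(simulPU)              # overview of the list and its components
  dim(simulPU$X)            # size of the model matrix
  table(simulPU$z)          # counts of labeled vs. unlabeled observations
  table(simulPU$y)          # counts of true positives and negatives
  simulPU$truePY1           # true positive probability
}
```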
[Package PUlasso version 3.2.5 Index]