simulPU {PUlasso} | R Documentation
Simulated PU data
Description
A simulated dataset for illustration.
Covariates x_i are drawn from N(\mu, I_{5\times 5}) or N(-\mu, I_{5\times 5}), each with probability 0.5.
To make only the first two variables active, we take \mu = [\mu_1, \mu_2, 0, 0, 0]^T and \theta = [\theta_0, \theta_1, \theta_2, 0, 0, 0]^T, with \mu_i = 1.5 and \theta_i \sim Unif[0.5, 1].
Responses y_i are simulated via the logistic model P_\theta(y=1|x) = 1/(1+\exp(-\theta^T x)), with x augmented by a leading 1 for the intercept \theta_0.
1000 observations are sampled from the sub-population of positives (y=1) and labeled, and another 1000 observations are sampled from the original population and left unlabeled.
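The sampling scheme described above can be sketched in base R. This is an illustrative reconstruction, not the code used to build the shipped dataset. It assumes the standard logistic link 1/(1+exp(-\theta^T x)) with an intercept \theta_0. The seed, the pool size, the helper draw_population, and the coding z = 1 for labeled and z = 0 for unlabeled are all assumptions made for this example. The prevalence is only a Monte Carlo estimate from the simulated pool.

```r
# Illustrative sketch only: not the code that generated the shipped simulPU data.
set.seed(1)                               # assumed seed, for reproducibility
p   <- 5                                  # number of covariates
n_l <- 1000                               # labeled (positive) sample size
n_u <- 1000                               # unlabeled sample size
mu    <- c(1.5, 1.5, 0, 0, 0)             # only the first two means are non-zero
theta <- c(runif(3, 0.5, 1), 0, 0, 0)     # theta_0 (intercept), theta_1, theta_2, zeros

# Hypothetical helper: draw n observations (X, y) from the original population.
draw_population <- function(n) {
  s <- sample(c(-1, 1), n, replace = TRUE)            # mixture component, prob. 0.5 each
  X <- matrix(rnorm(n * p), n, p) + outer(s, mu)      # N(mu, I) or N(-mu, I)
  prob <- plogis(drop(cbind(1, X) %*% theta))         # logistic P(y = 1 | x), assumed link
  list(X = X, y = rbinom(n, 1, prob))
}

# Labeled sample: keep the first n_l positives from a large pool (rejection sampling).
pool <- draw_population(20000)            # assumed pool size
pos  <- which(pool$y == 1)
stopifnot(length(pos) >= n_l)
X_l  <- pool$X[pos[seq_len(n_l)], , drop = FALSE]

# Unlabeled sample: a fresh draw from the original population, labels hidden.
unl <- draw_population(n_u)

X <- rbind(X_l, unl$X)                    # model matrix, 2000 x 5
y <- c(rep(1, n_l), unl$y)                # true response
z <- c(rep(1, n_l), rep(0, n_u))          # assumed coding: 1 = labeled, 0 = unlabeled
prevalence_hat <- mean(pool$y)            # Monte Carlo estimate of P(y = 1)
```

By construction, every labeled observation is a true positive, while the unlabeled block mixes positives and negatives in roughly the population proportions.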
Usage
data('simulPU')
Format
A list containing the model matrix X, the true response y, the labeled/unlabeled response vector z, and the true positive probability truePY1.
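A quick way to check the components listed above is to load the data and inspect it. This is a minimal sketch: it assumes the PUlasso package is installed, and that the list elements are named X, y, z and truePY1 as the Format section suggests. The block is skipped if the package is not available.

```r
# Minimal inspection sketch; runs only if PUlasso is installed.
if (requireNamespace("PUlasso", quietly = TRUE)) {
  data("simulPU", package = "PUlasso")    # loads the list 'simulPU' into the workspace
  str(simulPU)                            # overview of the list components
  print(dim(simulPU$X))                   # dimensions of the model matrix
  print(table(z = simulPU$z, y = simulPU$y))  # labeling status against true response
  print(simulPU$truePY1)                  # true positive probability stored with the data
}
```

The cross-tabulation of z against y shows how many true positives remain hidden in the unlabeled sample.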
[Package PUlasso version 3.2.5 Index]