simulPU {PUlasso}    R Documentation

Simulated PU data

Description

A simulated dataset for illustration. Covariates $x_i$ are drawn from $N(\mu, I_{5\times 5})$ or $N(-\mu, I_{5\times 5})$, each with probability 0.5. To make the first two variables active, we take $\mu = [\mu_1, \mu_2, 0, 0, 0]^T$ and $\theta = [\theta_0, \theta_1, \theta_2, 0, 0, 0]^T$, and set $\mu_i = 1.5$ and $\theta_i \sim \mathrm{Unif}[0.5, 1]$. Responses $y_i$ are simulated via the logistic model $P_\theta(y=1|x) = 1/(1+\exp(-\theta^T x))$, with $\theta_0$ acting as the intercept. 1000 observations are sampled from the sub-population of positives ($y=1$) and labeled, and another 1000 observations are sampled from the original population and left unlabeled.
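The sampling scheme described above can be sketched in base R as follows. This is only an illustrative reconstruction, not the code that generated the packaged data: the seed, the pool size of 5000, the helper `draw_population`, and the Monte Carlo estimate `py1_mc` are all hypothetical choices made for this sketch, and treating $\theta_0$ as an intercept is an assumption.

```r
set.seed(1)  # arbitrary seed, used only to make the sketch reproducible

p      <- 5
mu     <- c(1.5, 1.5, 0, 0, 0)           # only the first two means are non-zero
theta0 <- runif(1, 0.5, 1)               # intercept (assumed role of theta_0)
theta  <- c(runif(2, 0.5, 1), 0, 0, 0)   # only the first two slopes are non-zero

# Hypothetical helper: draw n covariate vectors from the two-component
# mixture N(mu, I) / N(-mu, I) and simulate logistic responses for them.
draw_population <- function(n) {
  s    <- sample(c(-1, 1), n, replace = TRUE)        # mixture component, prob 0.5 each
  X    <- matrix(rnorm(n * p), n, p) + outer(s, mu)  # shift standard normals by +/- mu
  prob <- plogis(theta0 + drop(X %*% theta))         # 1 / (1 + exp(-(theta0 + x'theta)))
  list(X = X, y = rbinom(n, 1, prob), prob = prob)
}

# Labeled sample: 1000 observations taken from the positives (y = 1).
pool <- draw_population(5000)
stopifnot(sum(pool$y == 1) >= 1000)
lab_idx <- which(pool$y == 1)[1:1000]

# Unlabeled sample: 1000 fresh observations from the whole population.
unl <- draw_population(1000)

X <- rbind(pool$X[lab_idx, ], unl$X)   # model matrix, labeled rows first
y <- c(rep(1, 1000), unl$y)            # true response
z <- c(rep(1, 1000), rep(0, 1000))     # 1 = labeled, 0 = unlabeled (coding assumed)

# Monte Carlo approximation of the positive prevalence P(y = 1).
py1_mc <- mean(draw_population(1e5)$prob)
```

Note that every labeled row is a true positive by construction, while the unlabeled rows contain a mix of positives and negatives in roughly the population proportion.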

Usage

data('simulPU')

Format

A list containing the model matrix X, the true response y, the labeled/unlabeled response vector z, and the true positive probability truePY1.
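As a brief illustration of the format, the components can be inspected after loading the data. This is a minimal sketch that assumes the PUlasso package is installed and that the loaded object is named `simulPU`; the vector `fields` is introduced here only for the check and is not part of the package.

```r
# Component names as listed in the Format section.
fields <- c("X", "y", "z", "truePY1")

# Run the inspection only if PUlasso is available (assumption: it is installed).
if (requireNamespace("PUlasso", quietly = TRUE)) {
  data("simulPU", package = "PUlasso")

  str(simulPU, max.level = 1)            # overview of the list components
  print(fields %in% names(simulPU))      # check that the documented fields are present
  print(table(z = simulPU$z, y = simulPU$y))  # labels against true responses
  print(simulPU$truePY1)                 # true positive probability
}
```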


[Package PUlasso version 3.2.5 Index]