friedman_data {JOUSBoost}    R Documentation

Simulate data from the Friedman model

Description

Simulate draws from a Bernoulli distribution over c(-1,1), where the log-odds are defined by:

log{p(y=1|x)/p(y=-1|x)} = gamma*(1 - x_1 + x_2 - ... + x_6)*(x_1 + x_2 + ... + x_6)

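where x is distributed as N(0, I_dxd), a d-dimensional standard normal. Only the first six coordinates of x enter the log-odds. See Friedman, Hastie and Tibshirani (2000).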
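
The model above can be sketched in base R as follows. This is an illustration of the formula, not the package's implementation: simulate_friedman_sketch is a hypothetical helper, and the alternating signs in the first factor are an assumption read from the ellipsis in the formula.

```r
# Hypothetical helper, not part of JOUSBoost: a base-R sketch of the model above.
simulate_friedman_sketch <- function(n = 500, d = 10, gamma = 10) {
  stopifnot(d >= 6)                                  # the formula uses x_1, ..., x_6
  X <- matrix(rnorm(n * d), nrow = n, ncol = d)      # rows are draws from N(0, I_dxd)
  X6 <- X[, 1:6, drop = FALSE]                       # only these columns matter
  signs <- c(-1, 1, -1, 1, -1, 1)                    # assumed alternating pattern
  log_odds <- gamma * (1 + drop(X6 %*% signs)) * rowSums(X6)
  p <- plogis(log_odds)                              # inverse logit: p(y = 1 | x)
  y <- ifelse(runif(n) < p, 1, -1)                   # Bernoulli draw coded as -1/1
  list(y = y, X = X, p = p)
}

set.seed(1)
sim <- simulate_friedman_sketch(n = 200, d = 10, gamma = 0.5)
table(sim$y)
```

The returned list mirrors the components described under Value, so it can stand in for the package function when experimenting with the formula.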

Usage

friedman_data(n = 500, d = 10, gamma = 10)

Arguments

n

Number of points to simulate.

d

The dimension of the predictor variable x. The log-odds formula uses x_1 through x_6, so d should be at least 6.

gamma

A parameter controlling the Bayes error, with higher values of gamma corresponding to lower error rates.
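
The effect of gamma on the Bayes error can be checked numerically. The sketch below is an assumption-laden illustration rather than package code: bayes_error_sketch is a hypothetical helper, the signs follow the alternating pattern suggested by the log-odds formula in the Description, and the Bayes error is estimated by Monte Carlo as the average of min(p, 1 - p).

```r
# Hypothetical sketch, not part of JOUSBoost: Bayes error as a function of gamma.
set.seed(2)
n <- 20000
X <- matrix(rnorm(n * 6), nrow = n)          # only x_1, ..., x_6 enter the log-odds
signs <- c(-1, 1, -1, 1, -1, 1)              # assumed alternating signs in the first factor
f <- (1 + drop(X %*% signs)) * rowSums(X)    # log-odds divided by gamma

bayes_error_sketch <- function(gamma) {
  p <- plogis(gamma * f)                     # p(y = 1 | x)
  mean(pmin(p, 1 - p))                       # error of the Bayes rule, averaged over x
}

gammas <- c(0.5, 2, 10)
errs <- sapply(gammas, bayes_error_sketch)
print(data.frame(gamma = gammas, bayes_error = errs))
```

Because the same simulated x values are reused for every gamma, the estimated error decreases as gamma grows, which matches the description of the argument.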

Value

Returns a list with the following components:

y

Vector of n simulated responses, each in c(-1,1).

X

An nxd matrix of simulated predictors.

p

Vector of the true conditional probabilities p(y=1|x), one per row of X.
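
As a rough illustration of how the three components fit together, the sketch below applies the Bayes rule implied by p and compares it with the simulated y. It assumes the JOUSBoost package is installed and does nothing otherwise; bayes_pred is a name introduced here for illustration.

```r
# Assumes the JOUSBoost package is installed; the block is skipped otherwise.
if (requireNamespace("JOUSBoost", quietly = TRUE)) {
  set.seed(111)
  dat <- JOUSBoost::friedman_data(n = 500, d = 10, gamma = 0.5)
  str(dat)                                   # list with components y, X, p
  bayes_pred <- ifelse(dat$p > 0.5, 1, -1)   # Bayes rule from the true probabilities
  print(mean(bayes_pred != dat$y))           # empirical error rate of the Bayes rule
}
```

No fitted classifier can be expected to beat this error rate on average, which makes p a convenient benchmark when evaluating models on the simulated data.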

References

Friedman, J., Hastie, T. and Tibshirani, R. (2000). Additive logistic regression: a statistical view of boosting (with discussion), Annals of Statistics 28: 337-407.

Examples

set.seed(111)
dat = friedman_data(n = 500, gamma = 0.5)


[Package JOUSBoost version 2.1.0 Index]