stg {Rstg}    R Documentation

STG: Feature Selection using Stochastic Gates

Description

STG is a method for feature selection in neural network estimation problems. The procedure is based on a probabilistic relaxation of the l0 norm of the feature vector, i.e., the count of selected features. STG simultaneously learns a nonlinear regression or classification function while selecting a small subset of features, as described in Yamada et al., ICML 2020.
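For intuition, a sketch of the objective from Yamada et al. (ICML 2020) follows; the notation is illustrative and not the package's internal code. Each feature x_d is multiplied by a stochastic gate z_d = max(0, min(1, mu_d + eps_d)) with eps_d ~ N(0, sigma^2), and the expected number of open gates is penalized:

    \min_{\theta,\mu} \; \mathbb{E}_{z}\left[ L\left(f_\theta(x \odot z),\, y\right) \right] + \lambda \sum_{d=1}^{D} \Phi\left(\mu_d / \sigma\right)

where \Phi is the standard Gaussian CDF, so the penalty term equals the expected number of selected features. The 'sigma' and 'lam' arguments below correspond to sigma and lambda in this objective.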

Usage

stg(
  task_type,
  input_dim,
  output_dim,
  hidden_dims,
  activation = "relu",
  sigma = 0.5,
  lam = 0.1,
  optimizer = "Adam",
  learning_rate = 0.001,
  batch_size = 100L,
  freeze_onward = NULL,
  feature_selection = TRUE,
  weight_decay = 0.001,
  random_state = 123L,
  device = "cpu"
)

Arguments

task_type

string; one of 'regression', 'classification', or 'cox'

input_dim

integer; the number of features in your data (the input dimension)

output_dim

integer; the number of classes for 'classification'. Should be 1L for 'regression' and 'cox'

hidden_dims

vector of integers, optional, default: c(60, 20, 3); the sizes of the hidden layers of the neural network

activation

string; the activation function used in the hidden layers, e.g. 'relu' (the default) or 'tanh'

sigma

float; the noise level (standard deviation) of the Gaussian distribution used in the stochastic gates

lam

float; the regularization parameter controlling the number of selected features (larger values yield fewer features)

optimizer

string; choose 'Adam' or 'SGD'

learning_rate

float; the learning rate for the optimizer

batch_size

integer; the number of samples per mini-batch

freeze_onward

integer, optional, default: NULL; if given, the network weights are frozen from the 'freeze_onward'-th epoch onward so that only the gate parameters continue to train

feature_selection

logical; whether to perform feature selection with the stochastic gates (default TRUE)

weight_decay

float; the weight decay (L2 penalty) coefficient for the optimizer

random_state

integer; the seed for the random number generator, for reproducibility

device

string; 'cpu' or 'cuda' (if a GPU is available)

Value

a "stg" object is returned.

Examples

if (pystg_is_available()) {
  n_size <- 1000L
  p_size <- 20L
  # Build an STG regression model over p_size input features.
  stg.model <- stg(task_type = 'regression', input_dim = p_size, output_dim = 1L,
                   hidden_dims = c(500, 50, 10), activation = 'tanh',
                   optimizer = 'SGD', learning_rate = 0.1, batch_size = n_size,
                   feature_selection = TRUE, sigma = 0.5, lam = 0.1,
                   random_state = 123L)  # must be an integer seed, not a float
}
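The constructor above only builds the model. A minimal sketch of training and gate inspection follows, assuming the returned "stg" object exposes the underlying Python STG methods through reticulate; the $fit and $get_gates calls and the nr_epochs argument mirror the Python 'stg' package and are assumptions, not documented Rstg API:

if (pystg_is_available()) {
  set.seed(123)
  n_size <- 1000L; p_size <- 20L
  X <- matrix(rnorm(n_size * p_size), nrow = n_size)
  # Only the first three features carry signal, so STG should keep their gates open.
  y <- matrix(X[, 1] + 2 * X[, 2] - X[, 3] + rnorm(n_size, sd = 0.1), ncol = 1)
  stg.model$fit(X, y, nr_epochs = 1000L)            # assumed: mirrors Python STG.fit
  gate_probs <- stg.model$get_gates(mode = 'prob')  # assumed: per-feature open probabilities
  print(round(gate_probs, 2))
}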


[Package Rstg version 0.0.1 Index]