stabsel {bamlss} | R Documentation |

## Stability selection.

### Description

Performs stability selection based on gradient boosting.

### Usage

```
stabsel(formula, data, family = "gaussian",
q, maxit, B = 100, thr = .9, fraction = 0.5, seed = NULL, ...)
## Plot selection frequencies.
## S3 method for class 'stabsel'
plot(x, show = NULL,
pal = function(n) gray.colors(n, start = 0.9, end = 0.3), ...)
```

### Arguments

`formula` |
A formula or extended formula. |

`data` |
A data frame. |

`family` |
A family specification; the default is `"gaussian"`. |

`q` |
An integer specifying how many terms to select in each boosting run. |

`maxit` |
An integer specifying the maximum number of boosting iterations. See `opt_boost`. |

`B` |
An integer. The boosting is run B times. |

`thr` |
Cut-off threshold of relative frequencies (between 0 and 1) for selection. |

`fraction` |
Numeric between 0 and 1. The fraction of data to be used in each boosting run. |

`seed` |
A seed to be set before the stability selection. |

`x` |
An object of class `stabsel`. |

`show` |
Number of terms to be shown. |

`pal` |
Color palette for different model terms. |

`...` |
Not used yet. |

### Details

`stabsel` performs stability selection based on gradient boosting (`opt_boost`): the boosting algorithm is run `B` times, each time on a randomly drawn `fraction` of the `data`. Each boosting run stops either when `q` terms have been selected or when `maxit` iterations have been performed, so either `q` or `maxit` can be used to tune the regularization of the boosting. After the `B` runs, the relative selection frequency of each term is evaluated. Terms with a relative selection frequency larger than `thr` are suggested for a final regression model.
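The procedure can be sketched in base R. This is only an illustration under stated assumptions, not the code used by `stabsel`: componentwise least-squares boosting with linear base learners stands in for `opt_boost`, and the names `boost_select`, `stab_sketch`, and the step length `nu` are hypothetical.

```r
## Illustrative stand-in for the procedure; NOT the bamlss implementation.
set.seed(42)
n <- 200; p <- 8
X <- matrix(rnorm(n * p), n, p, dimnames = list(NULL, paste0("x", 1:p)))
## Only x1 and x2 carry signal; x3, ..., x8 are pure noise.
y <- 2 * X[, "x1"] - 1.5 * X[, "x2"] + rnorm(n, sd = 0.5)

## Componentwise L2 boosting (assumed stand-in for opt_boost): in each
## iteration every column is fitted to the current residuals and only the
## best-fitting one is updated. The run stops once q distinct terms have
## been selected or after maxit iterations.
boost_select <- function(X, y, q, maxit, nu = 0.1) {
  Xs <- scale(X)                 # centered columns with equal norm
  r <- y - mean(y)               # current residuals
  selected <- character(0)
  for (it in seq_len(maxit)) {
    beta <- drop(crossprod(Xs, r)) / (nrow(Xs) - 1)
    j <- which.max(abs(beta))    # largest reduction in residual sum of squares
    term <- colnames(Xs)[j]
    if (!(term %in% selected)) selected <- c(selected, term)
    if (length(selected) >= q) break
    r <- r - nu * beta[j] * Xs[, j]
  }
  selected
}

## Run the selector B times on random subsets and collect the relative
## selection frequency of every term.
stab_sketch <- function(X, y, q, maxit = 500, B = 100,
                        thr = 0.9, fraction = 0.5) {
  n <- nrow(X)
  hits <- setNames(numeric(ncol(X)), colnames(X))
  for (b in seq_len(B)) {
    idx <- sample(n, floor(fraction * n))
    sel <- boost_select(X[idx, , drop = FALSE], y[idx], q = q, maxit = maxit)
    hits[sel] <- hits[sel] + 1
  }
  freq <- hits / B
  list(freq = freq, selected = names(freq)[freq > thr])
}

res <- stab_sketch(X, y, q = 3, B = 50)
print(round(res$freq, 2))
print(res$selected)
```

In this toy setting a small `q` forces each run to stop early, so a noise column enters in only some of the runs and its relative frequency tends to stay below `thr`.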

If neither `q` nor `maxit` has been specified, `q` is set to the square root of the number of columns in `data`.
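As a rough illustration of this default, the value can be computed by hand. The toy data frame below is hypothetical, and the rounding step is an assumption: this page does not state how a non-integer square root is handled.

```r
## Toy data frame with 16 columns (hypothetical, for illustration only).
d <- as.data.frame(matrix(rnorm(10 * 16), nrow = 10))

## Square root of the number of columns, as described in the text.
q_raw <- sqrt(ncol(d))

## Rounding up is an assumption here, not documented behavior.
q_default <- ceiling(q_raw)
print(q_default)
```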

Gradient boosting does not depend on random numbers. Thus, the individual boosting runs differ only in the subset of data which is used.
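The consequence for reproducibility can be shown with a small base R sketch. It assumes that subsets are drawn with `sample()` after setting a seed, and it uses `lm()` as a stand-in for any deterministic fitting routine; the helper `draw_subset` and the toy data are hypothetical.

```r
n <- 100
fraction <- 0.5
d <- data.frame(x = seq_len(n) / n)
d$y <- 1 + 2 * d$x + sin(seq_len(n))  # fixed, non-random toy response

## Draw a random subset of the row indices after setting the seed.
draw_subset <- function(seed) {
  set.seed(seed)
  sample(n, floor(fraction * n))
}

idx_a <- draw_subset(1)
idx_b <- draw_subset(1)  # same seed, hence the same subset
idx_c <- draw_subset(2)  # different seed, hence a different subset

## A deterministic fit on identical subsets gives identical results.
fit_a <- coef(lm(y ~ x, data = d[idx_a, ]))
fit_b <- coef(lm(y ~ x, data = d[idx_b, ]))
fit_c <- coef(lm(y ~ x, data = d[idx_c, ]))

print(identical(idx_a, idx_b))
print(identical(fit_a, fit_b))
print(rbind(fit_a, fit_c))
```

Because the fit itself adds no randomness, fixing the seed that governs the subsampling is enough to make the whole sequence of runs repeatable.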

### Value

An object of class `stabsel`.

### Author(s)

Thorsten Simon

### Examples

```
## Not run: 
library("bamlss")

## Simulate some data.
set.seed(111)
d <- GAMart()
n <- nrow(d)

## Add some noise variables x4, ..., x9.
for(i in 4:9)
  d[[paste0("x", i)]] <- rnorm(n)

## Build a formula with smooth terms s(x1), ..., s(x9) and te(lon,lat).
f <- paste0("~ ", paste("s(x", 1:9, ")", collapse = "+", sep = ""))
f <- paste(f, "+ te(lon,lat)")
f <- as.formula(f)

## List of two formulas with the same terms; the first has num as response.
f <- list(update(f, num ~ .), f)

## Run stability selection.
sel <- stabsel(f, data = d, q = 6, B = 10)
plot(sel)

## Estimate selected model.
nf <- formula(sel)
b <- bamlss(nf, data = d)
plot(b)

## End(Not run)
```

*bamlss* version 1.2-3