R: Locally weighted models

locw {rchemo}

R Documentation

Locally weighted models

Description

locw and locwlv are generic working functions returning predictions of KNN locally weighted (LW) models. One specific (= local) model is fitted for each observation to predict, and a prediction is returned. See the wrapper lwplsr (KNN-LWPLSR) for an example of use.

In KNN-LW models, the prediction is built in two sequential steps, hereafter referred to as weighting "1" and weighting "2", respectively. For each new observation to predict, the two steps are as follows:

- Weighting "1". The k nearest neighbors (in the training data set) are selected and the prediction model is fitted (in the next step) only on this neighborhood. This is equivalent to giving a weight = 1 to the neighbors and a weight = 0 to the other training observations, i.e. a binary weighting.

- Weighting "2". Each of the k nearest neighbors can optionally receive a weight (different from the usual 1/k) before the model is fitted. The weight depends on the dissimilarity (calculated beforehand) between the observation to predict and the neighbor. This corresponds to a within-neighborhood weighting.
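The two weighting steps above can be sketched in base R for a single new observation. This is only an illustration, not the rchemo implementation: the Euclidean distance and the exponential weight function used here are assumptions chosen for the sketch (in the package, neighbors and weights are typically obtained with getknn and wdist).

```r
set.seed(1)
n <- 50; p <- 30; k <- 5
Xtrain <- matrix(rnorm(n * p), ncol = p)
xnew <- rnorm(p)                      # one new observation to predict

# Weighting "1": select the k nearest neighbors (Euclidean distance assumed)
d <- sqrt(colSums((t(Xtrain) - xnew)^2))
nn <- order(d)[seq_len(k)]            # indexes of the k nearest training rows
w1 <- as.numeric(seq_len(n) %in% nn)  # binary weights: 1 for neighbors, 0 otherwise

# Weighting "2": within-neighborhood weights decreasing with the dissimilarity
# (hypothetical exponential kernel, used only for illustration)
w2 <- exp(-d[nn] / sd(d[nn]))
w2 <- w2 / sum(w2)                    # normalized to sum to 1
```

Here nn plays the role of one component of listnn and w2 the role of the matching component of listw.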

The prediction model used in step "2" has to be defined in a function specified in argument fun. If there are m new observations to predict, a list of m vectors defining the m neighborhoods has to be provided (argument listnn). Each of the m vectors contains the indexes of the nearest neighbors in the training set. The m vectors are not necessarily of the same length, i.e. the neighborhood size can vary between the observations to predict. If there is a weighting in step "2", a list of m vectors of weights has to be provided (argument listw). Then locw fits the model successively on each of the m neighborhoods, and returns the corresponding m predictions.
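The loop described above can be sketched as follows. The function locw_sketch is a hypothetical helper written for this illustration, not the rchemo code: it assumes a weighted linear regression (stats::lm.wfit) as the local model instead of a user-supplied fun, and the simulated data are noiseless so that every local fit recovers the same relation.

```r
# Hypothetical helper (not the rchemo implementation): fits one weighted
# linear model per neighborhood and predicts the corresponding new row.
locw_sketch <- function(Xtrain, Ytrain, X, listnn, listw = NULL) {
  m <- nrow(X)
  pred <- matrix(NA_real_, nrow = m, ncol = ncol(Ytrain))
  for (i in seq_len(m)) {
    nn <- listnn[[i]]                         # neighbors of new observation i
    w <- if (is.null(listw)) rep(1, length(nn)) else listw[[i]]
    w <- w / sum(w)                           # weights normalized to sum to 1
    fm <- lm.wfit(cbind(1, Xtrain[nn, , drop = FALSE]),
                  Ytrain[nn, , drop = FALSE], w)
    pred[i, ] <- drop(c(1, X[i, ]) %*% fm$coefficients)
  }
  pred
}

set.seed(1)
n <- 20; p <- 2; m <- 2
Xtrain <- matrix(rnorm(n * p), ncol = p)
ytrain <- 1 + 2 * Xtrain[, 1] - Xtrain[, 2]   # noiseless response
Ytrain <- cbind(ytrain, 100 * ytrain)
X <- matrix(rnorm(m * p), ncol = p)
listnn <- list(c(1, 4, 7, 9), c(2, 3, 5, 8, 11, 15))   # sizes 4 and 6
listw <- list(c(4, 3, 2, 1), c(6, 5, 4, 3, 2, 1))
pred <- locw_sketch(Xtrain, Ytrain, X, listnn, listw)
pred
```

The two neighborhoods have different sizes, which the loop handles without any special case because each model is fitted independently.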

Function locwlv is dedicated to prediction models based on the calculation of latent variables (LVs), such as PLSR. For such models, it is much faster than locw and is the recommended function.

Usage


locw(Xtrain, Ytrain, X, listnn, listw = NULL, fun, verb = FALSE, ...)

locwlv(Xtrain, Ytrain, X, listnn, listw = NULL, fun, nlv, verb = FALSE, ...)

Arguments

`Xtrain`	Training X-data (`n, p`).
`Ytrain`	Training Y-data (`n, q`).
`X`	New X-data (`m, p`) to predict.
`listnn`	A list of `m` vectors defining weighting "1". Component `i` of this list is a vector (of length between 1 and `n`) of indexes. These indexes define the training observations that are the nearest neighbors of new observation `i`. Typically, `listnn` can be built from `getknn`, but any other list of length `m` can be provided. The `m` vectors can have equal length (i.e. the `m` neighborhoods are of equal size) or not (the number of neighbors varies between the observations to predict).
`listw`	A list of `m` vectors defining weighting "2". Component `i` of this list is a vector (that must have the same length as component `i` of `listnn`) of the weights given to the nearest neighbors when the prediction model is fitted. Internally, weights are "normalized" to sum to 1 in each component. Defaults to `NULL` (weights are set to `1 / k`, where `k` is the size of the neighborhood).
`fun`	A function corresponding to the prediction model to fit on the `m` neighborhoods.
`nlv`	For `locwlv`: the number of LVs to calculate. Can be a vector (e.g. `0:2`), in which case one prediction matrix is returned per number of LVs.
`verb`	Logical. If `TRUE`, fitting information is printed.
`...`	Optional arguments to pass to function `fun`.
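The constraints on `listnn` and `listw` described above (matching lengths, normalization, default `1 / k`) can be illustrated with a short base R sketch. The helper `norm_weights` is hypothetical and is not part of rchemo; it only mimics the behavior stated in the argument descriptions.

```r
# Hypothetical helper (not an rchemo function): checks listnn / listw
# and returns the normalized within-neighborhood weights.
norm_weights <- function(listnn, listw = NULL, n) {
  lapply(seq_along(listnn), function(i) {
    nn <- listnn[[i]]
    k <- length(nn)
    stopifnot(k >= 1, k <= n, all(nn >= 1), all(nn <= n))
    if (is.null(listw)) return(rep(1 / k, k))  # default: uniform weights 1 / k
    w <- listw[[i]]
    stopifnot(length(w) == k)                  # same length as listnn[[i]]
    w / sum(w)                                 # normalized to sum to 1
  })
}

listnn <- list(c(3, 8, 1), c(5, 2, 9, 4, 7))   # neighborhoods of sizes 3 and 5
listw <- list(c(2, 1, 1), c(5, 4, 3, 2, 1))
norm_weights(listnn, listw, n = 10)
norm_weights(listnn, n = 10)                   # listw = NULL
```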

Value

pred

matrix or list of matrices (if nlv is a vector), with predictions

References

Lesnoff M, Metz M, Roger J-M. Comparison of locally weighted PLS strategies for regression and discrimination on agronomic NIR data. Journal of Chemometrics. 2020;34:e3209. doi:10.1002/cem.3209.

Examples


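library(rchemo)

## Simulated training data (n observations, p variables, 2 responses)
n <- 50 ; p <- 30
Xtrain <- matrix(rnorm(n * p), ncol = p, byrow = TRUE)
ytrain <- rnorm(n)
Ytrain <- cbind(ytrain, 100 * ytrain)
## Simulated new data (m observations to predict)
m <- 4
Xtest <- matrix(rnorm(m * p), ncol = p, byrow = TRUE)
ytest <- rnorm(m)
Ytest <- cbind(ytest, 10 * ytest)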

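## Weighting "1": the k nearest neighbors of each new observation
k <- 5
z <- getknn(Xtrain, Xtest, k = k)
listnn <- z$listnn   # indexes of the neighbors
listd <- z$listd     # dissimilarities to the neighbors
listnn
listd

## Weighting "2": weights computed from the dissimilarities
listw <- lapply(listd, wdist, h = 2)
listw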

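nlv <- 2
## Uniform weights (listw = NULL) within each neighborhood
locw(Xtrain, Ytrain, Xtest, 
     listnn = listnn, fun = plskern, nlv = nlv)
## Weights given by listw
locw(Xtrain, Ytrain, Xtest, 
     listnn = listnn, listw = listw, fun = plskern, nlv = nlv)

## Same model, fitted with the function dedicated to LV-based models
locwlv(Xtrain, Ytrain, Xtest, 
     listnn = listnn, listw = listw, fun = plskern, nlv = nlv)
## A vector nlv returns a list of prediction matrices
locwlv(Xtrain, Ytrain, Xtest, 
     listnn = listnn, listw = listw, fun = plskern, nlv = 0:nlv)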

[Package rchemo version 0.1-2 Index]