locw {rchemo}    R Documentation

Locally weighted models

Description

locw and locwlv are generic working functions that return predictions from KNN locally weighted (LW) models. One specific (= local) model is fitted for each observation to predict, and the corresponding prediction is returned. See the wrapper lwplsr (KNN-LWPLSR) for an example of use.

In KNN-LW models, the prediction is built in two sequential steps, hereafter referred to as weighting "1" and weighting "2", respectively. For each new observation to predict, the two steps are as follows:

- Weighting "1". The k nearest neighbors (in the training data set) are selected, and the prediction model is fitted (in the next step) only on this neighborhood. This is equivalent to giving a weight of 1 to the neighbors and a weight of 0 to the other training observations, i.e. a binary weighting.

- Weighting "2". Each of the k nearest neighbors may optionally receive a weight (different from the usual 1/k) before the model is fitted. The weight depends on the dissimilarity (calculated beforehand) between the observation to predict and the neighbor. This corresponds to a within-neighborhood weighting.
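The two steps above can be sketched in base R for a single new observation. This is only an illustration under stated assumptions: the Euclidean distance and the exponential weight function are arbitrary choices made for this example, and the code does not use or reproduce the rchemo internals (in the package, getknn and wdist play these roles).

```r
# Base R sketch of weighting "1" and weighting "2" for one new observation.
# Illustrative only: distance and weight function are assumptions.
set.seed(1)
n <- 50; p <- 3; k <- 5
Xtrain <- matrix(rnorm(n * p), ncol = p)
xnew <- rnorm(p)

# Euclidean distance between the new observation and each training row
d <- sqrt(rowSums(sweep(Xtrain, 2, xnew)^2))

# Weighting "1": keep the k nearest neighbors (binary 1/0 weighting)
nn <- order(d)[seq_len(k)]
w1 <- as.numeric(seq_len(n) %in% nn)

# Weighting "2": within-neighborhood weights decreasing with distance
# (hypothetical weight function, chosen only for illustration)
w2 <- exp(-d[nn] / max(d[nn]))
w2 <- w2 / sum(w2)   # normalize so that the weights sum to 1
```

Here nn plays the role of one component of listnn, and w2 the role of the matching component of listw.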

The prediction model used in step "2" has to be defined in a function specified in argument fun. If there are m new observations to predict, a list of m vectors defining the m neighborhoods has to be provided (argument listnn). Each of the m vectors contains the indexes of the nearest neighbors in the training set. The m vectors are not necessarily of the same length, i.e. the neighborhood size can vary between observations to predict. If there is a weighting in step "2", a list of m vectors of weights has to be provided (argument listw). Then locw fits the model successively on each of the m neighborhoods, and returns the corresponding m predictions.
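The loop over neighborhoods described above can be sketched as follows. To keep the example self-contained, the local model is replaced by a simple weighted mean of the neighbors' responses; the helper locw_sketch is hypothetical and is not the rchemo implementation, which fits the model given in fun instead.

```r
# Hypothetical helper (not part of rchemo): one local "model" per neighborhood.
# The local model here is a weighted mean of the neighbors' responses.
locw_sketch <- function(Ytrain, listnn, listw = NULL) {
  m <- length(listnn)
  pred <- matrix(NA_real_, nrow = m, ncol = ncol(Ytrain))
  for (i in seq_len(m)) {
    s <- listnn[[i]]          # indexes of the neighbors of observation i
    k <- length(s)
    # Uniform weights 1/k by default, else weights normalized to sum to 1
    w <- if (is.null(listw)) rep(1 / k, k) else listw[[i]] / sum(listw[[i]])
    pred[i, ] <- colSums(w * Ytrain[s, , drop = FALSE])
  }
  pred
}

Ytrain <- cbind(1:6, 10 * (1:6))
listnn <- list(c(1, 2), c(4, 5, 6))   # neighborhoods of different sizes
listw <- list(c(1, 3), c(1, 1, 2))    # raw (unnormalized) weights
pred_u <- locw_sketch(Ytrain, listnn)          # uniform weights
pred_w <- locw_sketch(Ytrain, listnn, listw)   # within-neighborhood weights
```

The two neighborhoods have different sizes, which the loop handles without any special case because each component of listnn and listw is processed independently.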

Function locwlv is dedicated to prediction models based on latent variable (LV) calculations, such as PLSR. For such models, it is much faster than locw and is recommended.

Usage


locw(Xtrain, Ytrain, X, listnn, listw = NULL, fun, verb = FALSE, ...)

locwlv(Xtrain, Ytrain, X, listnn, listw = NULL, fun, nlv, verb = FALSE, ...)
  

Arguments

Xtrain

Training X-data (n, p).

Ytrain

Training Y-data (n, q).

X

New X-data (m, p) to predict.

listnn

A list of m vectors defining weighting "1". Component i of this list is a vector (of length between 1 and n) of indexes. These indexes define the training observations that are the nearest neighbors of new observation i. Typically, listnn can be built with getknn, but any other list of length m can be provided. The m vectors can have equal length (i.e. the m neighborhoods are of equal size) or not (the number of neighbors varies between the observations to predict).

listw

A list of m vectors defining weighting "2". Component i of this list is a vector (that must have the same length as component i of listnn) of the weights given to the nearest neighbors when the prediction model is fitted. Internally, weights are "normalized" to sum to 1 in each component. Defaults to NULL (weights are set to 1/k, where k is the size of the neighborhood).

fun

A function corresponding to the prediction model to fit on the m neighborhoods.

nlv

For locwlv: the number of LVs to calculate. Can be a vector (see Value).

verb

Logical. If TRUE, fitting information is printed.

...

Optional arguments to pass to function fun.
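As a small illustration of how listnn and listw relate, the sketch below builds both lists by hand for m = 2 new observations. The index and dissimilarity values are made up for the example, and the inverse-distance weight function is a hypothetical stand-in, not wdist.

```r
# Hand-built inputs for m = 2 new observations (toy values, for illustration)
listnn <- list(c(3, 8, 1), c(5, 2))             # neighbor indexes in the training set
listd  <- list(c(0.2, 0.5, 0.9), c(0.1, 0.4))   # matching dissimilarities

# Hypothetical inverse-distance weights (an assumption, not rchemo's wdist)
listw <- lapply(listd, function(d) 1 / (1 + d))

# Each weight vector must have the same length as its neighborhood
stopifnot(all(lengths(listw) == lengths(listnn)))

# Normalization to sum to 1 within each neighborhood, as described for listw
listw_norm <- lapply(listw, function(w) w / sum(w))
```

Passing raw or normalized weights should lead to the same fit, since the description of listw states that weights are normalized internally.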

Value

pred

A matrix of predictions, or a list of such matrices if nlv is a vector.

References

Lesnoff M, Metz M, Roger J-M. Comparison of locally weighted PLS strategies for regression and discrimination on agronomic NIR data. Journal of Chemometrics. 2020;n/a(n/a):e3209. doi:10.1002/cem.3209.

Examples


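library(rchemo)

## Simulated training data: n observations, p predictors, 2 responses
n <- 50 ; p <- 30
Xtrain <- matrix(rnorm(n * p), ncol = p, byrow = TRUE)
ytrain <- rnorm(n)
Ytrain <- cbind(ytrain, 100 * ytrain)
## Simulated new observations to predict
m <- 4
Xtest <- matrix(rnorm(m * p), ncol = p, byrow = TRUE)
ytest <- rnorm(m)
Ytest <- cbind(ytest, 10 * ytest)

## Weighting "1": the k nearest neighbors of each new observation
k <- 5
z <- getknn(Xtrain, Xtest, k = k)
listnn <- z$listnn
listd <- z$listd
listnn
listd

## Weighting "2": within-neighborhood weights computed from the dissimilarities
listw <- lapply(listd, wdist, h = 2)
listw

nlv <- 2
## Without (listw = NULL) and with within-neighborhood weights
locw(Xtrain, Ytrain, Xtest,
     listnn = listnn, fun = plskern, nlv = nlv)
locw(Xtrain, Ytrain, Xtest,
     listnn = listnn, listw = listw, fun = plskern, nlv = nlv)

## LV-based version, with a single value and then a vector for nlv
locwlv(Xtrain, Ytrain, Xtest,
       listnn = listnn, listw = listw, fun = plskern, nlv = nlv)
locwlv(Xtrain, Ytrain, Xtest,
       listnn = listnn, listw = listw, fun = plskern, nlv = 0:nlv)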


[Package rchemo version 0.1-2]