kneigh.condquant {condmixt} R Documentation

## Conditional quantile estimation from nearest neighbors.

### Description

Conditional quantile estimation from k-nearest neighbors in the explanatory variable space.

### Usage

kneigh.condquant(x, y, k = 10, p = 0.9)


### Arguments

| Argument | Description |
|---|---|
| x | Matrix of explanatory (independent) variables of dimension d x n, where d is the number of variables and n is the number of examples (patterns). |
| y | Vector of n dependent variables. |
| k | Number of neighbors, default is 10. |
| p | Probability level, default is 0.9. |
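Because x is expected with one column per example (d x n), data stored in the more common n x d layout need transposing first. The following base R sketch shows a few ways to get that shape; the variable names and values are made up for illustration.

```r
# Made-up data for illustration only.
x1 <- c(0.1, 0.4, 0.5, 0.7, 0.9)
x2 <- c(1.0, 2.0, 3.0, 4.0, 5.0)

# One explanatory variable: t() turns a length-n vector into a 1 x n matrix.
x_single <- t(x1)

# Two explanatory variables: stack them as rows so each column is one example.
x_double <- rbind(x1, x2)

# A conventional n x d data matrix (examples in rows) must be transposed.
x_rows       <- cbind(x1, x2)
x_transposed <- t(x_rows)

dim(x_single)
dim(x_double)
dim(x_transposed)
```

In each case the number of columns matches the length of y, which is what the d x n convention requires.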

### Details

For each example j (each column) in the matrix x, its k nearest neighbors in terms of Euclidean distance are identified. Let j1,..., jk be the k nearest neighbors. Then, the conditional quantile is estimated by computing the sample quantile over y[j1],...,y[jk].
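The procedure can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's implementation: knn_condquant_sketch is a hypothetical helper, it treats each example as one of its own neighbors, and it uses the default type of R's quantile function. The package may differ on both points.

```r
# Hypothetical helper, not part of condmixt: a base R sketch of the
# k-nearest-neighbor conditional quantile described above.
knn_condquant_sketch <- function(x, y, k = 10, p = 0.9) {
  x <- as.matrix(x)                       # d x n: one column per example
  stopifnot(ncol(x) == length(y), k <= length(y))
  dmat <- as.matrix(dist(t(x)))           # n x n Euclidean distances between columns
  vapply(seq_len(ncol(x)), function(j) {
    nb <- order(dmat[j, ])[seq_len(k)]    # assumption: example j is its own neighbor
    unname(quantile(y[nb], probs = p))    # assumption: default quantile type (7)
  }, numeric(1))
}

# Simulated data for illustration only.
set.seed(1)
x <- matrix(runif(200), nrow = 1)         # d = 1, n = 200
y <- 3 * x[1, ] + rnorm(200, sd = 0.1)
q <- knn_condquant_sketch(x, y, k = 10, p = 0.9)
length(q)
```

The full distance matrix costs memory quadratic in n, which is acceptable for a sketch but not for large samples.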

### Value

A vector of length n containing the estimated conditional quantile for each example.

### Author(s)

Julie Carreau

### References

Bishop, C. M. (1995), Neural Networks for Pattern Recognition, Oxford University Press.

### See Also

quantile

### Examples

library(condmixt)

# generate training data from a Frechet model
ntrain <- 500
xtrain <- runif(ntrain)
ytrain <- rfrechet(ntrain, loc = 3*xtrain + 1, scale = 0.5*xtrain + 0.001,
                   shape = xtrain + 1)
plot(xtrain, ytrain, pch = 22) # plot training data

# compute the 0.99 quantile from the generative model
qgen <- qfrechet(0.99, loc = 3*xtrain + 1, scale = 0.5*xtrain + 0.001,
                 shape = xtrain + 1)
points(xtrain, qgen, pch = ".", col = "orange")

# compute the estimated quantile; t(xtrain) is the 1 x ntrain input matrix
kquant <- kneigh.condquant(t(xtrain), ytrain, p = 0.99)
points(xtrain, kquant, pch = "o", col = "blue")
# sample quantiles are not good in the presence of heavy-tailed data

# generate log-normal training data
ytrain <- rlnorm(ntrain, meanlog = 3*xtrain + 1, sdlog = 0.5*xtrain + 0.001)
dev.new()
plot(xtrain, ytrain, pch = 22) # plot training data

# compute the 0.99 quantile from the generative model
qgen <- qlnorm(0.99, meanlog = 3*xtrain + 1, sdlog = 0.5*xtrain + 0.001)
points(xtrain, qgen, pch = ".", col = "orange")

# compute the estimated quantile
kquant <- kneigh.condquant(t(xtrain), ytrain, p = 0.99)
points(xtrain, kquant, pch = "o", col = "blue") # a bit more reasonable



[Package condmixt version 1.1 Index]