WH.regression.two.components {HistDAWass} | R Documentation |
Multiple regression analysis for histogram variables based on a two component model and L2 Wasserstein distance
Description
The function implements multiple regression analysis for histogram variables based on a two-component model and the L2 Wasserstein distance. Taking as input a dependent histogram variable and a set of explanatory histogram variables, the method returns a least squares estimate of a two-component regression model based on the decomposition of the L2 Wasserstein metric for distributional data.
Usage
WH.regression.two.components(data, Yvar, Xvars, simplify = FALSE, qua = 20)
Arguments
data |
A MatH object (a matrix of distributionH). |
Yvar |
An integer, the dependent variable number in data. |
Xvars |
A set of integers, the explanatory variables in data. |
simplify |
A logical argument (default = FALSE). If TRUE, only a few equally spaced quantiles are considered (to speed up the algorithm). |
qua |
If |
Details
A two-component regression model is implemented. The observed variables are histogram variables, as defined in the framework of Symbolic Data Analysis, and the parameters of the model are estimated using the classic least squares method. The error between the observed and the predicted distributions is measured with the L2 Wasserstein distance. This metric makes it possible to predict the response variable as a direct linear combination of the independent histogram variables.
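The decomposition of the L2 Wasserstein metric referred to above can be illustrated with a minimal base R sketch. It assumes that each distribution is represented by its quantile function evaluated on a common grid of equally spaced probability levels. The two normal distributions and all variable names are illustrative assumptions and are not part of HistDAWass. Under that approximation, the squared distance splits exactly into a squared difference of means plus a squared distance between the centered quantile functions.

```r
# Illustrative sketch only: two distributions represented by their
# quantile functions on a common grid of probability levels.
p  <- seq(0.005, 0.995, by = 0.01)      # 100 equally spaced levels
q1 <- qnorm(p, mean = 10, sd = 2)       # quantiles of distribution 1
q2 <- qnorm(p, mean = 13, sd = 1)       # quantiles of distribution 2

# Grid approximation of the squared L2 Wasserstein distance:
# the average squared difference between the two quantile functions.
d2_total <- mean((q1 - q2)^2)

# Component 1: squared difference between the means.
m1 <- mean(q1)
m2 <- mean(q2)
d2_location <- (m1 - m2)^2

# Component 2: squared distance between the centered quantile functions,
# which captures differences in variability and shape.
d2_shape <- mean(((q1 - m1) - (q2 - m2))^2)

# The two components add up to the total squared distance.
print(c(total = d2_total, location = d2_location, shape = d2_shape))
```

The first component depends only on the means and the second only on the centered quantile functions, so the two parts of the distributions can be handled separately.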
Value
A named vector with the estimated model parameters.
References
Irpino A, Verde R (in press 2015). Linear regression for numeric symbolic variables: a least squares approach based on Wasserstein Distance. Advances in Data Analysis and Classification, ISSN: 1862-5347, DOI:10.1007/s11634-015-0197-7
An extended version is available on the arXiv repository: arXiv:1202.1436v2, https://arxiv.org/abs/1202.1436v2
Examples
# Regress variable 1 of BLOOD on variables 2 and 3 (load the package first with library(HistDAWass))
model.parameters <- WH.regression.two.components(data = BLOOD, Yvar = 1, Xvars = 2:3)