dbstats-package {dbstats} | R Documentation |
This package contains functions for distance-based prediction methods.
These are methods for prediction where predictor information is coded as a matrix of distances between individuals.
In the currently implemented methods the response is a univariate variable as in the ordinary linear model or in the generalized linear model.
Distances can either be directly input as an distances matrix,
a squared distances matrix, an inner-products matrix
(see GtoD2
) or computed from observed
explanatory variables.
Notation convention: in distance-based methods we must distinguish observed explanatory variables which we denote by Z or z, from Euclidean coordinates which we denote by X or x. For explanation on the meaning of both terms see the bibliography references below.
Observed explanatory variables z are possibly a mixture of continuous and qualitative explanatory variables or more general quantities.
dbstats does not provide specific functions for computing distances, depending instead on other functions and packages, such as:
dist
in the stats package.
dist
in the proxy package. When the
proxy package is loaded, its dist
function
supersedes the one in the stats package.
daisy
in the cluster package.
Compared to both instances of dist
above whose input must be
numeric variables, the main feature of daisy
is
its ability to handle other variable types as well (e.g. nominal, ordinal,
(a)symmetric binary) even when different types occur in the same data set.
Actually the last statement is not hundred percent true: it refers only to
the default behaviour of both dist
functions, whereas the
dist
function in the proxy package can
evaluate distances between observations with a user-provided function,
entered as a parameter, hence it can deal with any type of data. See the
examples in pr_DB
.
Functions of dbstats package:
Linear and local linear models with a continuous response:
dblm
for distance-based linear models.
ldblm
for local distance-based linear models.
dbplsr
for distance-based partial least squares.
Generalized linear and local generalized linear models with a numeric response:
dbglm
for distance-based generalized linear models.
ldbglm
for local distance-based generalized linear models.
Package: | dbstats |
Type: | Package |
Version: | 2.0.1 |
Date: | 2022-12-07 |
License: | GPL-2 |
LazyLoad: | yes |
Boj, Eva <evaboj@ub.edu>, Caballe, Adria <adria.caballe@upc.edu>, Delicado, Pedro <pedro.delicado@upc.edu> and Fortiana, Josep <fortiana@ub.edu>
Boj E, Caballe, A., Delicado P, Esteve, A., Fortiana J (2016). Global and local distance-based generalized linear models. TEST 25, 170-195.
Boj E, Delicado P, Fortiana J (2010). Distance-based local linear regression for functional predictors. Computational Statistics and Data Analysis 54, 429-437.
Boj E, Grane A, Fortiana J, Claramunt MM (2007). Implementing PLS for distance-based regression: computational issues. Computational Statistics 22, 237-248.
Boj E, Grane A, Fortiana J, Claramunt MM (2007). Selection of predictors in distance-based regression. Communications in Statistics B - Simulation and Computation 36, 87-98.
Cuadras CM, Arenas C, Fortiana J (1996). Some computational aspects of a distance-based model for prediction. Communications in Statistics B - Simulation and Computation 25, 593-609.
Cuadras C, Arenas C (1990). A distance-based regression model for prediction with mixed data. Communications in Statistics A - Theory and Methods 19, 2261-2279.
Cuadras CM (1989). Distance analysis in discrimination and classification using both continuous and categorical variables. In: Y. Dodge (ed.), Statistical Data Analysis and Inference. Amsterdam, The Netherlands: North-Holland Publishing Co., pp. 459-473.