| dissimilarity {resemble} | R Documentation | 
Dissimilarity computation between matrices
Description
This is a wrapper that integrates the different dissimilarity functions
offered by the package. It computes the dissimilarities between observations in
numerical matrices using a specified dissimilarity measure.
Usage
dissimilarity(Xr, Xu = NULL,
              diss_method = c("pca", "pca.nipals", "pls", "mpls",
                              "cor", "euclid", "cosine", "sid"),
              Yr = NULL, gh = FALSE, pc_selection = list("var", 0.01),
              return_projection = FALSE, ws = NULL,
              center = TRUE, scale = FALSE, documentation = character(),
              ...)
Arguments
| Xr | a matrix containing n observations/rows and p variables/columns. | 
| Xu | an optional matrix containing data of a second set of observations
with p variables/columns. | 
| diss_method | a character string indicating the method to be used to
compute the dissimilarities between observations. Options are:
 
"pca": Mahalanobis distance
computed on the matrix of scores of a Principal Component (PC)
projection of Xr (and Xu if provided). PC projection is
done using the singular value decomposition (SVD) algorithm.
See the ortho_diss function.
"pca.nipals": Mahalanobis distance
computed on the matrix of scores of a Principal Component (PC)
projection of Xr (and Xu if provided). PC projection is
done using the non-linear iterative partial least squares (nipals)
algorithm. See the ortho_diss function.
"pls": Mahalanobis distance
computed on the matrix of scores of a partial least squares projection
of Xr (and Xu if provided). In this case, Yr is
always required. See the ortho_diss function.
"mpls": Mahalanobis distance
computed on the matrix of scores of a modified partial least squares
projection (Shenk and Westerhaus, 1991; Westerhaus, 2014)
of Xr (and Xu if provided). In this case, Yr is
always required. See the ortho_diss function.
"cor": based on the correlation coefficient
between observations. See the cor_diss function.
"euclid": Euclidean distance
between observations. See the f_diss function.
"cosine": Cosine distance
between observations. See the f_diss function.
"sid": spectral information divergence between
observations. See the sid function.
 | 
| Yr | a numeric matrix of n observations used as side information of Xr for the ortho_diss methods (i.e. pca, pca.nipals or pls). It is required when diss_method is "pls" or "mpls", when gh = TRUE, or when "opc" is used in pc_selection. | 
| gh | a logical indicating if the Mahalanobis distance (in the pls score
space) between each observation and the pls centre/mean must be
computed. | 
| pc_selection | a list of length 2 to be passed onto the
ortho_diss methods. It is required if the method selected in diss_method is any of "pca", "pca.nipals" or "pls" or if gh = TRUE. This argument is used for
optimizing the number of components (principal components or pls factors)
to be retained. This list must contain two elements in the following order: method (a character indicating the method for selecting the number of
components) and value (a numerical value that complements the selected
method). The methods available are: 
"opc": optimized principal component selection based on
Ramirez-Lopez et al. (2013a, 2013b). The optimal number of components
(of a set of observations) is the one for which its distance matrix
minimizes the differences between the Yr value of each
observation and the Yr value of its closest observation. In this
case, value must be a value (larger than 0 and
below the minimum dimension of Xr or Xr and Xu combined) indicating the maximum
number of principal components to be tested. See the ortho_projection function for more details.
"cumvar": selection of the principal components based
on a given cumulative amount of explained variance. In this case, value must be a value (larger than 0 and below or equal to 1)
indicating the minimum amount of cumulative variance that the
combination of retained components should explain.
"var": selection of the principal components based
on a given amount of explained variance. In this case, value must be a value (larger than 0 and below or equal to 1)
indicating the minimum amount of variance that a single component
should explain in order to be retained.
"manual": for manually specifying a fixed number of
principal components. In this case, value must be a value
(larger than 0 and
below the minimum dimension of Xr or Xr and Xu combined)
indicating the number of components to be retained.
 The default is list(method = "var", value = 0.01). Optionally, the pc_selection argument admits "opc" or "cumvar" or "var" or "manual" as a single character
string. In such a case, the default "value" when either "opc" or "manual" is used is 40. When "cumvar" is used, the default "value" is set to 0.99 and when "var" is used, the default "value" is set to 0.01. | 
| return_projection | a logical indicating if the projection(s) must be
returned. Projections are used if the ortho_diss methods are
called (i.e. diss_method = "pca", diss_method = "pca.nipals" or diss_method = "pls") or when gh = TRUE.
In case gh = TRUE and an ortho_diss method is used (in the diss_method argument), both projections are returned. | 
| ws | an odd integer value which specifies the window size, when
diss_method = "cor" (cor_diss method) for moving
correlation dissimilarity. If ws = NULL (default), then the window
size will be equal to the number of variables (columns), i.e. instead of moving
correlation, the normal correlation will be used. See the cor_diss function. | 
| center | a logical indicating if Xr (and Xu if provided)
must be centered. If Xu is provided, the data is centered around the
mean of the pooled Xr and Xu matrices (\(Xr \cup Xu\)). For
dissimilarity computations based on diss_method = "pls", the data is
always centered. | 
| scale | a logical indicating if Xr (and Xu if
provided) must be scaled. If Xu is provided, the data is scaled based
on the standard deviation of the pooled Xr and Xu matrices
(\(Xr \cup Xu\)). If center = TRUE, scaling is applied after
centering. | 
| documentation | an optional character string that can be used to
describe anything related to the mbl call (e.g. description of the
input data). Default: character(). NOTE: this is an experimental
argument. | 
| ... | other arguments passed to the dissimilarity functions
(ortho_diss, cor_diss, f_diss or sid). | 
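To illustrate what the "pca" option and the "var" and "cumvar" rules of pc_selection describe, the following base R sketch pools and centers the data, projects it with svd(), keeps components by explained variance, and computes Mahalanobis distances in the score space. The function name pc_mahalanobis_sketch and its arguments rule and value are hypothetical. This is not the code used by ortho_diss, whose scaling conventions may differ.

```r
# Hypothetical sketch (base R only): Mahalanobis dissimilarity on PC scores.
pc_mahalanobis_sketch <- function(Xr, Xu = NULL, rule = c("var", "cumvar"),
                                  value = 0.01, scale = FALSE) {
  rule <- match.arg(rule)
  # Pool Xr and Xu so centering (and optional scaling) use pooled statistics
  X <- if (is.null(Xu)) Xr else rbind(Xr, Xu)
  X <- base::scale(X, center = TRUE, scale = scale)
  # PC projection through the singular value decomposition
  sv <- svd(X)
  explained <- sv$d^2 / sum(sv$d^2)
  n_keep <- switch(rule,
    # "var": keep components that individually explain at least `value`
    var = sum(explained >= value),
    # "cumvar": keep the first components that jointly explain at least `value`
    cumvar = which(cumsum(explained) >= value - 1e-12)[1]
  )
  n_keep <- max(1, n_keep)
  keep <- seq_len(n_keep)
  scores <- sv$u[, keep, drop = FALSE] %*% diag(sv$d[keep], nrow = n_keep)
  # PC scores are uncorrelated, so dividing each column by its standard
  # deviation turns Euclidean distance into Mahalanobis distance
  z <- sweep(scores, 2, apply(scores, 2, sd), "/")
  d <- as.matrix(dist(z))
  nr <- nrow(Xr)
  if (!is.null(Xu)) {
    # Rows: observations in Xr; columns: observations in Xu
    d <- d[seq_len(nr), nr + seq_len(nrow(Xu)), drop = FALSE]
  }
  list(dissimilarity = d, n_components = n_keep, scores = scores)
}

# Usage with simulated data (not real spectra)
set.seed(1)
Xr_demo <- matrix(rnorm(40 * 6), nrow = 40)
Xu_demo <- matrix(rnorm(10 * 6), nrow = 10)
res <- pc_mahalanobis_sketch(Xr_demo, Xu_demo, rule = "cumvar", value = 0.9)
dim(res$dissimilarity)
```

The sketch always centers the data because the shortcut of standardizing the scores relies on them being uncorrelated with zero mean.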
Details
This function is a wrapper for ortho_diss, cor_diss,
f_diss and sid. Check the documentation of these
functions for further details.
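The wrapper idea can be pictured with a small base R sketch: a single entry point validates diss_method and dispatches to a method-specific helper. Every function name below (euclid_sketch, cosine_sketch, cor_sketch, dissimilarity_sketch) is hypothetical. The formulas are textbook definitions and are only assumed to resemble what f_diss and cor_diss compute, since those functions may apply different scaling.

```r
# Hypothetical helpers: each returns an nrow(Xr) x nrow(Xu) matrix.
euclid_sketch <- function(Xr, Xu) {
  # Pairwise Euclidean distances via ||a||^2 + ||b||^2 - 2 a.b
  sq <- outer(rowSums(Xr^2), rowSums(Xu^2), "+") - 2 * tcrossprod(Xr, Xu)
  sqrt(pmax(sq, 0))
}
cosine_sketch <- function(Xr, Xu) {
  # Angle (in radians) between observations: acos of the cosine similarity
  sim <- tcrossprod(Xr, Xu) / outer(sqrt(rowSums(Xr^2)), sqrt(rowSums(Xu^2)))
  acos(pmin(pmax(sim, -1), 1))
}
cor_sketch <- function(Xr, Xu) {
  # Assumed convention: (1 - r) / 2, so 0 means perfectly correlated
  (1 - cor(t(Xr), t(Xu))) / 2
}

# Hypothetical wrapper: one interface, several dissimilarity measures
dissimilarity_sketch <- function(Xr, Xu = NULL,
                                 diss_method = c("euclid", "cosine", "cor")) {
  diss_method <- match.arg(diss_method)
  if (is.null(Xu)) Xu <- Xr
  fun <- switch(diss_method,
    euclid = euclid_sketch,
    cosine = cosine_sketch,
    cor = cor_sketch
  )
  list(dissimilarity = fun(Xr, Xu))
}

# Usage with simulated data
set.seed(42)
A <- matrix(runif(5 * 8), nrow = 5)
B <- matrix(runif(3 * 8), nrow = 3)
e <- dissimilarity_sketch(A, B, diss_method = "euclid")$dissimilarity
co <- dissimilarity_sketch(A, diss_method = "cosine")$dissimilarity
cr <- dissimilarity_sketch(A, diss_method = "cor")$dissimilarity
```

Returning a list even when it holds a single matrix mirrors the Value section, where further components (projection, gh) can be added without changing how the result is accessed.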
Value
A list with the following components:
- dissimilarity: the resulting dissimilarity matrix.
 
- projection: an ortho_projection object. Only output
if return_projection = TRUE and if diss_method = "pca", diss_method = "pca.nipals", diss_method = "pls" or diss_method = "mpls".
This object contains the projection used to compute
the dissimilarity matrix. In case of local dissimilarity matrices,
the projection corresponds to the global projection used to select the
neighborhoods (see the ortho_diss function for further
details).
 
- gh: a list containing the GH distances as well as the
pls projection used to compute the GH.
 
Author(s)
Leonardo Ramirez-Lopez
References
Shenk, J., Westerhaus, M., and Berzaghi, P. 1997. Investigation of a LOCAL
calibration procedure for near infrared instruments. Journal of Near Infrared
Spectroscopy, 5, 223-232.
Westerhaus, M. 2014. Eastern Analytical Symposium Award for outstanding
achievements in near infrared spectroscopy: my contributions to
near infrared spectroscopy. NIR news, 25(8), 16-20.
See Also
ortho_diss, cor_diss, f_diss, sid
Examples
library(prospectr)
data(NIRsoil)
# Filter the data using the first derivative with Savitzky and Golay
# smoothing filter and a window size of 15 spectral variables and a
# polynomial order of 4
sg <- savitzkyGolay(NIRsoil$spc, m = 1, p = 4, w = 15)
# Replace the original spectra with the filtered ones
NIRsoil$spc <- sg
Xu <- NIRsoil$spc[!as.logical(NIRsoil$train), ]
Yu <- NIRsoil$CEC[!as.logical(NIRsoil$train)]
Yr <- NIRsoil$CEC[as.logical(NIRsoil$train)]
Xr <- NIRsoil$spc[as.logical(NIRsoil$train), ]
Xu <- Xu[!is.na(Yu), ]
Xr <- Xr[!is.na(Yr), ]
Yu <- Yu[!is.na(Yu)]
Yr <- Yr[!is.na(Yr)]
dsm_pca <- dissimilarity(
  Xr = Xr, Xu = Xu,
  diss_method = c("pca"),
  Yr = Yr, gh = TRUE,
  pc_selection = list("opc", 30),
  return_projection = TRUE
)
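The call above selects the number of components with "opc". As described under pc_selection, the rule picks the number of components whose distance matrix minimizes the difference between the Yr value of each observation and that of its closest observation. The base R sketch below illustrates the idea on simulated data. The function name opc_sketch, the use of a root mean squared difference as the criterion, and the simulated inputs are assumptions, not the package implementation (see ortho_projection for the actual procedure).

```r
# Hypothetical sketch of the "opc" idea using PCA scores (base R only).
opc_sketch <- function(X, y, max_pc) {
  Xc <- scale(X, center = TRUE, scale = FALSE)
  sv <- svd(Xc)
  # Standardized PC scores (unit variance per column): Euclidean distance on
  # the first k columns equals Mahalanobis distance in that score space
  Z <- sv$u[, seq_len(max_pc), drop = FALSE] * sqrt(nrow(X) - 1)
  rmsd <- vapply(seq_len(max_pc), function(k) {
    d <- as.matrix(dist(Z[, seq_len(k), drop = FALSE]))
    diag(d) <- Inf                 # an observation is not its own neighbor
    nn <- apply(d, 1, which.min)   # index of the closest observation
    sqrt(mean((y - y[nn])^2))      # mismatch in y between nearest neighbors
  }, numeric(1))
  list(n_components = which.min(rmsd), rmsd = rmsd)
}

# Simulated data: y depends on two latent directions of X
set.seed(123)
n <- 60
latent <- matrix(rnorm(n * 2), nrow = n)
X_demo <- latent %*% matrix(rnorm(2 * 10), nrow = 2) +
  matrix(rnorm(n * 10, sd = 0.1), nrow = n)
y_demo <- latent[, 1] + 0.5 * latent[, 2]
sel <- opc_sketch(X_demo, y_demo, max_pc = 8)
sel$n_components
```

Here max_pc plays the role of the value element in pc_selection = list("opc", 30): it only bounds the search, and the number of components actually retained is chosen from the data.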
[Package resemble version 2.2.3]