imputeSOM {missSOM} | R Documentation |
The Self-Organizing Maps with Built-in Missing Data Imputation.
Description
imputeSOM is an extension of the online algorithm of the 'kohonen' package in which missing data are imputed while the map is trained.
All missing values are first imputed with initial values such as the mean of the observed variables.
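The initial imputation step can be pictured with a few lines of base R. This is only a sketch, assuming the starting values are the column means of the observed entries; init_mean_impute is a hypothetical helper written for illustration and is not a function of the package.

```r
# Hypothetical helper (not part of missSOM): fill each missing entry with
# the mean of the observed values in the same column.
init_mean_impute <- function(X) {
  X <- as.matrix(X)
  col_means <- colMeans(X, na.rm = TRUE)   # means over observed values only
  miss <- which(is.na(X), arr.ind = TRUE)  # (row, col) index of each missing cell
  X[miss] <- col_means[miss[, "col"]]      # look up the mean of the matching column
  X
}

# Small toy matrix with one missing value per column
X <- matrix(c(1, NA, 3,
              4, 5, NA), nrow = 3)
X0 <- init_mean_impute(X)
print(X0)
```

These starting values only give the algorithm a complete matrix to begin with; per the description above, the imputation itself happens during the algorithm.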
Usage
imputeSOM(
data,
grid = somgrid(),
rlen = 100,
alpha = c(0.05, 0.01),
radius = quantile(nhbrdist, 2/3),
maxNA.fraction = 1,
keep.data = TRUE,
dist.fcts = NULL,
init
)
Arguments
data |
a |
grid |
a grid for the codebook vectors: see |
rlen |
the number of times the complete data set will be presented to the network. |
alpha |
learning rate, a vector of two numbers indicating the amount of change. Default is to decline linearly from 0.05 to 0.01 over |
radius |
the radius of the neighbourhood, either given as a single number or a vector (start, stop). If it is given as a single
number the radius will change linearly from |
maxNA.fraction |
the maximal fraction of NA values a column may contain without being removed. |
keep.data |
if TRUE, return original data and mapping information. If FALSE, only return the trained map (in essence the codebook vectors). |
dist.fcts |
distance function to be used for the data. Admissible values currently are "sumofsquares", "euclidean" and "manhattan". Default is to use "sumofsquares". |
init |
a |
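To make the alpha argument concrete, the following sketch interpolates a learning rate linearly between the two default values. The number of steps, n_steps, is a hypothetical choice made for illustration; the code does not claim to reproduce the schedule used inside the package.

```r
alpha <- c(0.05, 0.01)  # default start and end values from the Usage section
n_steps <- 5            # hypothetical number of steps, for illustration only

# Linear interpolation from alpha[1] down to alpha[2]
alpha_schedule <- seq(alpha[1], alpha[2], length.out = n_steps)
print(alpha_schedule)
```

A larger first value lets early updates move the codebook vectors more, while the smaller final value limits the changes made late in training.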
Value
An object of class "missSOM" with components
data |
Data matrix, only returned if |
ximp |
Imputed data matrix. |
unit.classif |
Winning units for data objects, only returned if |
distances |
Distances of objects to their corresponding winning unit, only returned if |
grid |
The grid, an object of class |
codes |
A list of matrices containing codebook vectors. |
alpha, radius |
Input arguments presented to the function. |
maxNA.fraction |
The maximal fraction of NA values a column may contain without being removed. |
dist.fcts |
The distance function used for the data. |
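The effect of maxNA.fraction can be illustrated in base R. The sketch below assumes that a column is kept when its share of NA values does not exceed the threshold; the exact comparison used by the package is not stated here, and na_share, keep_cols and X_kept are hypothetical names.

```r
# Toy data: column b is 75% missing, column c is 25% missing
X <- cbind(a = c(1, 2, 3, 4),
           b = c(NA, NA, NA, 4),
           c = c(1, NA, 3, 4))
maxNA.fraction <- 0.5

na_share <- colMeans(is.na(X))            # fraction of NA values per column
keep_cols <- na_share <= maxNA.fraction   # assumed rule: keep if share <= threshold
X_kept <- X[, keep_cols, drop = FALSE]    # drop = FALSE keeps the matrix shape
print(colnames(X_kept))
```

With the default maxNA.fraction = 1 shown in Usage, this rule would keep every column, including one that is entirely missing.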
See Also
somgrid, plot.missSOM, map.missSOM
Examples
data(wines)
## Data with no missing values
som.wines <- imputeSOM(scale(wines), grid = somgrid(5, 5, "hexagonal"))
summary(som.wines)
print(dim(som.wines$data))
## Data with missing values
X <- scale(wines)
set.seed(1)  # make the randomly chosen rows reproducible
missing_obs <- sample(1:nrow(wines), 10, replace = FALSE)
X[missing_obs, 1:2] <- NaN
som.wines <- imputeSOM(X, grid = somgrid(5, 5, "hexagonal"))
summary(som.wines)
print(dim(som.wines$ximp))
print(sum(is.na(som.wines$ximp)))