The trainSOM function returns a somRes class object which contains the outputs of the algorithm.
a data frame or matrix containing the observations to be mapped on the grid by the SOM algorithm.
Further arguments to be passed to the function initSOM for specifying the parameters of the algorithm. The default values of the arguments maxit and dimension are calculated according to the SOM type if the user does not set them:
maxit: equal to (number of rows + number of columns) * 5 if the SOM type is korresp, and to number of rows * 5 for all other SOM types.
dimension: for a korresp SOM, approximately equal to the square root of the number of observations to be classified divided by 10, but never smaller than 5 or larger than 10.
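The default rules above can be restated as a short base R sketch. The helpers default_maxit and default_dimension are hypothetical names introduced for illustration and are not functions of the package. The sketch also assumes that "square root of the number of observations divided by 10" means sqrt(n / 10), rounded up before clamping; the package may compute it differently.

```r
# Hypothetical helpers restating the documented default rules;
# they are NOT part of the package.
default_maxit <- function(x.data, type = "numeric") {
  if (type == "korresp") {
    (nrow(x.data) + ncol(x.data)) * 5
  } else {
    nrow(x.data) * 5
  }
}

# Assumption: the rule is read as sqrt(n / 10), rounded up,
# then clamped to the interval [5, 10].
default_dimension <- function(n.obs) {
  side <- ceiling(sqrt(n.obs / 10))
  max(5, min(10, side))
}

default_maxit(iris[, 1:4])     # 150 rows * 5
default_dimension(nrow(iris))  # small data set, so the lower bound applies
```

Under these assumptions, a small data set such as iris hits the lower bound on the grid size, while very large data sets are capped at the upper bound.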
an object of class somRes.
an object of class somRes.
The trainSOM function returns an object of class somRes which contains the following components:
the final classification of the data.
the final coordinates of the prototypes.
the final energy of the map. For the numeric case, energy with data having missing entries is based on data imputation as described in Cottrell and Letrémy (2005b).
a list containing some intermediate backups of the prototype coordinates, clustering, energy and the indexes of the recorded backups, if nb.save is set to a value larger than 1.
the original dataset used to train the algorithm.
a list of the map's parameters, which is an object of class paramSOM as produced by the function initSOM.
The function summary.somRes also provides an ANOVA (ANalysis Of VAriance) of each numeric input variable as a function of the map's clusters. This helps to see which variables contribute to the clustering.
The version of the SOM algorithm implemented in this package is the stochastic version.
Several variants able to handle non-vectorial data are also implemented in their stochastic versions: type = "korresp" for contingency tables, as described in Cottrell et al. (2004) (with weights as in Cottrell and Letrémy, 2005a); type = "relational" for dissimilarity matrices, as described in Olteanu and Villa-Vialaneix (2015), with the fast implementation introduced in Mariette et al. (2017).
Missing values are handled as described in Cottrell and Letrémy (2005b): the missing entries of the selected observation are not used during winner computation or prototype updates. This makes it possible to impute missing entries with the corresponding entries of the cluster prototype (with impute).
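The idea behind this treatment of missing values can be illustrated with a minimal base R sketch. It is not the package's implementation: the prototype matrix and the observation are made-up toy values, the prototype update step is omitted, and squared Euclidean distance is assumed.

```r
# Toy prototypes (one row per map unit) and an observation with one
# missing entry; all values are invented for illustration.
prototypes <- rbind(c(5.0, 3.4, 1.5, 0.2),
                    c(5.9, 2.8, 4.3, 1.3),
                    c(6.6, 3.0, 5.6, 2.0))
obs <- c(6.0, NA, 4.5, 1.4)

observed <- !is.na(obs)

# Winner computation: squared Euclidean distance restricted to the
# observed entries of the selected observation
dist2 <- rowSums(sweep(prototypes[, observed, drop = FALSE], 2, obs[observed])^2)
winner <- which.min(dist2)

# Imputation: fill the missing entries with the corresponding entries
# of the winning prototype
imputed <- obs
imputed[!observed] <- prototypes[winner, !observed]
imputed
```

Restricting the distance to the observed entries means an incomplete observation can still be assigned to a unit, and that unit's prototype then supplies plausible values for the gaps.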
summary produces a complete summary of the results that displays the parameters of the SOM, quality criteria and ANOVA. For type = "numeric", the ANOVA is performed for each input variable and tests the difference of this variable across the clusters of the map. For type = "relational", a dissimilarity ANOVA is performed (Anderson, 2001), except that in the present version a crude estimate of the p-value is used, which is based on the Fisher distribution and not on a permutation test.
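For the numeric case, the kind of per-variable ANOVA described above can be sketched in base R. This is only an illustration under stated assumptions: k-means labels stand in for the map's clusters so that the example does not depend on the package, and the table layout is chosen here and is not the output of summary.

```r
# Stand-in clustering: k-means is used only to obtain cluster labels;
# it is NOT the SOM algorithm.
set.seed(1)
clusters <- factor(kmeans(iris[, 1:4], centers = 3)$cluster)

# One-way ANOVA of each numeric variable against the cluster labels,
# keeping the F statistic and its p-value
anova_by_variable <- t(sapply(iris[, 1:4], function(variable) {
  tab <- anova(lm(variable ~ clusters))
  c(F = tab[["F value"]][1], pvalue = tab[["Pr(>F)"]][1])
}))
anova_by_variable
```

A large F statistic with a small p-value for a variable indicates that its mean differs across clusters, which is how the ANOVA table helps identify the variables that drive the clustering.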
Warning! Recording intermediate backups with the argument nb.save can strongly increase the computational time, since calculating the entire clustering and the energy is time-consuming. Use this option with care and only when it is strictly necessary.
Anderson M.J. (2001) A new method for non-parametric multivariate analysis of variance. Austral Ecology, 26, 32-46.
Kohonen T. (2001) Self-Organizing Maps. Berlin/Heidelberg: Springer-Verlag, 3rd edition.
Cottrell M., Ibbou S., Letrémy P. (2004) SOM-based algorithms for qualitative variables. Neural Networks, 17, 1149-1167.
Cottrell M., Letrémy P. (2005a) How to use the Kohonen algorithm to simultaneously analyse individuals in a survey. Neurocomputing, 21, 119-138.
Cottrell M., Letrémy P. (2005b) Missing values: processing with the Kohonen algorithm. Proceedings of Applied Stochastic Models and Data Analysis (ASMDA 2005), 489-496.
Olteanu M., Villa-Vialaneix N. (2015) On-line relational and multiple relational SOM. Neurocomputing, 147, 15-30.
Mariette J., Rossi F., Olteanu M., Villa-Vialaneix N. (2017) Accelerating stochastic kernel SOM. In: M. Verleysen, XXVth European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning (ESANN 2017), i6doc, Bruges, Belgium, 269-274.
See initSOM for a description of the parameters to pass to the trainSOM function to change its behavior and plot.somRes to plot the outputs of the algorithm.
library(SOMbrero)
# Run trainSOM on the iris data with the default settings
# (maxit defaults to 5 times the number of rows, i.e. 750 iterations here)
iris.som <- trainSOM(x.data = iris[, 1:4])
iris.som
#> Self-Organizing Map object...
#> online learning, type: numeric
#> 5 x 5 grid with square topology
#> neighbourhood type: gaussian
#> distance type: euclidean
summary(iris.som)
#>
#> Summary
#>
#> Class : somRes
#>
#> Self-Organizing Map object...
#> online learning, type: numeric
#> 5 x 5 grid with square topology
#> neighbourhood type: gaussian
#> distance type: euclidean
#>
#> Final energy : 0.8543395
#> Topographic error: 0
#>
#> ANOVA :
#>
#> Degrees of freedom : 14
#>
#>                    F pvalue significativity
#> Sepal.Length  42.669      0             ***
#> Sepal.Width   18.012      0             ***
#> Petal.Length 295.589      0             ***
#> Petal.Width  160.906      0             ***
#>