Dolnicar, Sara; Leisch, Friedrich
Evaluation of Structure and Reproducibility of Cluster Solutions Using the Bootstrap.
Department of Statistics: Technical Reports, Nr. 63
Segmentation results derived using cluster analysis depend on (1)
the structure of the data and (2) algorithm parameters. Typically,
neither is the data structure assessed before clustering, nor is the
sensitivity of the analysis to changes in algorithm parameters
examined. We propose a benchmarking framework based on
bootstrapping techniques that accounts for sample and algorithm
randomness. This provides much-needed guidance to both data analysts
and users of clustering solutions regarding the choice of the final
clusters from computations that are exploratory in nature.
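
The abstract does not spell out the procedure, so the following is only a minimal sketch of one way a bootstrap stability check of this kind could look, not the report's actual implementation. It assumes k-means as the clustering algorithm, random initial centers as the source of algorithm randomness, and the adjusted Rand index as the agreement measure. All function names (kmeans, assign, adjusted_rand, bootstrap_stability) are hypothetical and chosen for illustration.

```python
import random
from collections import Counter
from math import comb


def assign(p, centers):
    """Index of the nearest center (squared Euclidean distance)."""
    return min(range(len(centers)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])))


def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means with random initial centers; returns the centers."""
    centers = rng.sample(points, k)  # algorithm randomness: random start
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[assign(p, centers)].append(p)
        # Recompute each center as its group's mean; keep the old center
        # if a group ends up empty.
        centers = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centers[i]
            for i, g in enumerate(groups)
        ]
    return centers


def adjusted_rand(a, b):
    """Adjusted Rand index between two label lists of equal length."""
    n = len(a)
    sum_ij = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(a).values())
    sum_b = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:  # degenerate case, e.g. one cluster in both
        return 1.0
    return (sum_ij - expected) / (max_index - expected)


def bootstrap_stability(data, k, n_pairs=20, seed=0):
    """Agreement scores for n_pairs pairs of bootstrap-based partitions."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_pairs):
        labelings = []
        for _ in range(2):
            # Sample randomness: resample the data with replacement.
            sample = rng.choices(data, k=len(data))
            centers = kmeans(sample, k, rng)
            # Compare both solutions on the same points: the original data.
            labelings.append([assign(p, centers) for p in data])
        scores.append(adjusted_rand(*labelings))
    return scores
```

Under these assumptions, calling bootstrap_stability(data, k) on a list of numeric tuples for several candidate values of k gives one distribution of agreement scores per k. Scores concentrated near 1 suggest a solution that is reproduced across resamples and restarts, while low or widely spread scores suggest that the partition depends heavily on the particular sample or starting point.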