Dolnicar, Sara; Leisch, Friedrich (2009): Evaluation of Structure and Reproducibility of Cluster Solutions Using the Bootstrap. Department of Statistics: Technical Reports, No. 63

Segmentation results derived using cluster analysis depend on (1) the structure of the data and (2) algorithm parameters. Typically, the data structure is not assessed in advance of clustering, nor is the sensitivity of the analysis to changes in algorithm parameters examined. We propose a benchmarking framework based on bootstrapping techniques that accounts for both sample and algorithm randomness. This provides much-needed guidance to data analysts and users of clustering solutions on choosing the final clusters from computations that are exploratory in nature.
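
As an illustration only, the idea of benchmarking cluster solutions with the bootstrap might be sketched as follows. The abstract does not name a clustering algorithm or an agreement measure, so this sketch assumes k-means with random starting centroids (the source of algorithm randomness) and the adjusted Rand index; the function names `kmeans`, `nearest`, `adjusted_rand`, and `bootstrap_stability`, the toy data, and all parameter values are hypothetical. For each replicate, two bootstrap samples are drawn (sample randomness), each is clustered, both solutions are used to label the original data, and the agreement of the two labelings is recorded.

```python
import random
from math import comb
from statistics import median

def nearest(p, centroids):
    """Index of the centroid closest to point p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])))

def kmeans(points, k, rng, iters=20):
    """Plain Lloyd's k-means with random initial centroids; returns the centroids."""
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Keep the previous centroid if a cluster ends up empty.
        centroids = [tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[i]
                     for i, g in enumerate(groups)]
    return centroids

def adjusted_rand(a, b):
    """Adjusted Rand index between two labelings of the same items."""
    cells, rows, cols = {}, {}, {}
    for x, y in zip(a, b):
        cells[(x, y)] = cells.get((x, y), 0) + 1
        rows[x] = rows.get(x, 0) + 1
        cols[y] = cols.get(y, 0) + 1
    index = sum(comb(c, 2) for c in cells.values())
    sum_r = sum(comb(c, 2) for c in rows.values())
    sum_c = sum(comb(c, 2) for c in cols.values())
    expected = sum_r * sum_c / comb(len(a), 2)
    max_index = (sum_r + sum_c) / 2
    if max_index == expected:  # degenerate case, e.g. both labelings have one cluster
        return 1.0
    return (index - expected) / (max_index - expected)

def bootstrap_stability(points, k, n_boot=10, seed=0):
    """One agreement score per replicate, each from a pair of bootstrap samples."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_boot):
        labelings = []
        for _ in range(2):
            sample = rng.choices(points, k=len(points))  # resample with replacement
            centroids = kmeans(sample, k, rng)           # random initialisation
            labelings.append([nearest(p, centroids) for p in points])
        scores.append(adjusted_rand(labelings[0], labelings[1]))
    return scores

# Toy data (an assumption for this sketch): three Gaussian blobs in two dimensions.
data_rng = random.Random(1)
data = [(data_rng.gauss(cx, 0.5), data_rng.gauss(cy, 0.5))
        for cx, cy in [(0, 0), (5, 0), (0, 5)] for _ in range(20)]

for k in range(2, 5):
    print(k, round(median(bootstrap_stability(data, k)), 3))
```

Comparing the distribution of scores across candidate numbers of clusters, rather than a single run, is what lets an analyst judge how reproducible each solution is; values close to 1 indicate that nearly the same partition is recovered from replicate to replicate.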