1987
We define a method to estimate the number of clusters in a data set E, using the bootstrap technique. This approach involves the generation of several "fake" data sets by sampling patterns with replacement in E (bootstrapping). For each number, K, of clusters, a measure of stability of the K-cluster partitions over the bootstrap samples is used to characterize the significance of the K-cluster partition for the original data set. The value of K which provides the most stable partitions is the estimate of the number of clusters in E. The performance of this new technique is demonstrated on both synthetic and real data, and is applied to the segmentation of range images.
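The bootstrap-stability idea above can be sketched in a few lines. This is a minimal illustration, not the paper's exact procedure: it uses k-means as the clustering algorithm and the adjusted Rand index as the stability measure, both of which are assumptions for the sake of a runnable example.

```python
# Hedged sketch: pick K by measuring how stable K-cluster partitions are
# across bootstrap resamples. KMeans and adjusted Rand index are stand-ins
# for whatever clusterer and agreement measure one prefers.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def bootstrap_stability(X, k, n_boot=20, seed=0):
    rng = np.random.default_rng(seed)
    # Reference partition fitted on the original data set
    ref = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X)
    scores = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(X), len(X))   # sample with replacement ("fake" data set)
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X[idx])
        # Compare the labels each model assigns to the full original data
        scores.append(adjusted_rand_score(ref.predict(X), km.predict(X)))
    return float(np.mean(scores))

# Toy data: three well-separated Gaussian clusters
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.2, (50, 2)) for c in (0, 3, 6)])
best_k = max(range(2, 6), key=lambda k: bootstrap_stability(X, k))
```

The value of K maximizing the mean stability score is then taken as the estimate of the number of clusters.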
Marketing Letters, 2010
Segmentation results derived using cluster analysis depend on (1) the structure of the data and (2) algorithm parameters. Typically, neither the data structure is assessed in advance of clustering, nor is the sensitivity of the analysis to changes in algorithm parameters investigated. We propose a benchmarking framework based on bootstrapping techniques that accounts for sample and algorithm randomness. This provides much needed guidance both to data analysts and users of clustering solutions regarding the choice of the final clusters from computations which are exploratory in nature.
Journal of Classification, 2015
Because of its deterministic nature, K-means does not yield confidence information about centroids and estimated cluster memberships, although this could be useful for inferential purposes. In this paper we propose to arrive at such information by means of a non-parametric bootstrap procedure, the performance of which is tested in an extensive simulation study. Results show that the coverage of hyper-ellipsoid bootstrap confidence regions for the centroids is in general close to the nominal coverage probability. For the cluster memberships, we found that probabilistic membership information derived from the bootstrap analysis can be used to improve the cluster assignment of individual objects, albeit only in the case of a very large number of clusters. However, in the case of smaller numbers of clusters, the probabilistic membership information still appeared to be useful as it indicates for which objects the cluster assignment resulting from the analysis of the original data is likely to be correct; hence, this information can be used to construct a partial clustering in which the latter objects only are assigned to clusters.
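The abstract's core idea, bootstrapping K-means to obtain uncertainty information about the centroids, can be illustrated briefly. This sketch is an assumption-laden simplification: it resolves label switching by nearest-centroid matching and reports per-coordinate bootstrap standard errors rather than the paper's hyper-ellipsoid confidence regions.

```python
# Illustrative non-parametric bootstrap for K-means centroids.
# Label switching across resamples is handled by matching each reference
# centroid to its nearest bootstrap centroid (an assumption; the paper's
# hyper-ellipsoid regions are not reproduced here).
import numpy as np
from sklearn.cluster import KMeans

def bootstrap_centroids(X, k, n_boot=50, seed=0):
    rng = np.random.default_rng(seed)
    ref = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(X).cluster_centers_
    reps = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(X), len(X))   # resample rows with replacement
        cen = KMeans(n_clusters=k, n_init=10,
                     random_state=seed).fit(X[idx]).cluster_centers_
        # Align bootstrap centroids with the reference centroids
        order = [int(np.argmin(np.linalg.norm(cen - r, axis=1))) for r in ref]
        reps.append(cen[order])
    reps = np.array(reps)                       # shape (n_boot, k, d)
    return ref, reps.std(axis=0)                # per-centroid, per-dim bootstrap SE

rng = np.random.default_rng(2)
X = np.vstack([rng.normal(m, 0.3, (80, 2)) for m in (0.0, 4.0)])
centers, se = bootstrap_centroids(X, 2)
```

The spread of the aligned bootstrap centroids gives a rough picture of centroid uncertainty that deterministic K-means alone cannot provide.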
This paper proposes a maximum clustering similarity (MCS) method for determining the number of clusters in a data set by studying the behavior of similarity indices comparing two (of several) clustering methods. The similarity between the two clusterings is calculated at the same number of clusters, using the indices of Rand (R), Fowlkes and Mallows (FM), and Kulczynski (K) each corrected for chance agreement. The number of clusters at which the index attains its maximum is a candidate for the optimal number of clusters. The proposed method is applied to simulated bivariate normal data, and further extended for use in circular data. Its performance is compared to the criteria discussed in Tibshirani, Walther, and Hastie (2001). The proposed method is not based on any distributional or data assumption which makes it widely applicable to any type of data that can be clustered using at least two clustering algorithms.
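The MCS idea can be sketched as follows. This is a rough illustration under stated assumptions: it uses the adjusted Rand index as the chance-corrected similarity and K-means versus Ward-linkage agglomerative clustering as the two methods; the paper also considers the Fowlkes-Mallows and Kulczynski indices.

```python
# Hedged sketch of maximum clustering similarity (MCS): cluster the data
# with two different methods at each candidate k, score their agreement
# with a chance-corrected index, and take the k maximizing that agreement.
import numpy as np
from sklearn.cluster import KMeans, AgglomerativeClustering
from sklearn.metrics import adjusted_rand_score

def mcs_estimate(X, k_range=range(2, 8)):
    sims = {}
    for k in k_range:
        a = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        b = AgglomerativeClustering(n_clusters=k, linkage="ward").fit_predict(X)
        sims[k] = adjusted_rand_score(a, b)   # chance-corrected agreement at this k
    return max(sims, key=sims.get), sims

# Toy data: four well-separated bivariate normal clusters
rng = np.random.default_rng(3)
X = np.vstack([rng.normal(m, 0.25, (60, 2))
               for m in ((0, 0), (3, 0), (0, 3), (3, 3))])
k_hat, sims = mcs_estimate(X)
```

Because the method only needs two partitions to compare, it imposes no distributional assumptions on the data itself.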
Psychometrika, 1985
A Monte Carlo evaluation of 30 procedures for determining the number of clusters was conducted on artificial data sets which contained either 2, 3, 4, or 5 distinct nonoverlapping clusters. To provide a variety of clustering solutions, the data sets were analyzed by four hierarchical clustering methods. External criterion measures indicated excellent recovery of the true cluster structure by the methods at the correct hierarchy level. Thus, the clustering present in the data was quite strong. The simulation results for the stopping rules revealed a wide range in their ability to determine the correct number of clusters in the data. Several procedures worked fairly well, whereas others performed rather poorly. Thus, the latter group of rules would appear to have little validity, particularly for data sets containing distinct clusters. Applied researchers are urged to select one or more of the better criteria. However, users are cautioned that the performance of some of the criteria may be data dependent.
2012
We introduce a novel approach for determination of clusters from unlabeled data sets. We investigate a new method called Extended Support Vector Machine (ESVM) along with the existing Dark Block Extraction (DBE), which is based on an existing algorithm for Visual Assessment of Cluster Tendency (VAT) of a data set, using several common image and signal processing techniques. Its basic steps include: 1) generating a VAT image of an input dissimilarity matrix, 2) performing image segmentation on the VAT image to obtain a binary image, followed by directional morphological filtering, 3) applying a distance transform to the filtered binary image and projecting the pixel values onto the main diagonal axis of the image to form a projection signal, 4) smoothing the projection signal, computing its first-order derivative and then detecting major peaks and valleys in the resulting signal to decide the number of clusters, and 5) applying the K-means algorithm to the major peaks. We also implement the Cluster Count Extraction ...
IEEE Transactions on Knowledge and Data Engineering, 2010
Visual methods have been widely studied and used in data cluster analysis. Given a pairwise dissimilarity matrix D of a set of n objects, visual methods such as the VAT algorithm generally represent D as an n x n image I(D) where the objects are reordered to reveal hidden cluster structure as dark blocks along the diagonal of the image. A major limitation of such methods is their inability to highlight cluster structure when D contains highly complex clusters. This paper addresses this limitation by proposing a Spectral VAT algorithm, where D is mapped to D' in a graph embedding space and then reordered using the VAT algorithm. A strategy for automatic determination of the number of clusters in I(D') is then proposed, as well as a visual method for cluster formation from I(D') based on the difference between diagonal blocks and off-diagonal blocks. A sampling-based extended scheme is also proposed to enable visual cluster analysis for large data sets. Extensive experimental results on several synthetic and real-world data sets validate our algorithms. Index Terms: clustering, VAT, cluster tendency, spectral embedding, out-of-sample extension. 1 INTRODUCTION. A general question in the data mining community is how to organize observed data into meaningful structures (or taxonomies). As a tool of exploratory data analysis [36], cluster analysis aims at grouping objects of a similar kind into their respective categories. Given a data set O comprising n objects {o_1, o_2, ..., o_n} (e.g., fish, flowers, beers, etc.), (crisp) clustering partitions the data into c groups C_1, C_2, ..., C_c, so that C_i ∩ C_j = ∅ if i ≠ j and C_1 ∪ C_2 ∪ ... ∪ C_c = O. There have been a large number of data clustering algorithms in the recent literature [24].
In general, clustering of unlabeled data poses three major problems: 1) assessing cluster tendency, i.e., how many clusters to seek or what is the value of c?, 2) partitioning the data into c groups, and 3) validating the c clusters discovered. Given "only" a pairwise dissimilarity matrix D ∈ R^(n x n) representing a data set of n objects (i.e., the original object data is not necessarily available), this paper addresses the first two problems, i.e., determining the number of clusters c prior to clustering and partitioning the data into c clusters. Most clustering algorithms require the number of clusters c as an input parameter, so the quality of the resulting clusters is largely dependent on the estimation of c.
Computational Statistics & Data Analysis, 2001
A cluster methodology, motivated via density estimation, is proposed. It is based on the idea of estimating the population clusters, which, following , are defined as the connected parts of the "substantial" support of the underlying density. The empirical clusters are defined by analogy in terms of the substantial support of a convolution (kernel-type) density estimator. The sample observations are grouped into data clusters according to the empirical cluster they belong to. An algorithm to implement the method, based on resampling ideas, is proposed. It allows one either to automatically choose the number of clusters or to give this number as an input. Some theoretical and practical aspects are briefly discussed and a simulation study is given. The results show a good performance of our method, in terms of efficiency and robustness, when compared with two classical cluster algorithms: k-means and single linkage. Finally, a real-data example is discussed.
IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics), 2011
Evaluation of how well the extracted clusters fit the true partitions of a data set is one of the fundamental challenges in unsupervised clustering because the data structure and the number of clusters are unknown a priori. Cluster validity indices are commonly used to select the best partitioning from different clustering results; however, they are often inadequate unless clusters are well separated or have parametrical shapes. Prototype-based clustering (finding of clusters by grouping the prototypes obtained by vector quantization of the data), which is becoming increasingly important for its effectiveness in the analysis of large high-dimensional data sets, adds another dimension to this challenge. For validity assessment of prototype-based clusterings, previously proposed indices (mostly devised for the evaluation of point-based clusterings) usually perform poorly. The poor performance is made worse when the validity indices are applied to large data sets with complicated cluster structure. In this paper, we propose a new index, Conn_Index, which can be applied to data sets with a wide variety of clusters of different shapes, sizes, densities, or overlaps. We construct Conn_Index based on inter- and intra-cluster connectivities of prototypes. Connectivities are defined through a "connectivity matrix", which is a weighted Delaunay graph where the weights indicate the local data distribution. Experiments on synthetic and real data indicate that Conn_Index outperforms the existing validity indices used in this paper for the evaluation of prototype-based clustering results. Index Terms: cluster validity index, complex data structure, connectivity, Conn_Index, prototype-based clustering. I. INTRODUCTION. Unsupervised clustering aims to extract the natural partitions in a data set without a priori class information. It groups the data samples into subsets so that samples within a subset are more similar to each other than to samples in other subsets. Any given clustering method can produce a different partitioning depending on its parameters and criteria. This leads to one of the main challenges in clustering: to determine, without auxiliary information, how well the obtained clusters fit the natural partitions of the data set. The common approach for this evaluation is to use validity indices. A meaningful validity
Studies in classification, data analysis, and knowledge organization, 2022
For partitioning clustering methods, the number of clusters has to be determined in advance. One approach to dealing with this issue is stability indices. In this paper, several stability-based validation methods are investigated with regard to the k-prototypes algorithm for mixed-type data. The stability-based approaches are compared to common validation indices in a comprehensive simulation study in order to analyze their preferability as a function of the underlying data-generating process.
1997
Much work has been published on methods for assessing the probable number of clusters or structures within unknown data sets. This paper aims to look in more detail at two methods, a broad parametric method, based around the assumption of Gaussian clusters and the other a non-parametric method which utilises methods of scale-space filtering to extract robust structures within a data set.
2001
Whereas estimating the number of clusters is directly involved in the first steps of unsupervised classification procedures, the problem still remains topical. In our attempt to propose a solution, we focus on procedures that do not make any assumptions about the cluster shapes. Indeed, the classification approach we use is based on the estimation of the probability density function (PDF) using the Parzen-Rosenblatt method. The modes of the PDF lead to the construction of influence zones which are intrinsically related to the number of clusters. In this paper, using different sizes of kernel and different samplings of the data set, we study the effects they imply on the relation between influence zones and the number of clusters. This ends up in a proposal of a method for counting the clusters. It is illustrated in simulated conditions and then applied to experimental results chosen from the field of multi-component image segmentation.
Neural computation, 2001
We introduce a method for validation of results obtained by clustering analysis of data. The method is based on resampling the available data. A figure of merit that measures the stability of clustering solutions against resampling is introduced. Clusters which are stable against resampling give rise to local maxima of this figure of merit. This is presented first for a one-dimensional data set, for which an analytic approximation for the figure of merit is derived and compared with numerical measurements. Next, the applicability of the method is demonstrated for higher dimensional data, including gene microarray expression data.
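A simplified illustration of this resampling-based validation idea follows. It is a sketch under stated assumptions: it subsamples the data without replacement, reclusters with k-means, and measures agreement on the shared points with the adjusted Rand index; the exact figure of merit in the paper differs.

```python
# Hedged sketch of resampling-based cluster validation: recluster random
# subsamples and measure how well the subsample labels agree with the
# full-data labels on the shared points. Stable solutions score near 1.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import adjusted_rand_score

def figure_of_merit(X, k, frac=0.8, n_rep=20, seed=0):
    rng = np.random.default_rng(seed)
    full = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X)
    agree = []
    for _ in range(n_rep):
        idx = rng.choice(len(X), int(frac * len(X)), replace=False)
        sub = KMeans(n_clusters=k, n_init=10, random_state=seed).fit_predict(X[idx])
        agree.append(adjusted_rand_score(full[idx], sub))  # agreement on shared points
    return float(np.mean(agree))

# One-dimensional toy data, echoing the abstract's first demonstration
rng = np.random.default_rng(5)
X = np.concatenate([rng.normal(0, 0.1, 100), rng.normal(5, 0.1, 100)]).reshape(-1, 1)
fom = figure_of_merit(X, 2)
```

Values of k whose partitions survive resampling produce local maxima of this score.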
International Journal of ADVANCED AND APPLIED SCIENCES
Discovering patterns in big data is an important step toward actionable insights. The clustering method is used to identify the data pattern by splitting the data set into clusters with associated variables. Various research works have proposed bootstrap methods for clustering array data, but there is a weak view of statistical or theoretical results and of measures of model consistency or stability. The purpose of this paper is to assess model stability and cluster consistency of the K-number of clusters by using bootstrap sampling patterns with replacement. In addition, we present a reasonable number of clusters via bootstrap methods and study the significance of the K-number of clusters for the original data set by looking at the value of K that provides the most stable clusters. Practically, the bootstrap is used to measure the accuracy of estimation and to analyze the stability of the outcomes of cluster methods. We discuss the performance of suggestion clusters through ru...
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000
An evaluation of four clustering methods and four external criterion measures was conducted with respect to the effect of the number of clusters, dimensionality, and relative cluster sizes on the recovery of true cluster structure. The four methods were the single link, complete link, group average (UPGMA), and Ward's minimum variance algorithms. The results indicated that the four criterion measures were generally consistent with each other, of which two highly similar pairs were identified. The first pair consisted of the Rand and corrected Rand statistics, and the second pair was the Jaccard and the Fowlkes and Mallows indexes. With respect to the methods, recovery was found to improve as the number of clusters increased and as the number of dimensions increased. The relative cluster size factor produced differential performance effects, with Ward's procedure providing the best recovery when the clusters were of equal size. The group average method gave equivalent or better recovery when the clusters were of unequal size.
appliedmathgroup.org
This new method for characterizing clusters is based on the simulation of a diffusion-like process. A resolution parameter, R, is introduced such that when assigned successive values from an increasing sequence, it is possible to detect the following:
Procedia Computer Science, 2018
Multi-scale data, containing structures at different scales of shape and density, is very common in both synthetic and real-world settings. However, clustering this kind of data accurately is a hard problem. Choosing an appropriate number of clusters is the first step, important and not easy. In this paper, we propose a lightweight method, DP-Dip, to estimate the number of clusters. Different from many popular methods, DP-Dip does not make any assumptions about the data distribution and admits only one assumption: each cluster is a unimodal distribution. Besides, the method avoids complicated formulas and calculations, instead employing recursion until the final numbers are determined. Specifically, this algorithm first finds the density peaks to obtain the initial numbers, then splits some clusters according to the modality-testing result, and finally merges some clusters if they should be combined.
Journal of Biomedical Science and Engineering, 2012
A new method (the Contrast statistic) for estimating the number of clusters in a set of data is proposed. The technique uses the output of the self-organising map clustering algorithm, comparing the change in dependence of the "Contrast" value upon the number of clusters to that expected under a uniform distribution. A simulation study shows that the Contrast statistic can be used successfully both when the variables describing the object in a multi-dimensional space are independent (ideal objects) and when they are dependent (real biological objects).
The VAT algorithm is a visual method for determining the possible number of clusters in, or the cluster tendency of, a set of objects. The improved VAT (iVAT) algorithm uses a graph-theoretic distance transform to improve the effectiveness of the VAT algorithm for "tough" cases where VAT fails to accurately show the cluster tendency. In this paper we present an efficient formulation of the iVAT algorithm which reduces the computational complexity of the iVAT algorithm from O(N^3) to O(N^2). We also prove a direct relationship between the VAT image and the iVAT image produced by our efficient formulation. We conclude with three examples displaying clustering tendencies in three of the Karypis data sets that illustrate the improvement offered by the iVAT transformation. We also provide a comparison of iVAT images to those produced by the Reverse Cuthill-McKee (RCM) algorithm; our examples suggest that iVAT is superior to the RCM method of display.
Journal of Statistical Planning and Inference, 2003
We propose a hybrid clustering method, hierarchical ordered partitioning and collapsing hybrid (HOPACH), which is a hierarchical tree of clusters. The methodology combines the strengths of both partitioning and agglomerative clustering methods. At each node, a cluster is partitioned into two or more smaller clusters with an enforced ordering of the clusters. Collapsing steps uniting the two closest clusters into one cluster can be used to correct for errors made in the partitioning steps. We implement a version of HOPACH which optimizes a measure of clustering strength, such as average silhouette, at each partitioning and collapsing step. An important benefit of a hierarchical tree is that one can look at clusters at increasing levels of detail. We propose to visualize the clusters at any level of the tree by plotting the distance matrix corresponding with an ordering of the clusters and an ordering of elements within the clusters. A final ordered list of elements is obtained by running down the tree completely. The bootstrap can be used to establish the reproducibility of the clusters and the overall variability of the followed procedure. The power of the methodology compared to current algorithms is illustrated with simulated and publicly available cancer gene expression data sets.
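One ingredient of the approach above, using average silhouette as the clustering-strength measure when deciding how to split a node, can be sketched briefly. This is a hedged illustration: the node-splitting clusterer (k-means here) and the candidate range are assumptions, and the full tree-building and collapsing logic of HOPACH is not reproduced.

```python
# Hedged sketch: score candidate splits of a node by average silhouette
# and keep the number of children that maximizes it.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

def best_split(X, max_children=5):
    scores = {}
    for k in range(2, max_children + 1):
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
        scores[k] = silhouette_score(X, labels)   # average silhouette of this split
    return max(scores, key=scores.get), scores

# Toy node data: two well-separated Gaussian clusters
rng = np.random.default_rng(4)
X = np.vstack([rng.normal(m, 0.2, (40, 2)) for m in (0, 5)])
k_star, scores = best_split(X)
```

In the full method, the same criterion also governs whether two adjacent clusters should be collapsed back into one.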
1986
Information & Decision Sciences Department, College of Business Administration, University of Illinois at Chicago, Box 4348, Chicago, IL 60680. 3/30/86