Bagging for path-based clustering

2003

Abstract

A resampling scheme for clustering, similar to bootstrap aggregation (bagging), is presented. Bagging is used to improve the quality of path-based clustering, a data clustering method that can extract elongated structures from data in a noise-robust way. Because the results of the agglomerative optimization method are sensitive to small fluctuations in the input data, a stochastic resampling method is developed to infer consensus clusters and thereby increase the reliability of the clustering solutions. A related reliability measure allows us to estimate the number of clusters, based on the stability of an optimized cluster solution under resampling. The quality of path-based clustering with resampling is evaluated on a large image dataset of human segmentations.
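As a rough illustration of the two ingredients named in the abstract, the following Python sketch combines a path-based (minimax) dissimilarity with bootstrap resampling. It is a minimal sketch under stated assumptions, not the method of the paper: the function names, the Euclidean base distance, the 2-D points, and the threshold parameter are all hypothetical, and a simple threshold on the effective dissimilarity stands in for the agglomerative optimization. The co-association frequencies it returns are one common way to summarize how often two points are grouped together across resamples.

```python
import random
from itertools import combinations


def minimax_path_dissimilarity(points):
    """Effective dissimilarity between all pairs of 2-D points: the smallest
    possible 'largest hop' over all paths connecting them (Floyd-Warshall
    variant). The Euclidean base distance is an assumption."""
    n = len(points)
    d = [[((points[i][0] - points[j][0]) ** 2
           + (points[i][1] - points[j][1]) ** 2) ** 0.5
          for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(d[i][k], d[k][j])  # largest hop when routing through k
                if via_k < d[i][j]:
                    d[i][j] = via_k
    return d


def cluster_by_threshold(points, threshold):
    """Stand-in clustering (NOT the paper's agglomerative optimization):
    two points share a label if their minimax dissimilarity is <= threshold."""
    d = minimax_path_dissimilarity(points)
    labels = [-1] * len(points)
    next_label = 0
    for i in range(len(points)):
        if labels[i] == -1:
            for j in range(len(points)):
                if d[i][j] <= threshold:
                    labels[j] = next_label
            next_label += 1
    return labels


def co_association(points, threshold, n_resamples=50, seed=0):
    """Fraction of bootstrap resamples in which each pair of points lands in
    the same cluster, counted over the resamples that contain both points."""
    rng = random.Random(seed)
    n = len(points)
    together = [[0] * n for _ in range(n)]
    seen = [[0] * n for _ in range(n)]
    for _ in range(n_resamples):
        # Draw n indices with replacement; duplicates are collapsed.
        idx = sorted(set(rng.randrange(n) for _ in range(n)))
        labels = cluster_by_threshold([points[i] for i in idx], threshold)
        for a, b in combinations(range(len(idx)), 2):
            i, j = idx[a], idx[b]
            seen[i][j] += 1
            seen[j][i] += 1
            if labels[a] == labels[b]:
                together[i][j] += 1
                together[j][i] += 1
    return [[1.0 if i == j
             else (together[i][j] / seen[i][j] if seen[i][j] else 0.0)
             for j in range(n)] for i in range(n)]


# Two elongated, well-separated structures: horizontal lines at y=0 and y=10.
points = [(x, 0) for x in range(8)] + [(x, 10) for x in range(8)]
effective = minimax_path_dissimilarity(points)
consensus = co_association(points, threshold=3.0)
```

In this toy example the two ends of one line are far apart in Euclidean terms but close in the path-based sense, because they are linked by a chain of short hops. Pairs with high co-association frequency would be candidates for the same consensus cluster; how the paper derives its consensus clusters and its reliability measure is not specified in the abstract and is not modeled here.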