2006, Lecture Notes in Computer Science
Subspace clustering is a challenging task in the field of data mining. Traditional distance measures fail to differentiate the furthest point from the nearest point in very high dimensional data spaces. To tackle this problem, we design the minimal subspace distance, which measures the similarity between two points in the subspace where they are nearest to each other. It can discover subspace clusters implicitly while measuring the similarities between points. We use the new similarity measure to improve the traditional k-means algorithm for discovering clusters in subspaces. Clustering with a low-dimensional minimal subspace distance first detects the clusters in low-dimensional subspaces; gradually increasing the dimension of the minimal subspace distance then refines the clusters in higher dimensional subspaces. Our experiments on both synthetic and real data show the effectiveness of the proposed similarity measure and algorithm.
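A minimal sketch of the idea as the abstract describes it: the d-dimensional minimal subspace distance between two points is taken over the d coordinates on which the points are closest to each other. The function name and the choice of Euclidean norm here are illustrative assumptions; the paper's exact definition may differ.

```python
import numpy as np

def minimal_subspace_distance(x, y, d):
    """Distance between x and y restricted to the d dimensions on which
    they are closest (a sketch of the abstract's idea; the paper's exact
    formulation may differ)."""
    diffs = np.abs(x - y)
    nearest_dims = np.argsort(diffs)[:d]   # keep the d smallest gaps
    return np.linalg.norm(diffs[nearest_dims])

x = np.array([1.0, 5.0, 9.0, 2.0])
y = np.array([1.1, 4.9, 0.0, 7.0])
print(minimal_subspace_distance(x, y, d=2))  # uses the two closest dimensions
```

Because the measure ignores the dimensions where the points disagree most, two points sharing a low-dimensional cluster stay close even when they differ wildly elsewhere, which is what lets the modified k-means detect subspace clusters implicitly.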
Statistical Analysis and Data Mining, 2009
Traditional similarity measurements often become meaningless as the dimensionality of a dataset increases. Subspace clustering has been proposed to find clusters embedded in subspaces of high dimensional datasets.
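A small experiment illustrating why this happens, under the common assumption of uniform random data: as dimensionality grows, the gap between the nearest and furthest neighbor shrinks relative to the nearest distance (the "relative contrast"), so full-space distances stop discriminating.

```python
import numpy as np

rng = np.random.default_rng(0)
for dim in (2, 10, 100, 1000):
    X = rng.random((1000, dim))
    d = np.linalg.norm(X - X[0], axis=1)[1:]   # distances from one point
    print(dim, (d.max() - d.min()) / d.min())  # relative contrast shrinks
```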
INTERNATIONAL JOURNAL OF COMPUTERS & TECHNOLOGY, 2007
When clustering high dimensional data, traditional clustering methods are found to be lacking, since they consider all of the dimensions of the dataset in discovering clusters whereas only some of the dimensions are relevant. This may give rise to subspaces within the dataset where clusters can be found. Using feature selection, we can remove irrelevant and redundant dimensions by analyzing the entire dataset. The problem of automatically identifying clusters that exist in multiple, possibly overlapping subspaces of high dimensional data, allowing better clustering of the data points, is known as subspace clustering. There are two major approaches to subspace clustering based on search strategy. Top-down algorithms find an initial clustering in the full set of dimensions and evaluate the subspaces of each cluster, iteratively improving the results. Bottom-up approaches start by finding low dimensional dense regions and then use them to form clusters. Based on a survey on subspace ...
Information and Software Technology, 2004
The aim of this paper is to present a novel subspace clustering method named FINDIT. Clustering is the process of finding interesting patterns residing in a dataset by grouping similar data objects and separating them from dissimilar ones based on their dimensional values. Subspace clustering is a newer form of clustering which achieves the clustering goal in high dimensions by allowing each cluster to form over its own set of correlated dimensions. In subspace clustering, selecting the correct dimensions is very important, because the distance between points changes readily with the selected dimensions. However, selecting dimensions correctly is difficult, because data grouping and dimension selection must be performed simultaneously. FINDIT determines the correlated dimensions for each cluster based on two key ideas: a dimension-oriented distance measure which fully utilizes dimensional difference information, and a dimension voting policy which determines important dimensions in a probabilistic way based on V nearest neighbors' information. Through various experiments on synthetic data, FINDIT is shown to be very successful on the high dimensional clustering problem. FINDIT satisfies most requirements for good clustering methods, such as accuracy of results, robustness to noise and cluster density, and scalability in dataset size and dimensionality. Moreover, it scales gracefully to the full dimensionality without any modification to the algorithm.
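A hedged sketch of the dimension-voting idea as the abstract describes it: each of a point's V nearest neighbors "votes" for the dimensions on which it lies within a small threshold of the point, and dimensions collecting enough votes are treated as important. The parameter names `epsilon` and `min_votes` are illustrative, not FINDIT's actual parameters.

```python
import numpy as np

def vote_dimensions(point, neighbors, epsilon, min_votes):
    """Sketch of FINDIT-style dimension voting: each neighbor votes for the
    dimensions where it lies within epsilon of the point; dimensions with at
    least min_votes are treated as correlated with the point's cluster."""
    votes = np.sum(np.abs(neighbors - point) <= epsilon, axis=0)
    return np.where(votes >= min_votes)[0]

point = np.array([0.0, 10.0, 5.0])
neighbors = np.array([[0.10, 3.0, 5.2],
                      [-0.20, 7.5, 4.9],
                      [0.05, -1.0, 5.1]])
print(vote_dimensions(point, neighbors, epsilon=0.5, min_votes=2))  # -> [0 2]
```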
Behavior Research Methods, 2013
To achieve an insightful clustering of multivariate data, we propose subspace K-means. Its central idea is to model the centroids and cluster residuals in reduced spaces, which allows for dealing with a wide range of cluster types and yields rich interpretations of the clusters. We review the existing related clustering methods, including deterministic, stochastic, and unsupervised learning approaches. To evaluate subspace K-means, we performed a comparative simulation study in which we manipulated the overlap of subspaces, the between-cluster variance, and the error variance. The study shows that the subspace K-means algorithm is sensitive to local minima, but that the problem can be reasonably dealt with by using the partitions of various clustering procedures as starting points for the algorithm. Subspace K-means performs very well in recovering the true clustering across all conditions considered and appears to be superior to its competitor methods: K-means, reduced K-means, factorial K-means, mixtures of factor analyzers (MFA), and MCLUST. The best competitor method, MFA, showed a performance similar to that of subspace K-means in easy conditions but deteriorated in more difficult ones. Using data from a study on parental behavior, we show that a subspace K-means analysis provides rich insight into the cluster characteristics, in terms of both the relative positions of the clusters (via the centroids) and the shape of the clusters (via the within-cluster residuals).
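A simplified sketch of the central idea only: alternate k-means assignments with modelling each cluster's within-cluster residuals in a low-dimensional subspace. The SVD-based subspace fit below is an assumption for illustration; the authors' actual model and estimation procedure are more elaborate.

```python
import numpy as np

def subspace_kmeans_sketch(X, k, q, n_iter=20, seed=0):
    """Illustrative sketch: k-means assignments plus a q-dimensional
    residual subspace per cluster, fitted by SVD. Not the authors' exact
    algorithm; it conveys the centroid-plus-residual-subspace idea."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)].copy()
    subspaces = [None] * k
    for _ in range(n_iter):
        # assign each point to its nearest centroid
        labels = np.argmin(((X[:, None] - centroids) ** 2).sum(-1), axis=1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts) == 0:
                continue
            centroids[j] = pts.mean(axis=0)
            # within-cluster shape: top-q right singular vectors of residuals
            _, _, vt = np.linalg.svd(pts - centroids[j], full_matrices=False)
            subspaces[j] = vt[:q]
    return labels, centroids, subspaces
```

The returned `subspaces` give the "shape of the clusters" interpretation the abstract mentions, while `centroids` give their relative positions.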
ACM SIGKDD Explorations Newsletter, 2004
Subspace clustering is an extension of traditional clustering that seeks to find clusters in different subspaces within a dataset. Often in high dimensional data, many dimensions are irrelevant and can mask existing clusters in noisy data. Feature selection removes irrelevant and redundant dimensions by analyzing the entire dataset. Subspace clustering algorithms localize the search for relevant dimensions allowing them to find clusters that exist in multiple, possibly overlapping subspaces. There are two major ...
2007
Traditional similarity or distance measurements usually become meaningless as the dimensionality of a dataset increases, which has detrimental effects on clustering performance. In this paper, we propose a distance-based subspace clustering model, called nCluster, to find groups of objects that have similar values on subsets of dimensions. Instead of using a grid-based approach to partition the data space into non-overlapping rectangular cells, as in the density-based subspace clustering algorithms, the nCluster model uses a more flexible method to partition the dimensions so as to preserve meaningful and significant clusters. We develop an efficient algorithm to mine only maximal nClusters. A set of experiments is conducted to show the efficiency of the proposed algorithm and the effectiveness of the new model in preserving significant clusters.
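A sketch of the membership criterion the abstract describes, under a simplifying assumption: a group of objects qualifies on a set of dimensions if, on every such dimension, their values fall within a small window `delta`. The absolute window used here is an illustrative stand-in; the paper's window definition is more refined.

```python
import numpy as np

def is_ncluster(X, rows, dims, delta):
    """Sketch of an nCluster-style check: the objects in `rows` agree to
    within delta on every dimension in `dims` (illustrative threshold)."""
    sub = X[np.ix_(rows, dims)]
    return bool(np.all(sub.max(axis=0) - sub.min(axis=0) <= delta))

X = np.array([[1.0, 8.0, 3.0],
              [1.1, 2.0, 3.2],
              [0.9, 5.0, 2.9]])
print(is_ncluster(X, rows=[0, 1, 2], dims=[0, 2], delta=0.5))  # True
print(is_ncluster(X, rows=[0, 1, 2], dims=[0, 1], delta=0.5))  # False
```

Unlike a fixed grid, the window can sit anywhere on each dimension, which is the flexibility the model trades against the simpler cell-counting of grid-based methods.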
2019
Clustering high dimensional data is a promising research area in the current scenario. It has become a crucial task to cluster multi-dimensional datasets, as data objects are widely dispersed in multi-dimensional space. Most conventional clustering algorithms work on all dimensions of the feature space when calculating clusters, whereas only a few attributes are relevant, so their performance is not very precise. A modified subspace clustering is proposed in this research paper which does not use all attributes of the high-dimensional feature space simultaneously; rather, it determines a subspace of attributes that is important for each individual cluster. This subspace of attributes may be the same or different for different clusters. The comparison between the conventional K-means and the modified subspace K-means clustering algorithms was done based on variou...
2015
A cluster is a collection of data objects that are similar to one another within the same cluster and dissimilar to the objects in other clusters. Subspace clustering is an enhanced form of traditional clustering used for identifying clusters in high dimensional datasets. There are two major subspace clustering approaches: the top-down approach, which uses sampling techniques that randomly pick sample data points to identify the subspaces and then assigns all the data points to form the original clusters, and the bottom-up approach, where dense regions in low dimensional spaces are found and then combined to form clusters. The paper discusses details of the top-down algorithm PROCLUS, which is applied to customer segmentation, trend analysis, classification, etc. and requires a disjoint partition of the dataset, and of CLIQUE, which is used to identify overlapping clusters. The paper highlights the important steps of both algorithms with flowcharts, and an experimental study ha...
2004
Clustering suffers from the curse of dimensionality, and similarity functions that use all input features with equal relevance may not be effective. We introduce an algorithm that discovers clusters in subspaces spanned by different combinations of dimensions via local weightings of features. This approach avoids the risk of loss of information encountered in global dimensionality reduction techniques, and does not assume any data distribution model. Our method associates to each cluster a weight vector whose values capture the relevance of features within the corresponding cluster. We experimentally demonstrate the gain in performance our method achieves, using both synthetic and real data sets. In particular, our results show the feasibility of the proposed technique for performing simultaneous clustering of genes and conditions in microarray data.
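A hedged sketch of the per-cluster weight vector idea: dimensions along which a cluster is tight get large weights, dispersed ones get small weights, and distances to that cluster's centroid are computed under its own weights. The exponential weighting and the smoothing parameter `h` are illustrative assumptions, not the paper's exact update rule.

```python
import numpy as np

def feature_weights(cluster_points, centroid, h=1.0):
    """Sketch of local feature weighting: weights fall off with the
    within-cluster dispersion of each dimension (h is an illustrative
    smoothing parameter controlling how sharply weights concentrate)."""
    dispersion = ((cluster_points - centroid) ** 2).mean(axis=0)
    w = np.exp(-dispersion / h)
    return w / w.sum()  # normalize this cluster's weight vector

def weighted_distance(x, centroid, w):
    """Distance to a centroid under that cluster's feature weights."""
    return np.sqrt(np.sum(w * (x - centroid) ** 2))
```

Because each cluster keeps its own weight vector, no global projection is applied and no information is discarded for the other clusters, which is the contrast with global dimensionality reduction the abstract draws.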
International Journal of Advanced Computer Science and Applications, 2010
Biclustering, or subspace clustering, is a data mining technique that allows simultaneous clustering of the rows and columns of a matrix. Though the definition of similarity varies from one biclustering model to another, in most of these models the concept of similarity is based on metrics such as the Manhattan distance, the Euclidean distance, or other Lp distances. In other words, similar objects must have close values in at least a set of dimensions. Pattern-based clustering is important in many applications, such as DNA microarray data analysis, automatic recommendation systems, and target marketing systems. However, pattern-based clustering in large databases is challenging. On the one hand, there can be a huge number of clusters, many of them redundant, which makes pattern-based clustering ineffective. On the other hand, previously proposed methods may not be efficient or scalable when mining large databases. The objective of this paper is to perform a comparative study of subspace clustering algorithms in terms of efficiency, accuracy, and time complexity.
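For concreteness, a sketch of one well-known pattern-based model, the pCluster (brought in here as an example; the abstract does not name a specific model): a submatrix qualifies if the pScore of every 2x2 combination of objects and dimensions is at most `delta`, meaning all objects rise and fall coherently across the dimensions. The brute-force check below is illustrative, not a mining algorithm.

```python
import numpy as np

def is_pcluster(X, rows, dims, delta):
    """Brute-force pCluster check: for every object pair (i, j) and
    dimension pair (a, b), the pScore
    |(X[i,a] - X[j,a]) - (X[i,b] - X[j,b])| must be at most delta."""
    sub = X[np.ix_(rows, dims)]
    diff = sub[:, None, :] - sub[None, :, :]                   # object pairs
    score = np.abs(diff[:, :, :, None] - diff[:, :, None, :])  # dim pairs
    return bool(score.max() <= delta)
```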
2009
Many real-world data sets have a very high dimensional feature space. Most clustering techniques use the distance or similarity between objects as a measure to build clusters. But in high dimensional spaces, distances between points become relatively uniform, and in such cases density-based approaches may give better results. Subspace clustering algorithms automatically identify lower dimensional subspaces of the higher dimensional feature space in which clusters exist. In this paper, we propose a new clustering algorithm, ISC (Intelligent Subspace Clustering), which tries to overcome three major limitations of the existing state-of-the-art techniques. ISC determines input parameters such as the ε-distance at the various levels of subspace clustering, which helps in finding meaningful clusters; a uniform-parameter approach is not suitable for different kinds of databases. ISC implements dynamic and adaptive determination of meaningful clustering parameters based on hierarchical filteri...
Data mining has emerged as a powerful tool to extract knowledge from huge databases. Researchers have introduced several machine learning algorithms to explore databases and discover information, hidden patterns, and rules that were not known at data recording time. Due to remarkable developments in storage capacity, processing power, and algorithmic tools, practitioners are developing new and improved algorithms and techniques in several areas of data mining to discover rules and relationships among attributes in simple and complex higher dimensional databases. Furthermore, data mining is applied in a large variety of areas, ranging from banking to marketing, engineering to bioinformatics, and investment to risk analysis and fraud detection. Practitioners are analyzing and implementing artificial neural network techniques for classification and regression problems because of their accuracy and efficiency. The aim of this short ...
2005
Subspace clustering has been investigated extensively, since traditional clustering algorithms often fail to detect meaningful clusters in high-dimensional data spaces. Many recently proposed subspace clustering methods suffer from two severe problems. First, the algorithms typically scale exponentially with the data dimensionality and/or the subspace dimensionality of the clusters. Second, for performance reasons, many algorithms use a global density threshold for clustering, which is quite questionable, since clusters in subspaces of significantly different dimensionality will most likely exhibit significantly varying densities. In this paper, we propose a generic framework to overcome these limitations. Our framework is based on an efficient filter-refinement architecture that scales at most quadratically w.r.t. the data dimensionality and the dimensionality of the subspace clusters. It can be applied to any clustering notion, including notions based on a local density threshold. A broad experimental evaluation on synthetic and real-world data empirically shows that our method achieves a significant gain in runtime and quality compared to state-of-the-art subspace clustering algorithms.
2004
Clustering techniques often define the similarity between instances using distance measures over the various dimensions of the data [12, 14]. Subspace clustering is an extension of traditional clustering that seeks to find clusters in different subspaces within a dataset. Traditional clustering algorithms consider all of the dimensions of an input dataset in an attempt to learn as much as possible about each instance described. In high dimensional data, however, many of the dimensions are often irrelevant.
International Journal of Computer Applications, 2013
Finding clusters in high dimensional data is a challenging task, as such data comprises hundreds of attributes. Subspace clustering is an evolving methodology which, instead of finding clusters in the entire feature space, aims at finding clusters in various overlapping or non-overlapping subspaces of the high dimensional dataset. Density-based subspace clustering algorithms treat clusters as regions that are dense compared to noise or border regions. Many significant density-based subspace clustering algorithms exist in the literature, each characterized by different properties arising from different assumptions, input parameters, or techniques. Hence it is quite infeasible for future developers to compare all these algorithms on one common scale. In this paper, we present a review of various density-based subspace clustering algorithms together with a comparative chart focusing on their distinguishing characteristics, such as overlapping / non-overlapping, axis-parallel / arbitrarily oriented, and so on.
2009 Ninth IEEE International Conference on Data Mining, 2009
Subspace clustering aims at detecting clusters in any subspace projection of a high dimensional space. As the number of possible subspace projections is exponential in the number of dimensions, the result is often tremendously large. Recent approaches fail to reduce their results to the relevant subspace clusters; their results are typically highly redundant, i.e., many clusters are detected multiple times in several projections.
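A hedged sketch of one natural redundancy criterion in the spirit of this abstract: drop a subspace cluster whose object set and dimension set are both contained in those of another reported cluster. The containment rule and the `(objects, dims)` representation are illustrative assumptions, not the paper's actual definition.

```python
def remove_redundant(clusters):
    """Sketch of redundancy elimination: keep a subspace cluster only if no
    other cluster properly covers both its objects and its dimensions.
    `clusters` is a list of (objects, dims) pairs of frozensets."""
    keep = []
    for objs, dims in clusters:
        covered = any(objs <= o2 and dims <= d2 and (objs, dims) != (o2, d2)
                      for o2, d2 in clusters)
        if not covered:
            keep.append((objs, dims))
    return keep

clusters = [(frozenset({1, 2, 3}), frozenset({0, 1})),
            (frozenset({1, 2, 3}), frozenset({0})),   # redundant projection
            (frozenset({4, 5}), frozenset({2, 3}))]
print(remove_redundant(clusters))  # keeps the first and third clusters
```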
International Journal of Data Mining & Knowledge Management Process, 2013
Subspace clustering discovers the clusters embedded in multiple, overlapping subspaces of high dimensional data. Many significant subspace clustering algorithms exist, each having different characteristics arising from the techniques, assumptions, and heuristics used. A comprehensive classification scheme that considers all such characteristics is essential for dividing subspace clustering approaches into families; the algorithms belonging to the same family will share common characteristics. Such a categorization will help future developers better understand the quality criteria to use and the similar algorithms against which to compare their proposed clustering algorithms. In this paper, we first propose the concept of the SCAF (Subspace Clustering Algorithms' Family). The characteristics of a SCAF are based on classes such as cluster orientation, overlap of dimensions, etc. As an illustration, we further provide a comprehensive, systematic description and comparison of a few significant algorithms belonging to the "axis-parallel, overlapping, density-based" SCAF.
Pattern Recognition, 2012
This paper proposes a new method to weight subspaces in feature groups and individual features for clustering high-dimensional data. In this method, the features of high-dimensional data are divided into feature groups based on their natural characteristics. Two types of weights are introduced into the clustering process to simultaneously identify the importance of feature groups and individual features in each cluster. A new optimization model is given to define the optimization process, and a new clustering algorithm, FG-k-means, is proposed to solve the model. The new algorithm is an extension of k-means that adds two steps to automatically calculate the two types of subspace weights. A new data generation method is presented to generate high-dimensional data with clusters in subspaces of both feature groups and individual features. Experimental results on synthetic and real-life data show that the FG-k-means algorithm significantly outperformed four k-means type algorithms, i.e., k-means, W-k-means, LAC, and EWKM, in almost all experiments. The new algorithm is robust to noise and missing values, which commonly occur in high-dimensional data.
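A hedged sketch of the two weight types the abstract describes: one weight per feature group and one per feature inside each group, both driven by within-cluster dispersion. The exponential weighting and the parameters `lam` and `eta` are illustrative assumptions; the paper derives its exact update rules from its optimization model.

```python
import numpy as np

def two_level_weights(cluster_points, centroid, groups, lam=1.0, eta=1.0):
    """Sketch of FG-k-means-style weights for one cluster: weights across
    feature groups and weights of features within each group, both larger
    where within-cluster dispersion is smaller. `groups` is a list of
    index lists; lam and eta are illustrative smoothing parameters."""
    disp = ((cluster_points - centroid) ** 2).sum(axis=0)
    feature_w = np.zeros_like(disp)
    group_disp = np.empty(len(groups))
    for g, dims in enumerate(groups):
        w = np.exp(-disp[dims] / eta)
        feature_w[dims] = w / w.sum()     # weights of features within group g
        group_disp[g] = disp[dims].sum()
    gw = np.exp(-group_disp / lam)
    return gw / gw.sum(), feature_w       # group weights, feature weights
```

The product of a feature's group weight and its within-group weight then plays the role of its overall relevance in that cluster's distance computation.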
Neurocomputing
Data often lies in subspaces embedded within a high-dimensional space. Subspace clustering seeks to group data according to the dimensions relevant to each subspace. This requires the estimation of subspaces as well as the clustering of data. Subspace clustering becomes increasingly challenging in high dimensional spaces due to the curse of dimensionality, which hampers reliable estimation of distances and density. Recently, another aspect of high-dimensional spaces has been observed, known as the hubness phenomenon, whereby a few data points appear frequently as nearest neighbors of the rest of the data. The distribution of neighbor occurrences becomes skewed with increasing intrinsic dimensionality of the data, and the few points with high neighbor occurrence counts emerge as hubs. Hubs exhibit useful geometric properties and have been leveraged for clustering data in the full-dimensional space. In this paper, we study hubs in the context of subspace clustering. We present new characterizations of hubs in relation to subspaces, and design graph-based meta-features to identify a subset of hubs that are well suited to serve as seeds for the discovery of local latent subspaces and clusters. We propose and evaluate a hubness-driven algorithm to find subspace clusters, and show that our approach is superior to the baselines and competitive against state-of-the-art subspace clustering methods. We also identify the data characteristics that make hubs suitable for subspace clustering; such characterization gives valuable guidelines to data mining practitioners.
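The neighbor occurrence counts underlying the hubness phenomenon are straightforward to compute; the brute-force version below shows the quantity the abstract refers to (the paper's own pipeline builds further meta-features on top of it).

```python
import numpy as np

def hubness_scores(X, k):
    """k-occurrence counts: how often each point appears among the k nearest
    neighbors of the other points. Points with unusually high counts are
    hubs; their distribution grows skewed with intrinsic dimensionality."""
    d = np.linalg.norm(X[:, None] - X[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)           # a point is not its own neighbor
    knn = np.argsort(d, axis=1)[:, :k]    # k nearest neighbors of each point
    return np.bincount(knn.ravel(), minlength=len(X))
```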
2010
Clustering algorithms break down when data points fall in very high-dimensional spaces. To tackle this problem, many subspace clustering methods have been proposed to build a subspace in which the data points cluster well. The bottom-up approach is widely used: select a set of candidate features, then use a portion of this set to build up the hidden subspace step by step; the complexity depends exponentially or cubically on the number of selected features. In this paper, we present SEGCLU, a SEGregation-based subspace CLUstering method which significantly reduces the size of the candidate feature set and has cubic complexity. The algorithm was applied to noise-free DNA copy-number data from two groups of autistic and typically developing children to extract a potential biomarker for autism; 85% of the individuals were classified correctly in a 13-dimensional subspace.