ACONS: A New Algorithm for Clustering Documents

Alonso, Andrés Gago; Suárez, Airel Pérez; Pagola, José E. Medina

Andrés Alonso

Lecture Notes in Computer Science

Abstract

In this paper we present a new algorithm for document clustering called Condensed Star (ACONS). This algorithm is a natural evolution of the Star algorithm proposed by Aslam et al. and later improved by them and other researchers. In this method, we introduce a new concept of star that allows a different star-shaped form; in this way we retain the strengths of previous algorithms while addressing their shortcomings. Evaluation experiments on standard document collections show that the proposed algorithm outperforms previously defined methods and obtains a smaller number of clusters. Since the ACONS algorithm is relatively simple to implement and is also efficient, we advocate its use for tasks that require clustering, such as information organization, browsing, topic tracking, and new topic detection.

José Medina-Pagola

2008 IEEE International Conference on Data Mining Workshops, 2008

In this paper a new algorithm for document clustering, called CStar, is presented. This algorithm improves on recently developed algorithms such as Generalized Star (GStar) and ACONS, which were originally proposed to reduce some drawbacks of previous Star-like algorithms. The CStar algorithm uses the Condensed Star-shaped Subgraph concept defined by ACONS, but defines a new heuristic that makes it possible to construct a new cover of the thresholded similarity graph and to reduce the drawbacks of the GStar and ACONS algorithms. Experiments on standard document collections show that our proposal outperforms previously defined algorithms and other related algorithms used for document clustering.
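
Both abstracts refer to covering a thresholded similarity graph with star-shaped subgraphs. As a rough illustration only, the sketch below builds such a graph from cosine similarities and applies a greedy cover in the spirit of the original Star algorithm as it is commonly described (highest-degree uncovered vertex becomes a star center). It is not the condensed-star definition of ACONS nor the CStar heuristic, neither of which is specified on this page. The function names `cosine`, `thresholded_graph`, and `star_cover`, the threshold parameter `beta`, and the dict-based document representation are assumptions made for this example.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = sqrt(sum(w * w for w in a.values()))
    nb = sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def thresholded_graph(docs, beta):
    """Adjacency sets: documents i and j are linked when similarity >= beta."""
    n = len(docs)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if cosine(docs[i], docs[j]) >= beta:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def star_cover(adj):
    """Greedy Star-style cover (baseline sketch, not ACONS or CStar).

    Vertices are visited by decreasing degree (ties broken by index).
    Each still-uncovered vertex becomes a center, and its cluster is the
    center plus all of its neighbours, so clusters may overlap.
    """
    covered = set()
    clusters = {}
    for v in sorted(adj, key=lambda v: (-len(adj[v]), v)):
        if v not in covered:
            clusters[v] = {v} | adj[v]
            covered |= clusters[v]
    return clusters
```

With documents given as term-weight dictionaries, `star_cover(thresholded_graph(docs, beta))` returns a mapping from each star center to the set of document indices in its cluster. A higher `beta` yields a sparser graph and therefore more, smaller clusters.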
