BIRCH: an efficient data clustering method for very large databases

Raghu Ramakrishnan

BIRCH: an efficient data clustering method for very large databases

Raghu Ramakrishnan

1996

Sign up for access to the world's latest research

checkGet notified about relevant papers

checkSave papers to use in your research

checkJoin the discussion with peers

checkTrack your impact

Abstract

Finding useful patterns in large datasets has attracted considerable interest recently, and one of the most widely st,udied problems in this area is the identification of clusters, or deusel y populated regions, in a multi-dir nensi onal clataset. Prior work does not adequately address the problem of large datasets and minimization of 1/0 costs.

Raghu Ramakrishnan

Data Mining and Knowledge Discovery, 1997

Data clustering is an important technique for exploratory data analysis, and has been studied for several years. It has been shown to be useful in many practical domains such as data classification and image processing. Recently, there has been a growing emphasis on exploratory analysis of very large datasets to discover useful patterns and/or correlations among attributes. This is called data mining, and data clustering is regarded as a particular branch. However existing data clustering methods do not adequately address the problem of processing large datasets with a limited amount of resources (e.g., memory and cpu cycles). So as the dataset size increases, they do not scale up well in terms of memory requirement, running time, and result quality.

Log In

BIRCH: an efficient data clustering method for very large databases

Sign up for access to the world's latest research

Abstract

Related papers

Related topics