2011, Procedia Computer Science
With the development of many-core GPUs offering very large memory bandwidth and computational power, there has been substantial interest in the scientific and engineering computing community in speeding up CPU-intensive tasks on graphics processing units (GPUs). Cluster analysis is a widely used technique for grouping a set of objects into classes of "similar" objects, commonly applied in fields such as data mining, bioinformatics and pattern recognition. WaveCluster defines a cluster as a dense region consisting of connected components in the transformed feature space. In this study, we present a GPU-level parallelization of the WaveCluster algorithm, a novel clustering approach based on the wavelet transform, and investigate its parallel performance on very large spatial datasets. CUDA implementations of the two main sub-algorithms of the WaveCluster approach, namely extraction of the low-frequency component from the signal using the wavelet transform and connected component labeling, are presented, and the corresponding performance evaluations are reported for each sub-algorithm. A divide-and-conquer approach is followed in the implementation of the wavelet transform, and a multi-pass sliding-window approach in the implementation of connected component labeling. The maximum kernel speedup achieved is 107x for the extraction of the low-frequency component and 6x for connected component labeling, relative to the sequential algorithms running on the CPU.
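As an illustration of the first sub-algorithm named above, the following is a minimal CUDA sketch of extracting a low-frequency (approximation) component from a quantized 2D feature grid with a one-level Haar-style low-pass step. The paper's actual wavelet choice, divide-and-conquer decomposition and grid layout are not reproduced here; all names and sizes are illustrative assumptions.

```cuda
// Minimal sketch: one-level low-pass extraction on a 2D feature grid.
// Each output cell is the scaled average (Haar-style approximation) of a
// 2x2 block of input cells. Sizes and names are illustrative assumptions.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void haarLowPass2D(const float* in, float* out, int inW, int inH)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;   // output column
    int y = blockIdx.y * blockDim.y + threadIdx.y;   // output row
    int outW = inW / 2, outH = inH / 2;
    if (x >= outW || y >= outH) return;

    int ix = 2 * x, iy = 2 * y;
    float s = in[iy * inW + ix]       + in[iy * inW + ix + 1] +
              in[(iy + 1) * inW + ix] + in[(iy + 1) * inW + ix + 1];
    out[y * outW + x] = 0.25f * s;                   // approximation coefficient
}

int main()
{
    const int W = 1024, H = 1024;                    // assumed grid size
    size_t inBytes  = (size_t)W * H * sizeof(float);
    size_t outBytes = (size_t)(W / 2) * (H / 2) * sizeof(float);

    float* hIn = (float*)malloc(inBytes);
    for (int i = 0; i < W * H; ++i) hIn[i] = (float)(i % 7);   // toy cell densities

    float *dIn, *dOut;
    cudaMalloc(&dIn, inBytes);
    cudaMalloc(&dOut, outBytes);
    cudaMemcpy(dIn, hIn, inBytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((W / 2 + block.x - 1) / block.x, (H / 2 + block.y - 1) / block.y);
    haarLowPass2D<<<grid, block>>>(dIn, dOut, W, H);
    cudaDeviceSynchronize();

    float sample;
    cudaMemcpy(&sample, dOut, sizeof(float), cudaMemcpyDeviceToHost);
    printf("out[0] = %f\n", sample);

    cudaFree(dIn); cudaFree(dOut); free(hIn);
    return 0;
}
```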
Journal of Parallel and Distributed Computing, 2011
A linear-scaling parallel clustering algorithm implementation and its application to very large datasets for cluster analysis is reported. WaveCluster is a novel clustering approach based on wavelet transforms. Although this approach can detect clusters of arbitrary shape efficiently, it requires a considerable amount of time to produce results for large multi-dimensional datasets. We propose a parallel implementation of the WaveCluster algorithm based on the message-passing model for a distributed-memory multiprocessor system. In the proposed method, communication among processors and memory requirements are kept to a minimum to achieve high efficiency. We conducted experiments on a dense dataset and a sparse dataset to characterize the algorithm's behavior. Our experimental results demonstrate that the developed parallel WaveCluster algorithm achieves high speedup and scales linearly with an increasing number of processors.
IJSRD, 2013
In today's digital world, data sets are growing exponentially. Statistical analysis using clustering in various scientific and engineering applications has become a very challenging issue for such large data sets. Clustering of huge data sets and its performance are two major factors that demand optimization. Parallelization is a well-known approach to optimizing performance, and recent research has shown that GPU-based parallelization helps achieve a high degree of performance. Hence, this thesis focuses on optimizing hierarchical clustering algorithms through parallelization. It covers the implementation of the optimized algorithm in various parallel environments, using OpenMP on multi-core architectures and CUDA on many-core architectures.
… Conference on Parallel …, 2008
Graphics Processing Units (GPUs) have recently been the subject of attention in research as efficient coprocessors for implementing many classes of highly parallel applications. The GPU's design is engineered for graphics applications, where many independent SIMD workloads are simultaneously dispatched to processing elements. While parallelism has been explored in the context of traditional CPU threads and SIMD processing elements, the principles involved in dividing the steps of a parallel algorithm for execution on GPU architectures remain a significant challenge. In this paper, we introduce a first step towards building an efficient GPU-based parallel implementation of a commonly used clustering algorithm called K-Means on an NVIDIA G80 PCI Express graphics board using the CUDA processing extensions. Clustering algorithms are important for search, data mining, spam and intrusion detection applications. Modern desktop machines commonly include desktop search software that can be greatly enhanced by these advances, while low-power machines such as laptops can reduce power consumption by utilizing the video chip for these clustering and indexing operations. Our preliminary results show over a 13x performance improvement compared to a baseline 3 GHz Intel Pentium(R) based PC running the same algorithm with an average-spec G80 graphics card, the NVIDIA 8600GT. The low cost of these video cards (less than $100 market price as of 2008) and the high performance gains suggest that our approach is both practical and economical for common applications.
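For illustration, here is a minimal CUDA sketch of the most data-parallel step of K-Means, assigning each point to its nearest centroid with one thread per point. The array names, problem sizes and the omission of the centroid-update step are assumptions, not the paper's implementation.

```cuda
// Minimal sketch of the k-means assignment step: one thread per data point
// computes the nearest centroid by squared Euclidean distance. The centroid
// update is assumed to happen elsewhere (e.g. on the host) between iterations.
#include <cfloat>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void assignClusters(const float* points,    // n x d, row-major
                               const float* centroids, // k x d, row-major
                               int* labels, int n, int d, int k)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    int best = 0;
    float bestDist = FLT_MAX;
    for (int c = 0; c < k; ++c) {
        float dist = 0.0f;
        for (int j = 0; j < d; ++j) {
            float diff = points[i * d + j] - centroids[c * d + j];
            dist += diff * diff;                 // squared Euclidean distance
        }
        if (dist < bestDist) { bestDist = dist; best = c; }
    }
    labels[i] = best;
}

int main()
{
    const int n = 100000, d = 8, k = 16;             // assumed problem sizes
    size_t pBytes = (size_t)n * d * sizeof(float);
    size_t cBytes = (size_t)k * d * sizeof(float);

    float* hPoints = (float*)malloc(pBytes);
    float* hCentroids = (float*)malloc(cBytes);
    for (int i = 0; i < n * d; ++i) hPoints[i] = (float)(i % 97);
    for (int i = 0; i < k * d; ++i) hCentroids[i] = (float)(i % 11);

    float *dPoints, *dCentroids;
    int* dLabels;
    cudaMalloc(&dPoints, pBytes);
    cudaMalloc(&dCentroids, cBytes);
    cudaMalloc(&dLabels, n * sizeof(int));
    cudaMemcpy(dPoints, hPoints, pBytes, cudaMemcpyHostToDevice);
    cudaMemcpy(dCentroids, hCentroids, cBytes, cudaMemcpyHostToDevice);

    assignClusters<<<(n + 255) / 256, 256>>>(dPoints, dCentroids, dLabels, n, d, k);
    cudaDeviceSynchronize();

    int label0;
    cudaMemcpy(&label0, dLabels, sizeof(int), cudaMemcpyDeviceToHost);
    printf("point 0 assigned to cluster %d\n", label0);

    cudaFree(dPoints); cudaFree(dCentroids); cudaFree(dLabels);
    free(hPoints); free(hCentroids);
    return 0;
}
```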
Advances in Knowledge Discovery and Data …, 2010
Graphics Processing Units in today's desktops can well be thought of as high-performance parallel processors. Each single processor within the GPU is able to execute different tasks independently but concurrently. Such computational capabilities of the GPU are being exploited in the ...
2009
We explore the use of today's high-end graphics processing units on desktops to perform hierarchical agglomerative clustering with NVIDIA's Compute Unified Device Architecture (CUDA). Although the advancement in graphics cards has made the gaming industry flourish, there is a lot more to be gained in the fields of scientific computing and high performance computing and their applications. Previous works have illustrated considerable speed gains in computing pairwise Euclidean distances between vectors, which is the fundamental operation in hierarchical clustering. We have used CUDA to implement the complete hierarchical agglomerative clustering algorithm and show almost double the speed gain using a much cheaper desktop graphics card. In this paper we briefly explain the highly parallel and internally distributed programming structure of CUDA. We explore CUDA capabilities and propose methods to efficiently handle data within the graphics hardware for data-intense, data-independent, iterative or repetitive general-purpose algorithms such as hierarchical clustering. We achieved speed gains of about 30 to 65 times over the CPU implementation using microarray gene expressions.
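Since the abstract identifies pairwise Euclidean distance computation as the fundamental operation of hierarchical clustering, the following minimal CUDA sketch fills a full n x n distance matrix with one thread per (i, j) pair. The names and sizes are illustrative assumptions, not the authors' code.

```cuda
// Minimal sketch of a pairwise Euclidean distance matrix on the GPU:
// thread (i, j) computes the distance between points i and j.
#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__global__ void pairwiseDistances(const float* data, // n x d, row-major
                                  float* dist,       // n x n output
                                  int n, int d)
{
    int i = blockIdx.y * blockDim.y + threadIdx.y;
    int j = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n || j >= n) return;

    float sum = 0.0f;
    for (int k = 0; k < d; ++k) {
        float diff = data[i * d + k] - data[j * d + k];
        sum += diff * diff;
    }
    dist[i * n + j] = sqrtf(sum);
}

int main()
{
    const int n = 512, d = 32;                       // assumed sizes
    size_t dataBytes = (size_t)n * d * sizeof(float);
    size_t distBytes = (size_t)n * n * sizeof(float);

    float* hData = (float*)malloc(dataBytes);
    for (int i = 0; i < n * d; ++i) hData[i] = (float)(i % 13);

    float *dData, *dDist;
    cudaMalloc(&dData, dataBytes);
    cudaMalloc(&dDist, distBytes);
    cudaMemcpy(dData, hData, dataBytes, cudaMemcpyHostToDevice);

    dim3 block(16, 16);
    dim3 grid((n + block.x - 1) / block.x, (n + block.y - 1) / block.y);
    pairwiseDistances<<<grid, block>>>(dData, dDist, n, d);
    cudaDeviceSynchronize();

    float d01;
    cudaMemcpy(&d01, dDist + 1, sizeof(float), cudaMemcpyDeviceToHost);
    printf("dist(0,1) = %f\n", d01);

    cudaFree(dData); cudaFree(dDist); free(hData);
    return 0;
}
```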
The Journal of Supercomputing
Recent developments in Graphics Processing Units (GPUs) have enabled inexpensive high performance computing for general-purpose applications. The Compute Unified Device Architecture (CUDA) programming model provides programmers with adequate C-like APIs to better exploit the parallel power of the GPU. Data mining is widely used and has significant applications in various domains. However, current data mining toolkits cannot meet the requirements of applications with large-scale databases in terms of speed. In this paper, we propose three techniques to speed up fundamental problems in data mining algorithms on the CUDA platform: a scalable thread scheduling scheme for irregular patterns, a parallel distributed top-k scheme, and a parallel high-dimension reduction scheme. They play a key role in our CUDA-based implementations of three representative data mining algorithms: CU-Apriori, CU-KNN, and CU-K-means. These parallel implementations significantly outperform other state-of-the-art implementations on an HP xw8600 workstation with a Tesla C1060 GPU and a quad-core Intel Xeon CPU. Our results show that the GPU + CUDA parallel architecture is feasible and promising for data mining applications.
Proceedings of the combined workshops on UnConventional high performance computing workshop plus memory access workshop - UCHPC-MAW '09, 2009
In this paper, we report our research on using GPUs to accelerate clustering of very large data sets, which are common in today's real-world applications. While many published works have shown that GPUs can be used to accelerate various general-purpose applications with respectable performance gains, few attempts have been made to tackle very large problems. Our goal here is to investigate whether GPUs can be useful accelerators even for very large data sets that cannot fit into the GPU's onboard memory.
International Journal of Artificial Intelligence & Applications, 2013
We explore the capabilities of today's high-end graphics processing units (GPUs) on desktop computers to efficiently perform hierarchical agglomerative clustering (HAC) through partitioning of gene expressions. Our focus is to significantly reduce the time and memory bottlenecks of the traditional HAC algorithm by parallelization and acceleration of computations without compromising the accuracy of clusters. We use partially overlapping partitions (PoP) to parallelize the HAC algorithm using the hardware capabilities of the GPU with the Compute Unified Device Architecture (CUDA). We compare the computational performance of the GPU with the CPU, and our experiments show that the GPU is much faster. The traditional HAC and partitioning-based HAC are up to 66 times and 442 times faster on the GPU, respectively, than the time taken by a CPU for the traditional HAC computations. Moreover, the PoP HAC on the GPU requires only a fraction of the memory required by the traditional algorithm on the CPU. The novelties in our research include boosting computational speed while utilizing GPU global memory, identifying the minimum-distance pair in virtually a single pass, avoiding the necessity to maintain huge data in memory, and completing the entire HAC computation within the GPU.
Concurrency and Computation: Practice and Experience, 2019
Classification and clustering techniques are used in different applications. Large-scale big data applications such as social network analysis need to process large data chunks in a short time, and classification and clustering tasks in such applications consume a lot of processing time. Improving the performance of classification and clustering algorithms enhances the performance of the applications that use them. This paper introduces an approach for exploiting the graphics processing unit (GPU) platform to improve the performance of classification and clustering algorithms. The proposed approach uses two GPU implementations: a pure (GPU-only) implementation and a GPU-CPU hybrid implementation. The results show that the hybrid implementation, which optimizes subtask scheduling for both the CPU and the GPU processing elements, outperforms the approach that uses only the GPU.
2014 1st International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), 2014
The discrete wavelet transform (DWT) has diverse applications in signal and image processing. In this paper, we have implemented the lifting-based "Le Gall 5/3" algorithm on a low-cost NVIDIA GPU (graphics processing unit) with MATLAB to achieve a speedup in computation. The efficiency of our GPU-based implementation is measured and compared with CPU-based algorithms. Our experimental results on the GPU show a performance improvement by a factor of more than 1.52 compared with the CPU for an image of size 3072x3072.
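The abstract names the lifting-based Le Gall 5/3 transform; below is a minimal CUDA sketch of one decomposition level expressed as separate predict and update kernels, so that each step sees fully written neighbours. The float (rather than integer) arithmetic, clamped symmetric border and signal length are illustrative assumptions; the paper's MATLAB/GPU implementation details are not reproduced.

```cuda
// Minimal sketch of one level of the Le Gall 5/3 lifting scheme on the GPU.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Predict step: detail d[i] = odd sample minus the average of its even neighbours.
__global__ void liftPredict53(const float* x, float* detail, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n / 2) return;
    int left  = 2 * i;
    int right = (2 * i + 2 < n) ? 2 * i + 2 : n - 2;   // symmetric clamp at the edge
    detail[i] = x[2 * i + 1] - 0.5f * (x[left] + x[right]);
}

// Update step: approximation s[i] = even sample plus a quarter of neighbouring details.
__global__ void liftUpdate53(const float* x, const float* detail, float* approx, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n / 2) return;
    float dPrev = (i > 0) ? detail[i - 1] : detail[0]; // symmetric clamp at the edge
    approx[i] = x[2 * i] + 0.25f * (dPrev + detail[i]);
}

int main()
{
    const int n = 1 << 20;                              // assumed even signal length
    size_t full = n * sizeof(float), half = (n / 2) * sizeof(float);

    float* hx = (float*)malloc(full);
    for (int i = 0; i < n; ++i) hx[i] = (float)(i % 31);

    float *dx, *dDetail, *dApprox;
    cudaMalloc(&dx, full);
    cudaMalloc(&dDetail, half);
    cudaMalloc(&dApprox, half);
    cudaMemcpy(dx, hx, full, cudaMemcpyHostToDevice);

    int threads = 256, blocks = (n / 2 + threads - 1) / threads;
    liftPredict53<<<blocks, threads>>>(dx, dDetail, n);
    liftUpdate53<<<blocks, threads>>>(dx, dDetail, dApprox, n);
    cudaDeviceSynchronize();

    float a0;
    cudaMemcpy(&a0, dApprox, sizeof(float), cudaMemcpyDeviceToHost);
    printf("approx[0] = %f\n", a0);

    cudaFree(dx); cudaFree(dDetail); cudaFree(dApprox); free(hx);
    return 0;
}
```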
IEEE Access
This paper presents a new clustering algorithm, GPIC, a graphics processing unit (GPU) accelerated algorithm for power iteration clustering (PIC). Our algorithm is based on the original PIC proposal, adapted to take advantage of the GPU architecture while maintaining the algorithm's original properties. The proposed method was compared against the serial implementation, achieving a considerable speedup in tests with synthetic and real data sets. A significant volume of real application data (>10^7 records) was used, and we found that the GPIC implementation has good scalability for handling data sets with millions of data points. Our implementation efforts are directed towards two aspects: processing large data sets in less time and maintaining the same quality of cluster results as the original PIC version. Index terms: scalable machine learning algorithms, GPU, power iteration clustering.
Data Warehousing and Knowledge Discovery
We exploit the parallel architecture of the Graphics Processing Unit (GPU) used in desktops to efficiently implement the traditional K-means algorithm. Our clustering approach avoids the need to transfer data and cluster information between the GPU and CPU between iterations. In this paper we present the novelties in our approach and the techniques employed to represent data, compute distances, compute centroids and identify the cluster elements using the GPU. We measure performance using the metric of computational time per iteration. Our implementation of k-means clustering on an Nvidia 5900 graphics processor is 4 to 12 times faster than the CPU, and 7 to 22 times faster on the Nvidia 8500 graphics processor, for various data sizes. We also achieved 12 to 64 times speed gain on the 5900 and 20 to 140 times speed gain on the 8500 graphics processor in computational time per iteration for evaluations with various cluster sizes.
Procedia Computer Science, 2010
GPUs have recently attracted attention as accelerators for a wide variety of algorithms, including assorted examples within the image analysis field. Among them, wavelets are gaining popularity as solid tools for data mining and video compression, though this comes at the expense of a high computational cost. After proving the effectiveness of the GPU for accelerating the 2D Fast Wavelet Transform [1], we present in this paper a novel implementation on many-core GPUs and multicore CPUs for high-performance computation of the 3D Fast Wavelet Transform (3D-FWT). This algorithm poses a challenging access pattern on matrix operators demanding high sustainable bandwidth, as well as mathematical functions with remarkable arithmetic intensity on ALUs and FPUs. On the GPU side, we focus on CUDA programming to develop methods for an efficient mapping onto many-cores and to fully exploit the memory hierarchy, whose management is explicit to the programmer. On multicore CPUs, OpenMP and Pthreads are used as counterparts to maximize parallelism, and renowned techniques like tiling and blocking are exploited to optimize the use of memory. Experimental results on an Nvidia Tesla C870 GPU and an Intel Core 2 Quad Q6700 CPU indicate that our implementation runs three times faster on the Tesla, and up to fifteen times faster when communications are neglected, which enables the GPU to process video in real time in many applications where the 3D-FWT is involved.
Procedia Computer Science, 2016
Due to the recent increase in the volume of data being generated, organizing this data has become one of the biggest problems in computer science. Among the different strategies proposed to deal with this efficiently and effectively, we highlight those related to clustering, more specifically density-based clustering strategies such as DBSCAN and OPTICS, which stand out for their ability to define clusters of arbitrary shape and their robustness to data noise. However, these algorithms are still a computational challenge since they are distance-based proposals. In this work we present a new approach to make OPTICS feasible, based on a data indexing strategy. Despite the simplicity with which the data are indexed, using graphs, the structure allows us to explore various parallelization opportunities, which we exploit using a graphics processing unit (GPU). Based on this structure, the complexity of OPTICS is reduced to O(E log V) in the worst case, making it very fast. In our evaluation we show that our proposal can be over 200x faster than its sequential CPU version.
This paper illustrates the design and performance evaluation of several algorithms used for analyzing magnetic resonance imaging (MRI) datasets on massively parallel graphics processing units (GPUs) using the compute unified device architecture (CUDA) programming model, among them MRI brain image segmentation and feature extraction. Efficient segmentation of such huge medical data is very expensive. Feature extraction obtains statistical features from the MRI image for classification; based on the classification, abnormal and tumour slices are identified in the MRI datasets. A serial central processing unit (CPU) does not have enough computational power to process some types of complex algorithms on huge medical data, whereas the GPU can solve large data-parallel problems on such data much faster than the CPU. MRI image volumes are good candidates for GPU implementation, since parallelism is naturally provided by the proposed per-slice threading (PST) model. In this paper, simple algorithms such as K-means segmentation and feature extraction for MRI volumes were implemented, and the speedup gained in GPU computing was assessed. Our experiments show that the GPU-based implementation is 10x-30x faster than a conventional CPU processor under the PST model.
2018 IEEE Second International Conference on Data Stream Mining & Processing (DSMP), 2018
In this paper the authors analyze the effectiveness of parallel graphics processing unit (GPU) realizations of the discrete wavelet transform (DWT) using the lattice structure and the matrix-based approach. Experimental verification shows that, in general, for smaller input vector sizes combined with larger filter lengths, DWT computation based on the direct approach using direct matrix multiplication is significantly faster than the application of the lattice structure factorization, while for large vector sizes the lattice structure becomes more effective. The detailed results define performance boundaries for both implementations and determine the most advantageous situations in which one might use a given approach. The results also include a comparative analysis of the time efficiency of the presented methods for two different GPU architectures. The presented effectiveness characteristics of the considered realizations of the two selected DWT computation methods allow the proper choice of method in future applications depending on input data sizes, filter lengths and the underlying GPU architecture.
Image processing and pattern recognition algorithms take more time to execute on a single-core processor. The Graphics Processing Unit (GPU) is popular nowadays due to its speed, programmability, low cost and the large number of execution cores built into it, and many researchers have started using GPUs alongside single-core computer systems to speed up the execution of algorithms. In the field of content-based medical image retrieval (CBMIR), the Euclidean and Mahalanobis distances play an important role, because the distance formula is what matches images against each other. In this research work, we parallelized the Euclidean distance algorithm on CUDA. The CPU version ran on an Intel® Dual-Core E5500 @ 2.80 GHz with 2.0 GB of main memory under Windows XP (SP2). The next step was to convert this code to run on a GPU, an NVIDIA GeForce 9500GT with 1023 MB of DDR2 video memory and a 64-bit bus, using the 270.81 series NVIDIA graphics driver. Both the CPU and GPU versions of the algorithm were implemented in MATLAB R2010: the CPU version in plain MATLAB and the GPU version with the help of the intermediate software Jacket-win-1.3.0. To use Jacket, we had to make some changes to our source code so that the CPU and GPU work simultaneously, thus reducing the overall computation time. Our work employs extensive usage of the highly multithreaded architecture of the many-core GPU. Efficient use of shared memory is required to optimize parallel reduction in the Compute Unified Device Architecture (CUDA); GPUs are emerging as powerful parallel systems at a cheap cost of a few thousand rupees.
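The abstract notes that efficient use of shared memory is needed to optimize parallel reduction in CUDA. As a hedged illustration, the sketch below accumulates the squared Euclidean distance between two long feature vectors with a block-wide shared-memory tree reduction. The vector length, block size and single-block launch are assumptions; a multi-block version would add a second pass over per-block partial sums.

```cuda
// Minimal sketch: squared Euclidean distance via shared-memory parallel reduction.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define THREADS 256

__global__ void euclideanSquared(const float* a, const float* b, float* result, int d)
{
    __shared__ float partial[THREADS];

    // Each thread accumulates a strided slice of the squared differences.
    float sum = 0.0f;
    for (int i = threadIdx.x; i < d; i += blockDim.x) {
        float diff = a[i] - b[i];
        sum += diff * diff;
    }
    partial[threadIdx.x] = sum;
    __syncthreads();

    // Tree reduction in shared memory.
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride)
            partial[threadIdx.x] += partial[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0) *result = partial[0];      // squared distance
}

int main()
{
    const int d = 1 << 16;                           // assumed feature vector length
    size_t bytes = d * sizeof(float);
    float* h = (float*)malloc(bytes);
    for (int i = 0; i < d; ++i) h[i] = 1.0f;         // a and b differ by 1 everywhere

    float *da, *db, *dres;
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dres, sizeof(float));
    cudaMemcpy(da, h, bytes, cudaMemcpyHostToDevice);
    cudaMemset(db, 0, bytes);

    euclideanSquared<<<1, THREADS>>>(da, db, dres, d);
    cudaDeviceSynchronize();

    float out;
    cudaMemcpy(&out, dres, sizeof(float), cudaMemcpyDeviceToHost);
    printf("squared distance = %f (expected %d)\n", out, d);

    cudaFree(da); cudaFree(db); cudaFree(dres); free(h);
    return 0;
}
```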
Iterative clustering algorithms based on Lloyd's algorithm (often referred to as the k-means algorithm) have been used in a wide variety of areas, including graphics, computer vision, signal processing, compression, and computational geometry. We describe a method for accelerating many variants of iterative clustering by using programmable graphics hardware to perform the most computationally expensive portion of the work. In particular, we demonstrate significant speedups for k-means clustering (essential in vector quantization) and clustered principal component analysis. An additional contribution is a new hierarchical algorithm for k-means which performs less work than the brute-force algorithm, but which offers significantly more SIMD parallelism than the straightforward hierarchical approach.
Parallel Computing, 2011
In this work we propose several parallel algorithms to compute the two-dimensional discrete wavelet transform (2D-DWT), exploiting the available hardware resources. In particular, we explore OpenMP-optimized versions of the 2D-DWT on a multicore platform, and we also develop CUDA-based 2D-DWT algorithms able to run on GPUs (graphics processing units). The proposed algorithms are based on several 2D-DWT computation approaches: (1) filter-bank convolution, (2) the lifting transform and (3) matrix convolution, so we can determine which of them best adapts to our parallel versions. All proposed algorithms are based on the Daubechies 9/7 filter, which is widely used in image/video compression.
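As an illustration of the filter-bank convolution approach listed above, the following minimal CUDA sketch performs one horizontal analysis pass that convolves each image row with the 9-tap CDF 9/7 analysis low-pass filter and downsamples by two. The coefficient values are the commonly cited ones; the image size, symmetric border handling and the omission of the high-pass and vertical passes are assumptions, not the authors' implementation.

```cuda
// Minimal sketch: horizontal low-pass analysis pass of a 9/7 filter-bank 2D-DWT.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

__constant__ float LP97[9] = {  // symmetric 9/7 analysis low-pass taps, centre at index 4
     0.026748757411f, -0.016864118443f, -0.078223266529f, 0.266864118443f,
     0.602949018236f,
     0.266864118443f, -0.078223266529f, -0.016864118443f, 0.026748757411f };

__global__ void rowLowPass97(const float* in, float* out, int width, int height)
{
    int ox = blockIdx.x * blockDim.x + threadIdx.x;   // output (downsampled) column
    int y  = blockIdx.y * blockDim.y + threadIdx.y;   // row
    int outW = width / 2;
    if (ox >= outW || y >= height) return;

    float acc = 0.0f;
    for (int t = -4; t <= 4; ++t) {
        int x = 2 * ox + t;
        if (x < 0) x = -x;                            // symmetric border extension
        if (x >= width) x = 2 * width - 2 - x;
        acc += LP97[t + 4] * in[y * width + x];
    }
    out[y * outW + ox] = acc;
}

int main()
{
    const int W = 2048, H = 2048;                     // assumed image size
    size_t inBytes  = (size_t)W * H * sizeof(float);
    size_t outBytes = (size_t)(W / 2) * H * sizeof(float);

    float* hIn = (float*)malloc(inBytes);
    for (int i = 0; i < W * H; ++i) hIn[i] = (float)(i % 255);

    float *dIn, *dOut;
    cudaMalloc(&dIn, inBytes); cudaMalloc(&dOut, outBytes);
    cudaMemcpy(dIn, hIn, inBytes, cudaMemcpyHostToDevice);

    dim3 block(32, 8);
    dim3 grid((W / 2 + block.x - 1) / block.x, (H + block.y - 1) / block.y);
    rowLowPass97<<<grid, block>>>(dIn, dOut, W, H);
    cudaDeviceSynchronize();

    float s;
    cudaMemcpy(&s, dOut, sizeof(float), cudaMemcpyDeviceToHost);
    printf("out[0][0] = %f\n", s);

    cudaFree(dIn); cudaFree(dOut); free(hIn);
    return 0;
}
```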
Journal of Computer and System Sciences, 2013
Cluster analysis plays a critical role in a wide variety of applications, but it now faces a computational challenge due to continuously increasing data volumes. Parallel computing is one of the most promising solutions to this challenge. In this paper, we target parallelizing k-Means, which is one of the most popular clustering algorithms, using widely available Graphics Processing Units (GPUs). Different from existing GPU-based k-Means algorithms, we observe that data dimensionality is an important factor that should be taken into consideration when parallelizing k-Means on GPUs. In particular, we use two different strategies for low-dimensional and high-dimensional data sets respectively, in order to make the best use of GPU computing horsepower. For low-dimensional data sets, we design an algorithm that exploits GPU on-chip registers to significantly decrease data access latency. For high-dimensional data sets, we design another novel algorithm that simulates matrix multiplication and exploits GPU on-chip shared memory to achieve a high compute-to-memory-access ratio. Our experimental results show that our GPU-based k-Means algorithms are three to eight times faster than the best reported GPU-based algorithms.
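As a rough illustration of the low-dimensional strategy described above, the kernel sketch below keeps each thread's point in registers (possible when the dimensionality is small and fixed at compile time) while scanning the centroids. The fixed DIM, names and launch configuration are assumptions, and the high-dimensional, shared-memory tiled variant is not shown.

```cuda
// Minimal sketch: low-dimensional k-means assignment with the point cached
// in registers, so each coordinate is read from global memory only once.
#include <cfloat>
#include <cuda_runtime.h>

#define DIM 4   // assumed low dimensionality, known at compile time

__global__ void assignLowDim(const float* points,    // n x DIM, row-major
                             const float* centroids, // k x DIM, row-major
                             int* labels, int n, int k)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    // Cache this thread's point in registers: the fixed-size, fully
    // unrolled array lets the compiler keep it out of local memory.
    float p[DIM];
#pragma unroll
    for (int j = 0; j < DIM; ++j) p[j] = points[i * DIM + j];

    int best = 0;
    float bestDist = FLT_MAX;
    for (int c = 0; c < k; ++c) {
        float dist = 0.0f;
#pragma unroll
        for (int j = 0; j < DIM; ++j) {
            float diff = p[j] - centroids[c * DIM + j];
            dist += diff * diff;
        }
        if (dist < bestDist) { bestDist = dist; best = c; }
    }
    labels[i] = best;
}
```

A launch such as assignLowDim<<<(n + 255) / 256, 256>>>(dPoints, dCentroids, dLabels, n, k) assigns every point in one pass; host-side allocation and the centroid-update step are omitted from this sketch.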