Academia.eduAcademia.edu

Validity Index and number of clusters

Abstract

Clustering (or cluster analysis) has been used widely in pattern recognition, image processing, and data analysis. It aims to organize a collection of data items into c clusters, such that items within a cluster are more similar to each other than they are items in the other clusters. The number of clusters c is the most important parameter, in the sense that the remaining parameters have less influence on the resulting partition. To determine the best number of classes several methods were made, and are called validity index. This paper presents a new validity index for fuzzy clustering called a Modified Partition Coefficient And Exponential Separation (MPCAES) index. The efficiency of the proposed MPCAES index is compared with several popular validity indexes. More information about these indexes is acquired in series of numerical comparisons and also real data Iris.