2007, … Symposium on Advanced …
…
6 pages
In this paper, we propose a method for image clustering using multinomial mixture models. The mixture of multinomial distributions, often called a multinomial mixture, is a probabilistic model mainly used for text mining. The effectiveness of the multinomial distribution for text mining originates from the fact that words can be regarded, to a first approximation, as independently generated. We apply the multinomial distribution to image clustering by regarding each color as a "word" and each color histogram as a "term frequency" distribution.
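The idea of treating color histograms as term-frequency vectors can be illustrated with a small EM routine for a multinomial mixture. This is a minimal sketch under stated assumptions, not the paper's implementation: the function name `fit_multinomial_mixture`, the random soft initialisation, the additive smoothing constant `smooth`, and the fixed iteration count are all illustrative choices that do not come from the abstract.

```python
import math
import random


def fit_multinomial_mixture(counts, k, n_iter=50, seed=0, smooth=1e-2):
    """Fit a k-component multinomial mixture to count vectors with EM.

    counts: list of equal-length lists of non-negative counts
            (for example, one color histogram per image).
    Returns (weights, thetas, labels), where labels[i] is the most
    probable component for counts[i].
    """
    rng = random.Random(seed)
    n, d = len(counts), len(counts[0])

    # Assumption: start from random soft assignments (responsibilities).
    resp = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(k)]
        total = sum(row)
        resp.append([r / total for r in row])

    for _ in range(n_iter):
        # M-step: mixing weights and per-component color probabilities.
        weights = [sum(resp[i][j] for i in range(n)) / n for j in range(k)]
        thetas = []
        for j in range(k):
            raw = [
                smooth + sum(resp[i][j] * counts[i][w] for i in range(n))
                for w in range(d)
            ]
            total = sum(raw)
            thetas.append([v / total for v in raw])

        # E-step in log space; the multinomial coefficient is the same for
        # every component, so it cancels and is omitted.
        for i in range(n):
            logp = [
                math.log(weights[j] + 1e-300)
                + sum(c * math.log(thetas[j][w]) for w, c in enumerate(counts[i]))
                for j in range(k)
            ]
            top = max(logp)
            unnorm = [math.exp(v - top) for v in logp]
            total = sum(unnorm)
            resp[i] = [u / total for u in unnorm]

    labels = [max(range(k), key=lambda j: resp[i][j]) for i in range(n)]
    return weights, thetas, labels


# Toy data: four-bin "color histograms" from two visibly different groups.
histograms = [
    [90, 10, 0, 0], [85, 15, 0, 0], [80, 18, 2, 0],
    [0, 0, 10, 90], [0, 2, 13, 85], [0, 0, 20, 80],
]
weights, thetas, labels = fit_multinomial_mixture(histograms, k=2)
```

Component indices are arbitrary, so only the grouping of the labels is meaningful, not which group receives label 0.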
2003
The Dirichlet distribution offers high flexibility for modeling data. This paper describes two new mixtures based on this density: the GDD (Generalized Dirichlet Distribution) and the MDD (Multinomial Dirichlet Distribution) mixtures. These mixtures will be used to model continuous and discrete data, respectively. We propose a method for estimating the parameters of these mixtures. The performance of our method is tested by contextual evaluations. In these evaluations we compare the performance of Gaussian and GDD mixtures in the classification of several pattern-recognition data sets and we apply the MDD mixture to the problem of summarizing image databases.
IEEE Transactions on Image Processing, 2004
This paper presents an unsupervised algorithm for learning a finite mixture model from multivariate data. This mixture model is based on the Dirichlet distribution, which offers high flexibility for modeling data. The proposed approach for estimating the parameters of a Dirichlet mixture is based on the maximum likelihood (ML) and Fisher scoring methods. Experimental results are presented for the following applications: estimation of artificial histograms, summarization of image databases for efficient retrieval, and human skin color modeling and its application to skin detection in multimedia databases.
Multimedia Tools and Applications, 2017
Image content clustering is an effective way to organize large databases, thereby making the content-based image retrieval process much easier. However, clustering images with varied backgrounds and foregrounds is quite challenging. In this paper, we propose a novel image content clustering paradigm suitable for clustering large and diverse image databases. In our approach, images are represented in a continuous domain based on a probabilistic Gaussian Mixture Model (GMM), with each image modeled as a mixture of Gaussian distributions in the selected feature space. The distance metric between the Gaussian distributions is defined in the sense of Kullback-Leibler (KL) divergence. The clustering is done using a semi-supervised learning framework in which labeled data in the form of cluster templates is used to classify the unlabelled data. The clusters are formed around initially chosen seeds and are updated in due course based on user inputs. In our clustering approach, the user interaction is structured so as to get maximum input from the user in a limited time. We propose two methods for carrying out the structured user interaction, through which the cluster templates are updated to improve the quality of the clusters formed. The proposed method is experimentally evaluated on benchmark datasets that are specifically chosen to include a wide variation of images around a common theme, as is typically encountered in applications like photo-summarization, and that pose a major semantic gap challenge to conventional clustering approaches. The experimental results presented demonstrate the effectiveness of the proposed approach.
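For reference, the KL divergence between two single Gaussians has a closed form, sketched below for the diagonal-covariance case. The abstract does not say how the divergence is extended to full mixtures or whether it is symmetrized, so the diagonal-covariance restriction, the symmetrized variant, and the function names `kl_diag_gaussians` and `symmetric_kl` are assumptions made only for illustration.

```python
import math


def kl_diag_gaussians(mu0, var0, mu1, var1):
    """KL(N0 || N1) for Gaussians with diagonal covariance.

    mu0, mu1: lists of means; var0, var1: lists of per-dimension variances.
    """
    return 0.5 * sum(
        v0 / v1 + (m1 - m0) ** 2 / v1 - 1.0 + math.log(v1 / v0)
        for m0, v0, m1, v1 in zip(mu0, var0, mu1, var1)
    )


def symmetric_kl(mu0, var0, mu1, var1):
    """Symmetrized divergence (an assumption; KL itself is not symmetric)."""
    return kl_diag_gaussians(mu0, var0, mu1, var1) + kl_diag_gaussians(
        mu1, var1, mu0, var0
    )


# Two unit-variance Gaussians whose means differ by 1 in one dimension.
d_forward = kl_diag_gaussians([0.0], [1.0], [1.0], [1.0])
d_sym = symmetric_kl([0.0], [1.0], [1.0], [1.0])
```

Because plain KL divergence is asymmetric and does not satisfy the triangle inequality, a symmetrized form is a common choice when a distance-like quantity is needed for clustering.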
Journal of Visual Communication and Image Representation, 2007
This paper presents an unsupervised learning algorithm for fitting a finite mixture model based on the Multinomial Dirichlet distribution (MDD). This mixture is particularly useful for modeling discrete data (vectors of counts). The algorithm proposed is based on the expectation maximization (EM) approach. This mixture is used to improve image databases categorization by integrating semantic features and to produce a new texture model. For the texture modeling problem, the results are reported on the Vistex texture image database from the MIT Media Lab.
Model-based approaches have become important tools to model data and infer knowledge. Such approaches are often used for clustering and object recognition, which are crucial steps in many applications, including but not limited to recommendation systems, search engines, cyber security, surveillance and object tracking. Many of these applications urgently need to reduce the semantic gap in data representation between the system level and the level understandable by human beings. Indeed, the low-level features extracted to represent a given object can be confusing to machines, which cannot differentiate between very similar objects that human beings distinguish trivially (e.g. apple vs. tomato). In this paper, we propose a novel hierarchical methodology for data representation using a hierarchical mixture model. The proposed approach models a given object class by a set of modes deduced by the system and grouped according to labeled training data representing human-level semantics. We have used the inverted Dirichlet distribution to build our statistical framework. The proposed approach has been validated using both synthetic data and a challenging application, namely visual object clustering and recognition. The presented model is shown to have a flexible hierarchy that can be changed on the fly at negligible computational cost.
1997
We consider a model-based approach to clustering, whereby each observation is assumed to have arisen from an underlying mixture of a finite number of distributions. The number of components in this mixture model corresponds to the number of clusters to be imposed on the data. A common assumption is to take the component distributions to be multivariate normal with perhaps some restrictions on the component covariance matrices. The model can be fitted to the data using maximum likelihood implemented via the EM algorithm. There are a number of computational issues associated with the fitting, including the specification of initial starting points for the EM algorithm and the carrying out of tests for the number of components in the final version of the model. We shall discuss some of these problems and describe an algorithm that attempts to handle them automatically.
2006
This thesis proposes a new method for unsupervised image clustering using probabilistic continuous models and information theoretic principles. Image clustering relates to content-based image retrieval systems. It enables the implementation of efficient retrieval algorithms and the creation of a user friendly interface to the database. The thesis presents a coherent theory for continuous probabilistic image modeling based on mixture of Gaussians densities. The continuous image modeling is extended to the modeling of an image-set created by a supervised or an unsupervised clustering process. Three ways of obtaining the image-set model are introduced and the difference between them is discussed. Supervised image-set (category) modeling is utilized to compare between the proposed continuous models and the more traditional discrete image modeling based on histograms. The unsupervised image clustering framework is based on a continuous version of a recently introduced information theoret...
International Journal of Computer Applications, 2012
Never before in history has image data been generated at such high volumes as it is today. If images are analyzed properly, they can reveal useful information to users. Image mining deals with the extraction of implicit knowledge, image data relationships, or other patterns not explicitly stored in the images. Image clustering involves the extraction of features from image databases followed by the application of a data mining algorithm to group the images. In this paper, a data mining approach to clustering images using color and texture features is proposed. Three techniques are proposed to extract the color feature: Color Moments, the Block Truncation Coding algorithm, and the histogram method. To extract the texture feature, the concept of the Gray Level Co-occurrence Matrix is extended and applied to color images. The K-means clustering algorithm is applied to group the images.
staff.ui.ac.id
Image clustering is the process of grouping images based on their similarity. Image clustering usually uses color, texture, edge, or shape components, or a mixture of two of them. This research aims to explore image clustering using color composition. To carry out this image clustering, three main components should be considered: the color space, the image representation (feature extraction), and the clustering method itself. We aim to explore which combination of these factors produces the best clustering results by combining various techniques from the three components. The color spaces used are RGB, HSV, and L*a*b*. The image representations are Histogram and Gaussian Mixture Model (GMM), whereas the clustering methods are K-Means and the Agglomerative Hierarchical Clustering algorithm. The results of the experiment show that the GMM representation works better with the RGB and L*a*b* color spaces, whereas Histogram works better with HSV. The experiments also show that K-Means is better than Agglomerative Hierarchical Clustering for image clustering.
Neural, Parallel and Scientific Computations, 1999
A general model for estimating the pdf of a gray-level image histogram is reported. The histogram's pdf is approached by a mixture of Gaussian distributions. The originality of this work lies in the determination of the number of components in the mixture, which is considered as a parameter of the model and is determined using a novel algorithm. For this purpose, the model is divided into three parts. First, we use the k-means algorithm to set the initial values for the parameters of each component in the mixture. Our contributions are the determination of an appropriate number of clusters in the k-means algorithm and a novel algorithm for eliminating false clusters. Finally, the values of the parameters are refined using the EM algorithm. The model has been validated on both artificial and real image histograms. Neural, Parallel and Scientific Computations, no. 7, p. 103-118, July 1999
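The two stages that the abstract does describe, k-means initialisation followed by EM refinement, can be sketched for a one-dimensional gray-level histogram as follows. This is an illustrative sketch only: the number of components `k` is assumed to be known, the paper's algorithms for choosing the number of clusters and eliminating false clusters are not reproduced, and the function names, the evenly spaced starting centers, the initial variance, and the variance floor `min_var` are all assumptions.

```python
import math


def kmeans_1d(values, weights, k, n_iter=20):
    """Weighted 1-D k-means; centers start evenly spread over the range."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    for _ in range(n_iter):
        sums, mass = [0.0] * k, [0.0] * k
        for v, w in zip(values, weights):
            j = min(range(k), key=lambda c: abs(v - centers[c]))
            sums[j] += w * v
            mass[j] += w
        # Keep the old center if a cluster received no mass.
        centers = [
            sums[j] / mass[j] if mass[j] > 0 else centers[j] for j in range(k)
        ]
    return centers


def fit_histogram_gmm(hist, k, n_iter=100, min_var=1e-3):
    """Fit a k-component 1-D Gaussian mixture to histogram bin counts.

    hist[x] is the number of pixels with gray level x.
    Returns (weights, means, variances).
    """
    levels = list(range(len(hist)))
    total = float(sum(hist))
    means = kmeans_1d(levels, hist, k)          # stage 1: k-means initialisation
    variances = [(len(hist) / (2.0 * k)) ** 2] * k
    weights = [1.0 / k] * k

    for _ in range(n_iter):                     # stage 2: EM refinement
        # E-step: responsibility of each component for each gray level.
        resp = []
        for x in levels:
            dens = [
                weights[j]
                * math.exp(-0.5 * (x - means[j]) ** 2 / variances[j])
                / math.sqrt(2.0 * math.pi * variances[j])
                for j in range(k)
            ]
            s = sum(dens)
            resp.append([d / s if s > 0 else 1.0 / k for d in dens])

        # M-step: each gray level contributes in proportion to its bin count.
        for j in range(k):
            nj = sum(h * r[j] for h, r in zip(hist, resp))
            if nj <= 0:
                continue
            means[j] = sum(h * r[j] * x for h, r, x in zip(hist, resp, levels)) / nj
            var = sum(
                h * r[j] * (x - means[j]) ** 2
                for h, r, x in zip(hist, resp, levels)
            ) / nj
            variances[j] = max(min_var, var)
            weights[j] = nj / total
    return weights, means, variances


# Artificial 64-level histogram with two bumps centered at levels 16 and 48.
hist = [
    round(1000 * (math.exp(-0.5 * ((x - 16) / 3.0) ** 2)
                  + math.exp(-0.5 * ((x - 48) / 3.0) ** 2)))
    for x in range(64)
]
gmm_weights, gmm_means, gmm_variances = fit_histogram_gmm(hist, k=2)
```

Working on bin counts rather than individual pixels keeps the cost of each EM iteration proportional to the number of gray levels, not the image size.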