1991, Pattern Recognition
We develop a method of performing pattern recognition (discrimination and classification) using a recursive technique derived from mixture models, kernel estimation and stochastic approximation.
Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170), 1998
A classifier based on a mixture model is proposed. The EM algorithm for construction of a mixture density is sensitive to the initial densities. It is also difficult to determine the optimal number of component densities. In this study, we construct a mixture density on the basis of hyperrectangles found by the subclass method, in which the number of components is determined automatically. Experimental results show the effectiveness of this approach.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000
Pattern Recognition, 1993
A recursive, nonparametric method is developed for performing density estimation, derived from mixture models, kernel estimation and stochastic approximation. The asymptotic performance of the method, dubbed "adaptive mixtures" (Priebe and Marchette, Pattern Recognition 24, 1197-1209 (1991)) for its data-driven development of a mixture model approximation to the true density, is investigated using the method of sieves. Simulations are included indicating convergence properties for some simple examples.
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2004
There are two open problems when finite mixture densities are used to model multivariate data: the selection of the number of components and the initialization. In this paper, we propose an online (recursive) algorithm that estimates the parameters of the mixture and that simultaneously selects the number of components. The new algorithm starts with a large number of randomly initialized components. A prior is used as a bias for maximally structured models. A stochastic approximation recursive learning algorithm is proposed to search for the maximum a posteriori (MAP) solution and to discard the irrelevant components.
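To make the flavor of such recursive schemes concrete, here is a minimal sketch of a single online update with weight-based pruning for a one-dimensional Gaussian mixture; the fixed learning rate, the weight penalty, and the prune-at-zero rule are illustrative stand-ins, not the authors' exact prior-based MAP updates.

```python
import numpy as np

def online_gmm_step(x, w, mu, var, lr=0.01, penalty=1e-3):
    """One recursive (per-sample) update of a 1-D Gaussian mixture with pruning.

    Illustrative sketch only: lr, penalty, and the pruning rule stand in for
    the prior-based MAP updates described in the paper.
    """
    dens = np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    o = w * dens
    o /= o.sum()                          # ownership of the new sample x
    w = w + lr * (o - w) - lr * penalty   # penalty drives weak components to zero
    keep = w > 0                          # discard irrelevant components
    w, mu, var, o = w[keep], mu[keep], var[keep], o[keep]
    w /= w.sum()
    gain = lr * o / w                     # per-component step size
    mu = mu + gain * (x - mu)
    var = var + gain * ((x - mu) ** 2 - var)
    return w, mu, var
```

Starting from many randomly initialized components and repeatedly applying such a step, the penalty gradually removes components that never take ownership of data, which is the mechanism the abstract describes for selecting the number of components online.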
1996
This paper introduces a new adaptive-mixtures-type estimator. It provides details on the new estimator along with examples of its performance on several different types of data sets. In addition, we provide a comparison between the performance of this estimator and the standard recursive adaptive mixtures estimator for one of the data sets.
Pattern Recognition Letters, 2005
Mixture modeling is the problem of identifying and modeling components in a given set of data. Gaussians are widely used in mixture modeling. At the same time, other models such as Dirichlet distributions have not received attention. In this paper, we present an unsupervised algorithm for learning a finite Dirichlet mixture model. The proposed approach for estimating the parameters of a Dirichlet mixture is based on the maximum likelihood (ML) expressed in a Riemannian space. Experimental results are presented for the following applications: summarization of texture image databases for efficient retrieval, and human skin color modeling and its application to skin detection in multimedia databases.
2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), 2008
Density estimation is a fundamental problem in pattern recognition and machine learning. It is particularly important for classification using the Bayes decision rule [1]. Methods for density estimation can be grouped into two categories: parametric and nonparametric. Parametric methods rely on assumed functional forms for the density but are compact in storage and computation. Nonparametric methods, such as Parzen windows and nearest-neighbor methods, can adapt to arbitrary distributions, but the model needs to store all the training points. The Gaussian mixture model (GMM) can be viewed as a semi-parametric approach: it is flexible enough to model irregular distributions and has moderate complexity. The Expectation-Maximization (EM) algorithm [2-3], based on maximum likelihood, is a basic approach for mixture density estimation, and there have been many improved methods along this direction [1].
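For reference, below is a minimal sketch of the standard EM iteration for a one-dimensional Gaussian mixture; it illustrates the baseline algorithm referred to here, not any of the improved variants proposed in the papers listed.

```python
import numpy as np

def em_gmm_1d(x, k, n_iter=100, seed=0):
    """Minimal EM for a 1-D Gaussian mixture (illustrative sketch only)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    # Random initialization (the papers above discuss better strategies).
    w = np.full(k, 1.0 / k)                      # mixing proportions
    mu = rng.choice(x, size=k, replace=False)    # component means
    var = np.full(k, np.var(x))                  # component variances
    for _ in range(n_iter):
        # E-step: responsibilities r[t, i] of component i for sample t
        dens = np.exp(-0.5 * (x[:, None] - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
        r = w * dens
        r /= r.sum(axis=1, keepdims=True)
        # M-step: re-estimate parameters from weighted sufficient statistics
        nk = r.sum(axis=0)
        w = nk / n
        mu = (r * x[:, None]).sum(axis=0) / nk
        var = (r * (x[:, None] - mu) ** 2).sum(axis=0) / nk
    return w, mu, var
```

Usage would be along the lines of `w, mu, var = em_gmm_1d(samples, k=2)` for a 1-D sample array; the abstracts above concern, among other things, how to pick `k` and how to initialize such an iteration.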
Ispd, 1997
Summary: This thesis investigates a recent tool in statistical analysis: the mixtures-of-experts model for classification and regression. The aim of the thesis is to place mixtures-of-experts models in context with other statistical models, in the hope that we may better understand their advantages and disadvantages relative to other models. The thesis first considers mixtures-of-experts models from a theoretical perspective and compares them with other models such as trees, switching regression models and modular networks. Two extensions of the mixtures-of-experts model are then proposed. The first extension is a constructive algorithm for learning model architecture and parameters which is inspired by recursive partitioning. The second extension uses Bayesian methods for learning the parameters of the model. These extensions are compared empirically with the standard mixtures-of-experts model and with other statistical models on small to medium sized data sets. In the second part of the thesis the mixtures-of-experts framework is applied to acoustic modelling within a large vocabulary speech recognition system. The mixtures-of-experts model is shown to give an advantage over standard single neural network approaches on this task. The results of both of these sets of comparisons indicate that mixtures-of-experts models are competitive with other state-of-the-art statistical models.
Lecture Notes in Computer Science, 2006
Abstract. In this paper we address the problem of estimating the parameters of a Gaussian mixture model. Although the EM (Expectation-Maximization) algorithm yields the maximum-likelihood solution, it requires a careful initialization of the parameters, and the optimal number of kernels in the mixture may be unknown beforehand. We propose a criterion based on the entropy of the pdf (probability density function) [...]
2011
Clustering is a fundamental task in many vision applications. To date, most clustering algorithms work in a batch setting and training examples must be gathered in a large group before learning can begin. Here we explore incremental clustering, in which data can arrive continuously. We present a novel incremental model-based clustering algorithm based on nonparametric Bayesian methods, which we call Memory Bounded Variational Dirichlet Process (MB-VDP).
Lecture Notes in Computer Science, 2013
Single-Gaussian and Gaussian-Mixture Models are utilized in various pattern recognition tasks. The model parameters are usually estimated via Maximum Likelihood Estimation (MLE) with respect to available training data. However, if only a small amount of training data is available, the resulting model will not generalize well; loosely speaking, classification performance on an unseen test set may be poor. In this paper, we propose a novel estimation technique for the model variances. Once the variances have been estimated using MLE, they are multiplied by a scaling factor, which reflects the amount of uncertainty present in the limited sample set. The optimal value of the scaling factor is based on the Kullback-Leibler criterion and on the assumption that the training and test sets are sampled from the same source distribution. In addition, in the case of GMM, the proper number of components can be determined.
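The general shape of such an estimator can be sketched as follows for the single-Gaussian case; the scaling factor `alpha` stands in for the KL-derived value from the paper, which is not reproduced here.

```python
import numpy as np

def scaled_variance_estimate(samples, alpha):
    """MLE mean and variance, with the variance inflated by a factor alpha >= 1.

    alpha is assumed to come from the KL-based criterion described in the
    paper (not reproduced here); it grows as the sample size shrinks, so the
    model is deliberately broadened when training data are scarce.
    """
    mu = samples.mean()
    var_mle = samples.var()            # maximum-likelihood variance
    return mu, alpha * var_mle         # inflated variance for better generalization
```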
The 2013 International Joint Conference on Neural Networks (IJCNN), 2013
Instead of using a single kernel, different approaches using multiple kernels have been proposed recently in the kernel learning literature, one of which is multiple kernel learning (MKL). In this paper, we propose an alternative to MKL for selecting the appropriate kernel from a pool of predefined kernels, for a family of online kernel filters called kernel adaptive filters (KAF). The need for an alternative arises because, in a sequential learning method where the hypothesis is updated at every incoming sample, MKL would provide a new kernel, and thus a new hypothesis in the new reproducing kernel Hilbert space (RKHS) associated with that kernel. This does not fit well in the KAF framework, as learning a hypothesis in a fixed RKHS is the core of the KAF algorithms. Hence, we introduce an adaptive learning method to address the kernel selection problem for KAF, based on a competitive mixture of models. We propose the mixture kernel least mean square (MxKLMS) adaptive filtering algorithm, where kernel least mean square (KLMS) filters, each learned with a different kernel, act in parallel at each input instance and are competitively combined so that the filter with the best kernel is an expert for each input regime. The competition among these experts is created by performance-based gating that chooses the appropriate expert locally. Therefore, the individual filter parameters as well as the weights for combining these filters are learned simultaneously in an online fashion. The results obtained suggest that the model not only selects the best kernel, but also significantly improves the prediction accuracy.
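As an illustration of the competitive-mixture idea, the sketch below combines KLMS filters with different Gaussian kernel widths through a simple performance-based gate; the gate (a softmax over exponentially smoothed squared errors) and the step sizes are assumptions for illustration, not the authors' exact MxKLMS formulation.

```python
import numpy as np

class KLMS:
    """Kernel least-mean-square filter with a Gaussian kernel (sketch)."""
    def __init__(self, bandwidth, step=0.5):
        self.bw, self.step = bandwidth, step
        self.centers, self.coefs = [], []

    def predict(self, x):
        if not self.centers:
            return 0.0
        c = np.array(self.centers)
        k = np.exp(-np.sum((c - x) ** 2, axis=1) / (2 * self.bw ** 2))
        return float(np.dot(self.coefs, k))

    def update(self, x, d):
        e = d - self.predict(x)                 # prediction error on (x, d)
        self.centers.append(np.asarray(x, dtype=float))
        self.coefs.append(self.step * e)        # grow the expansion by one center
        return e

class MixtureKLMS:
    """Competitive combination of KLMS filters with different kernel widths.

    The gating rule here is an illustrative stand-in for the performance-based
    gate described in the abstract.
    """
    def __init__(self, bandwidths, forget=0.9, temp=1.0):
        self.filters = [KLMS(b) for b in bandwidths]
        self.err = np.zeros(len(bandwidths))    # smoothed squared error per expert
        self.forget, self.temp = forget, temp

    def step(self, x, d):
        preds = np.array([f.predict(x) for f in self.filters])
        gate = np.exp(-self.err / self.temp)    # better-performing experts get more weight
        gate /= gate.sum()
        y = float(np.dot(gate, preds))          # combined prediction
        for i, f in enumerate(self.filters):    # each expert adapts in parallel
            e = f.update(x, d)
            self.err[i] = self.forget * self.err[i] + (1 - self.forget) * e ** 2
        return y
```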
Proceedings of the Eighth Annual Conference on Computational Learning Theory, 1995
We investigate the problem of estimating the proportion vector which maximizes the likelihood of a given sample for a mixture of given densities. We adapt a framework developed for supervised learning and give simple derivations for many of the standard iterative algorithms like gradient projection and EM. In this framework, the distance between the new and old proportion vectors is used as a penalty term. The square distance leads to the gradient projection update, and the relative entropy to a new update which we call the exponentiated gradient update (EGη). Curiously, when a second order Taylor expansion of the relative entropy is used, we arrive at an update EMη which, for η = 1, gives the usual EM update. Experimentally, both the EMη-update and the EGη-update for η > 1 outperform the EM algorithm and its variants. We also prove a polynomial bound on the rate of convergence of the EGη algorithm.
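A brief sketch of the proportion updates discussed here, assuming the component densities are fixed and precomputed; see the paper for the exact EGη/EMη derivations and convergence analysis.

```python
import numpy as np

def eg_update(w, dens, eta):
    """One exponentiated-gradient (EG_eta) step for mixture proportions.

    w    : current proportion vector, shape (k,)
    dens : dens[t, i] = p_i(x_t), the fixed component densities at each sample
    eta  : learning rate
    Sketch of the update family described in the abstract.
    """
    mix = dens @ w                                # mixture density at each sample
    grad = (dens / mix[:, None]).mean(axis=0)     # gradient of the average log-likelihood
    w_new = w * np.exp(eta * grad)                # multiplicative (relative-entropy) step
    return w_new / w_new.sum()

def em_update(w, dens):
    """The usual EM step for the proportions, shown for comparison."""
    mix = dens @ w
    return w * (dens / mix[:, None]).mean(axis=0)
```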
IEEE Transactions on Information Theory, 1977
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002
In this paper we develop a dynamic continuous solution to the clustering problem of data characterized by a mixture of K distributions, where K is given a priori. The proposed solution resorts to game-theoretic tools, in particular mean field games, and can be interpreted as the continuous version of a generalized Expectation-Maximization (GEM) algorithm. The main contributions of this paper are twofold: first, we prove that the proposed solution is a GEM algorithm; second, we derive a closed-form solution for a Gaussian mixture model and show that the proposed algorithm converges exponentially fast to a maximum of the log-likelihood function, improving significantly over the state of the art. We conclude the paper by presenting simulation results for the Gaussian case that indicate better performance of the proposed algorithm in terms of speed of convergence and with respect to the overlap problem.
Signal Processing
In signal processing, a large number of samples can be generated by a Monte Carlo method and then encoded as a Gaussian mixture model for compactness in computation, storage, and communication. With a large number of samples to learn from, the computational efficiency of Gaussian mixture learning becomes important. In this paper, we propose a new method of Gaussian mixture learning that works both accurately and efficiently for large datasets. The proposed method combines hierarchical clustering with the expectation-maximization algorithm, with hierarchical clustering providing an initial guess for the expectation-maximization algorithm. We also propose adaptive splitting for hierarchical clustering, which enhances the quality of the initial guess and thus improves both the accuracy and efficiency of the combination. We validate the performance of the proposed method in comparison with existing methods through numerical examples of Gaussian mixture learning and its application to distributed particle filtering.
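A minimal sketch of the plain combination (hierarchical clustering providing the initial guess for EM) is shown below; the adaptive splitting proposed in the paper is not reproduced, and full Ward linkage as written is only practical for moderate sample sizes.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.mixture import GaussianMixture

def hier_init_gmm(X, k):
    """Fit a GMM whose EM iterations start from a hierarchical-clustering guess.

    Illustrative sketch: plain Ward linkage supplies the initial means and
    weights, and scikit-learn's EM implementation refines them.
    """
    labels = fcluster(linkage(X, method="ward"), t=k, criterion="maxclust")
    means = np.array([X[labels == c].mean(axis=0) for c in range(1, k + 1)])
    weights = np.array([(labels == c).mean() for c in range(1, k + 1)])
    gmm = GaussianMixture(n_components=k, means_init=means, weights_init=weights)
    return gmm.fit(X)
```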
Data Mining and Knowledge Discovery, 2018
Estimating reliable class-conditional probabilities is a prerequisite for implementing Bayesian classifiers, and how to estimate the probability density functions (PDFs) is also a fundamental problem for other probabilistic induction algorithms. The finite mixture model (FMM) is able to represent arbitrarily complex PDFs by using a mixture of multimodal distributions, but it assumes that the component mixtures follow [...]
1996
We consider the approach to unsupervised learning whereby a normal mixture model is fitted to the data by maximum likelihood. An algorithm called NMM is presented that enables the normal mixture model with either restricted or unrestricted component covariance matrices to be fitted to a given data set. The algorithm automatically handles the problem of the specification of initial values for the parameters in the iterative fitting of the model within the framework of the EM algorithm. The algorithm also has the provision to carry out a test for the number of components on the basis of the likelihood ratio statistic.
IEEE Transactions on Information Theory, 2013
Recursive algorithms for the estimation of mixtures of densities have attracted a lot of attention in the last ten years. Here an algorithm for recursive estimation is studied. It complements existing approaches in the literature, as it is based on conditions that are usually very weak; for example, the parameter space over which the mixture is taken does not need to be bounded. The essence of the procedure is to combine density estimation via the empirical characteristic function with an iterative Hilbert space approximation algorithm. The conditions for consistency of the estimator are verified for three important statistical problems. A simulation study is also included.