2010, Image and Vision Computing
In this paper we propose a Gaussian-kernel-based online kernel density estimation which can be used for applications of online probability density estimation and online learning. Our approach generates a Gaussian mixture model of the observed data and allows online adaptation from positive examples as well as from negative examples. The adaptation from negative examples is realized by a novel concept of unlearning in mixture models. Low complexity of the mixtures is maintained through a novel compression algorithm. In contrast to the existing approaches, our approach does not require fine-tuning parameters for a specific application, does not assume specific forms of the target distributions, and places no temporal constraints on the observed data. The strength of the proposed approach is demonstrated with examples of online estimation of complex distributions, an example of unlearning, and interactive learning of basic visual concepts.
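The abstract does not describe how the compression algorithm works. As a rough illustration of how a mixture can be kept small, one common building block is to merge two weighted Gaussian components into a single Gaussian that preserves their combined weight, mean and covariance (moment matching). The function name `merge_components` and the example values are assumptions made for this sketch, not the authors' method.

```python
import numpy as np

def merge_components(w1, mu1, cov1, w2, mu2, cov2):
    """Merge two weighted Gaussian components by moment matching.

    Hypothetical helper for illustration: returns the weight, mean and
    covariance of the single Gaussian with the same first two moments
    as the two-component sub-mixture.
    """
    w = w1 + w2
    a1, a2 = w1 / w, w2 / w            # normalized weights within the pair
    mu = a1 * mu1 + a2 * mu2           # matched mean
    d1 = (mu1 - mu)[:, None]           # column vectors of mean offsets
    d2 = (mu2 - mu)[:, None]
    # Matched covariance: within-component spread plus between-component spread.
    cov = a1 * (cov1 + d1 @ d1.T) + a2 * (cov2 + d2 @ d2.T)
    return w, mu, cov

# Two unit-covariance components placed symmetrically on the first axis.
w, mu, cov = merge_components(
    0.5, np.array([-1.0, 0.0]), np.eye(2),
    0.5, np.array([1.0, 0.0]), np.eye(2),
)
```

In this example the merged component is centred at the origin and is wider along the first axis, because the distance between the two original means is absorbed into the covariance.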
arXiv (Cornell University), 2016
In this paper we present xokde++, a state-of-the-art online kernel density estimation approach that maintains Gaussian mixture models of input data streams. The approach follows state-of-the-art work on online density estimation, but was redesigned with computational efficiency, numerical robustness, and extensibility in mind. Our approach produces comparable or better results than the current state-of-the-art, while achieving significant computational performance gains and improved numerical stability. The use of diagonal covariance Gaussian kernels, which further improve performance and stability at a small loss of modelling quality, is also explored. Our approach is up to 40 times faster, while requiring 90% less memory than the closest state-of-the-art counterpart.
Pattern Recognition, 2011
This document includes some detailed supplemental derivations used in the bandwidth estimation for the online Kernel Density Estimator which was proposed in the paper "Multivariate Online Kernel Density Estimation with Gaussian Kernels" by authors Matej Kristan, Aleš Leonardis, Danijel Skočaj (submitted to the journal of Pattern Recognition).
2008
In this paper we propose a new incremental estimation of Gaussian mixture models which can be used for applications of online learning. Our approach allows for adding new samples incrementally as well as removing parts of the mixture by the process of unlearning. Low complexity of the mixtures is maintained through a novel compression algorithm. In contrast to the existing approaches, our approach does not require fine-tuning parameters for a specific application, does not assume specific forms of the target distributions, and places no temporal constraints on the observed data. The strength of the proposed approach is demonstrated with an example of online estimation of a complex distribution, an example of unlearning, and interactive learning of basic visual concepts.
2010 20th International Conference on Pattern Recognition, 2010
We propose a new method for supervised online estimation of probabilistic discriminative models for classification tasks. The method estimates the class distributions from a stream of data in the form of Gaussian mixture models (GMM). The reconstructive updates of the distributions are based on the recently proposed online Kernel Density Estimator (oKDE). We keep the number of components in the model low by compressing the GMMs from time to time. We propose a new cost function that measures loss of interclass discrimination during compression, thus guiding the compression towards simpler models that still retain discriminative properties. The resulting classifier thus independently updates the GMM of each class, but these GMMs interact during their compression through the proposed cost function. We call the proposed method the online discriminative Kernel Density Estimator (odKDE). We compare the odKDE to oKDE, batch state-of-the-art KDEs and batch/incremental support vector machines (SVM) on publicly available datasets. The odKDE achieves classification performance comparable to that of the best batch KDEs and SVM, while allowing online adaptation from large datasets, and produces models of lower complexity than the oKDE.
… on Intelligent Technologies (InTech 2008), Samni, …, 2008
How to model a concept, and how to discover a new concept, remain fundamental questions in machine learning research. Real-world concepts are usually high-dimensional and have complicated distributions. The Gaussian mixture model (GMM) is well suited to modeling complicated distributions. In this paper, we propose a data-driven concept modeling and discovery framework using GMM, with an online updating mechanism for fast computation in real-world applications. Experiments show that our proposed algorithm can handle complicated concept modeling and discovery with satisfactory performance in real time.
Journal of Machine Learning Research - JMLR, 2009
Learning algorithms are based on samples which are often drawn independently from an identical distribution (i.i.d.). In this paper we consider a different setting with samples drawn according to a non-identical sequence of probability distributions. Each time a sample is drawn from a different distribution. In this setting we investigate a fully online learning algorithm associated with a general convex loss function and a reproducing kernel Hilbert space (RKHS). Error analysis is conducted under the assumption that the sequence of marginal distributions converges polynomially in the dual of a Hölder space. For regression with least square or insensitive loss, learning rates are given in both the RKHS norm and the L2 norm. For classification with hinge loss and support vector machine q-norm loss, rates are explicitly stated with respect to the excess misclassification error.
Proceedings of the 25th …, 2008
Moment matching is a popular means of parametric density estimation. We extend this technique to nonparametric estimation of mixture models. Our approach works by embedding distributions into a reproducing kernel Hilbert space, and performing moment matching in that space. This allows us to tailor density estimators to a function class of interest (i.e., for which we would like to compute expectations). We show our density estimation approach is useful in applications such as message compression in graphical models, and image classification and retrieval.
Neural Computation, 2013
This review examines kernel methods for online learning, in particular, multiclass classification. We examine margin-based approaches, stemming from Rosenblatt's original perceptron algorithm, as well as nonparametric probabilistic approaches that are based on the popular Gaussian process framework. We also examine approaches to online learning that use combinations of kernels, that is, online multiple kernel learning. We present empirical validation of a wide range of methods on a protein fold recognition data set, where different biological feature types are available, and two object recognition data sets, Caltech101 and Caltech256, where multiple feature spaces are available in terms of different image feature extraction methods. Neural Computation 25, 567-625 (2013). © 2013 Massachusetts Institute of Technology.
2011 International Conference on Multimedia Technology, 2011
This paper examines parametric density estimation using a variable weighted sum of Gaussian kernels, where the weights may take positive and negative values. Various statistical properties of the estimator are studied, as well as its extensions to multidimensional probability density estimation. The estimator parameters are identified by a modified EM algorithm, and the number of kernels is estimated by an information-theoretic approach using the Akaike Information Criterion (AIC). This paper provides an empirical evaluation of the estimator with respect to window-based estimators and the classical linear-combination-of-Gaussians estimator that uses only positive weights, showing its robustness (in terms of accuracy and speed) for various applications in image and signal analysis and machine learning.
2018
Density estimation has wide applications in machine learning and data analysis techniques including clustering, classification, multimodality analysis, bump hunting and anomaly detection. In high-dimensional space, the sparsity of data in local neighborhoods makes many parametric and nonparametric density estimation methods inefficient. This work presents the development of computationally efficient algorithms for high-dimensional density estimation, based on Bayesian sequential partitioning (BSP). A copula transform is used to separate the estimation of marginal and joint densities, with the purpose of reducing the computational complexity and estimation error. Using this separation, a parallel implementation of the density estimation algorithm on a 4-core CPU is presented. Also, some example applications of high-dimensional density estimation in density-based classification and clustering are presented. Another challenge in the area of density estimation arises in dealing with on...
IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 1999
We address the problem of probability density function estimation using a Gaussian mixture model updated with the expectation-maximization (EM) algorithm. To deal with the case of an unknown number of mixing kernels, we define a new measure for Gaussian mixtures, called total kurtosis, which is based on the weighted sample kurtoses of the kernels. This measure provides an indication of how well the Gaussian mixture fits the data. Then we propose a new dynamic algorithm for Gaussian mixture density estimation which monitors the total kurtosis at each step of the EM algorithm in order to decide dynamically on the correct number of kernels and possibly escape from local maxima. We show the potential of our technique in approximating unknown densities through a series of examples with several density estimation problems.
A common problem of kernel-based online algorithms, such as the kernel-based Perceptron algorithm, is the amount of memory required to store the online hypothesis, which may increase without bound as the algorithm progresses. Furthermore, the computational load of such algorithms grows linearly with the amount of memory used to store the hypothesis. To attack these problems, most previous work has focused on discarding some of the instances, in order to keep the memory bounded. In this paper we present a new algorithm, in which the instances are not discarded, but are instead projected onto the space spanned by the previous online hypothesis. We call this algorithm Projectron. While the memory size of the Projectron solution cannot be predicted before training, we prove that its solution is guaranteed to be bounded. We derive a relative mistake bound for the proposed algorithm, and deduce from it a slightly different algorithm which outperforms the Perceptron. We call this second algorithm Projectron++. We show that this algorithm can be extended to handle the multiclass and the structured output settings, resulting, as far as we know, in the first online bounded algorithm that can learn complex classification tasks. The method of bounding the hypothesis representation can be applied to any conservative online algorithm and to other online algorithms, as is demonstrated for ALMA2. Experimental results on various data sets show the empirical advantage of our technique compared to various bounded online algorithms, both in terms of memory and accuracy.
2011 IEEE Congress of Evolutionary Computation (CEC), 2011
In this paper, we propose an estimation of distribution algorithm based on an inexpensive Gaussian mixture model with online learning, which will be employed in dynamic optimization. Here, the mixture model stores a vector of sufficient statistics of the best solutions, which is subsequently used to obtain the parameters of the Gaussian components. This approach is able to incorporate into the current mixture model potentially relevant information of the previous and current iterations. The online nature of the proposal is desirable in the context of dynamic optimization, where prompt reaction to new scenarios should be promoted. To analyze the performance of our proposal, a set of dynamic optimization problems in continuous domains was considered with distinct levels of complexity, and the obtained results were compared to the results produced by other existing algorithms in the dynamic optimization literature.
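The abstract says only that each mixture component is recovered from a stored vector of sufficient statistics. A minimal sketch of that general idea, assuming the statistics are a weighted count, a weighted sum and a weighted sum of outer products (the class name `GaussianStats` and this exact choice of statistics are assumptions, not taken from the paper):

```python
import numpy as np

class GaussianStats:
    """Running sufficient statistics for one Gaussian component (illustrative)."""

    def __init__(self, dim):
        self.n = 0.0                      # total weight seen so far
        self.s = np.zeros(dim)            # weighted sum of samples
        self.ss = np.zeros((dim, dim))    # weighted sum of outer products

    def update(self, x, weight=1.0):
        """Fold one sample into the statistics; the sample can then be discarded."""
        x = np.asarray(x, dtype=float)
        self.n += weight
        self.s += weight * x
        self.ss += weight * np.outer(x, x)

    def mean(self):
        return self.s / self.n

    def cov(self):
        """Maximum-likelihood (biased) covariance recovered from the statistics."""
        m = self.mean()
        return self.ss / self.n - np.outer(m, m)

stats = GaussianStats(2)
for x in [[0.0, 0.0], [2.0, 0.0], [0.0, 2.0], [2.0, 2.0]]:
    stats.update(x)
```

Because the statistics are additive, information from earlier and later iterations can be combined by simple addition or weighted decay. Note that subtracting the outer product of the mean can lose precision when the mean is large relative to the spread.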
Studies in Applied Mathematics, 2010
Gaussians are important tools for learning from data of large dimensions. The variance of a Gaussian kernel is a measurement of the frequency range of function components or features retrieved by learning algorithms induced by the Gaussian. The learning ability and approximation power increase when the variance of the Gaussian decreases. Thus, it is natural to use Gaussians with decreasing variances for online algorithms when samples are imposed one by one. In this paper, we consider fully online classification algorithms associated with a general loss function and varying Gaussians which are closely related to regularization schemes in reproducing kernel Hilbert spaces. Learning rates are derived in terms of the smoothness of a target function associated with the probability measure controlling sampling and the loss function. A critical estimate is given for the norm of the difference of regularized target functions as the variance of the Gaussian changes. Concrete learning rates are presented for the online learning algorithm with the least square loss function.
Data Mining and Knowledge Discovery, 2018
Estimating reliable class-conditional probabilities is a prerequisite for implementing Bayesian classifiers, and estimating probability density functions (PDFs) is also a fundamental problem for other probabilistic induction algorithms. The finite mixture model (FMM) is able to represent arbitrarily complex PDFs by using a mixture of multimodal distributions, but it assumes that the component mixtures fol... Responsible editor: Fei Wang.
This work builds upon previous efforts in online incremental learning, namely the Incremental Gaussian Mixture Network (IGMN). The IGMN is capable of learning from data streams in a single-pass by improving its model after analyzing each data point and discarding it thereafter. Nevertheless, it suffers from the scalability point-of-view, due to its asymptotic time complexity of O(NKD^3) for N data points, K Gaussian components and D dimensions, rendering it inadequate for high-dimensional data. In this paper, we manage to reduce this complexity to O(NKD^2) by deriving formulas for working directly with precision matrices instead of covariance matrices. The final result is a much faster and scalable algorithm which can be applied to high dimensional tasks. This is confirmed by applying the modified algorithm to high-dimensional classification datasets.
This work builds upon previous efforts in online incremental learning, namely the Incremental Gaussian Mixture Network (IGMN). The IGMN is capable of learning from data streams in a single-pass by improving its model after analyzing each data point and discarding it thereafter. Nevertheless, it suffers from the scalability point-of-view, due to its asymptotic time complexity of O(NKD^3) for N data points, K Gaussian components and D dimensions, rendering it inadequate for high-dimensional data. In this work, we manage to reduce this complexity to O(NKD^2) by deriving formulas for working directly with precision matrices instead of covariance matrices. The final result is a much faster and scalable algorithm which can be applied to high dimensional tasks. This is confirmed by applying the modified algorithm to high-dimensional classification datasets.
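The abstract does not give the derived formulas. As an illustration of why working with precision matrices can remove the cubic cost, the Sherman-Morrison identity updates the inverse of a matrix after a rank-one change in O(D^2) operations instead of re-inverting in O(D^3). The function name `rank_one_precision_update` is hypothetical, and this is a sketch of the general identity rather than the paper's update rule.

```python
import numpy as np

def rank_one_precision_update(P, u, v):
    """Given P = inv(A), return inv(A + u v^T) using Sherman-Morrison.

    Only matrix-vector products and one outer product are needed,
    so the cost is O(D^2) for D dimensions.
    """
    Pu = P @ u                    # O(D^2)
    vP = v @ P                    # O(D^2)
    denom = 1.0 + v @ Pu          # scalar; must be non-zero
    return P - np.outer(Pu, vP) / denom

rng = np.random.default_rng(0)
D = 4
B = rng.standard_normal((D, D))
A = B @ B.T + D * np.eye(D)       # symmetric positive definite test matrix
u = rng.standard_normal(D)

# Update the precision matrix directly instead of re-inverting A + u u^T.
P_new = rank_one_precision_update(np.linalg.inv(A), u, u)
```

Covariance updates driven by a single data point are typically rank-one changes of this kind, which is what makes a per-sample O(D^2) update plausible.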
2021
We propose a novel Neyman-Pearson (NP) classification algorithm, which achieves the maximum detection rate and meanwhile keeps the false alarm rate around a user-specified threshold. The proposed method processes data in an online framework with nonlinear modeling capabilities by transforming the observations into a high dimensional space via the random Fourier features. After this transformation, we use a linear classifier whose parameters are sequentially learned. We emphasize that our algorithm is the first online Neyman-Pearson classifier in the literature, which is suitable for both linearly and nonlinearly separable datasets. In our experiments, we investigate the performance of our algorithm on well-known datasets and observe that the proposed online algorithm successfully learns the nonlinear class separations (by outperforming the linear models) while matching the desired false alarm rate.
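The transformation step mentioned above can be sketched as follows. Random Fourier features map an input to a finite-dimensional vector whose inner products approximate a Gaussian (RBF) kernel, so a linear classifier on the features behaves like a nonlinear one on the inputs. The function name `make_rff`, the kernel parameterization `exp(-gamma * ||x - y||^2)` and all numeric values are assumptions for illustration; the Neyman-Pearson constraint and the sequential classifier update are not shown.

```python
import numpy as np

def make_rff(dim, n_features, gamma, rng):
    """Build a random Fourier feature map for k(x, y) = exp(-gamma * ||x - y||^2)."""
    # Frequencies are drawn from the kernel's spectral density, N(0, 2 * gamma * I).
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(n_features, dim))
    # Random phases make the cosine features unbiased for the kernel.
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)

    def transform(x):
        return np.sqrt(2.0 / n_features) * np.cos(W @ x + b)

    return transform

rng = np.random.default_rng(0)
phi = make_rff(dim=3, n_features=20000, gamma=0.5, rng=rng)

x = np.array([0.1, -0.2, 0.3])
y = np.array([0.0, 0.4, -0.1])

approx = phi(x) @ phi(y)                       # inner product in feature space
exact = np.exp(-0.5 * np.sum((x - y) ** 2))    # exact kernel value
```

The approximation error shrinks roughly with the inverse square root of the number of features, so the feature dimension trades accuracy against the per-sample cost of the online update.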
IEEE Transactions on Neural Networks, 2001
We present probabilistic models which are suitable for class conditional density estimation and can be regarded as shared kernel models, where sharing means that each kernel may contribute to the estimation of the conditional densities of all classes. We first propose a model that constitutes an adaptation of the classical radial basis function (RBF) network (with full sharing of kernels among classes) where the outputs represent class conditional densities. At the opposite extreme is the separate mixtures model, where the density of each class is estimated using a separate mixture density (no sharing of kernels among classes). We present a general model that allows for the expression of intermediate cases where the degree of kernel sharing can be specified through an extra model parameter. This general model encompasses both of the above-mentioned models as special cases. In all proposed models the training process is treated as a maximum likelihood problem and expectation-maximization (EM) algorithms have been derived for adjusting the model parameters.
Journal of the Royal Statistical Society: Series B (Statistical Methodology)
Current tools for multivariate density estimation struggle when the density is concentrated near a nonlinear subspace or manifold. Most approaches require choice of a kernel, with the multivariate Gaussian by far the most commonly used. Although heavy-tailed and skewed extensions have been proposed, such kernels cannot capture curvature in the support of the data. This leads to poor performance unless the sample size is very large relative to the dimension of the data. This article proposes a novel generalization of the Gaussian distribution, which includes an additional curvature parameter. We refer to the proposed class as Fisher-Gaussian (FG) kernels, since they arise by sampling from a von Mises-Fisher density on the sphere and adding Gaussian noise. The FG density has an analytic form, and is amenable to straightforward implementation within Bayesian mixture models using Markov chain Monte Carlo. We provide theory on large support, and illustrate gains relative to competitors in simulated and real data applications.