2009, IEICE Technical Report (電子情報通信学会技術研究報告)
Recent research has shown that combining various image features significantly improves object classification performance. Multiple kernel learning (MKL) approaches, where the mixing weights at the kernel level are optimized simultaneously with the classifier parameters, give a well-founded framework for controlling the importance of each feature. As alternatives, we can also use boosting approaches, where single-kernel classifier outputs are combined with optimal mixing weights. Most of these approaches employ an ℓ1-regularization on the mixing weights that promotes sparse solutions. Although sparsity offers several advantages, e.g., interpretability and less computation time in the test phase, the accuracy of sparse methods is often worse than that of the simplest flat-weight combination. In this paper, we compare the accuracy of our recently developed non-sparse methods with the standard sparse counterparts on the PASCAL VOC
2009
Combining information from various image descriptors has become a standard technique for image classification tasks. Multiple kernel learning (MKL) approaches make it possible to determine the optimal combination of such similarity matrices and the optimal classifier simultaneously. Most MKL approaches employ an ℓ1-regularization on the mixing coefficients to promote sparse solutions, an assumption that is often violated in image applications, where descriptors hardly encode orthogonal pieces of information. In this paper, we compare ℓ1-MKL with a recently developed non-sparse MKL in object classification tasks. We show that the non-sparse MKL outperforms both the standard MKL and SVMs with average kernel mixtures on the PASCAL VOC data sets.
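The "average kernel mixture" baseline that the non-sparse MKL is compared against can be sketched in a few lines. The following is an illustrative numpy sketch, not the papers' actual pipeline; the RBF kernel, toy feature dimensions, and bandwidth are assumptions:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    # Pairwise squared Euclidean distances, then the Gaussian kernel.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def uniform_kernel_sum(feature_views, gamma=1.0):
    # The "flat weights" baseline: one kernel per feature type (e.g. one
    # per image descriptor), averaged with equal mixing weights.
    Ks = [rbf_kernel(X, X, gamma) for X in feature_views]
    return sum(Ks) / len(Ks)

rng = np.random.default_rng(0)
# Two hypothetical feature views of the same 6 images, e.g. color and shape.
views = [rng.normal(size=(6, 3)), rng.normal(size=(6, 5))]
K = uniform_kernel_sum(views)
```

The averaged Gram matrix K would then be passed to a kernel classifier (e.g. an SVM with a precomputed kernel); MKL replaces the equal weights with learned ones.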
PLoS ONE, 2012
Combining information from various image features has become a standard technique in concept recognition tasks. However, the optimal way of fusing the resulting kernel functions is usually unknown in practical applications. Multiple kernel learning (MKL) techniques make it possible to determine an optimal linear combination of such similarity matrices. Classical approaches to MKL promote sparse mixtures. Unfortunately, so-called ℓ1-norm MKL variants are often observed to be outperformed by an unweighted sum kernel. The contribution of this paper is twofold: we apply a recently developed non-sparse MKL variant to state-of-the-art concept recognition tasks within computer vision, and we provide insights on the benefits and limits of non-sparse MKL, comparing it against its direct competitors, the sum-kernel SVM and sparse MKL. We report empirical results for the PASCAL VOC 2009 Classification and ImageCLEF2010 Photo Annotation challenge data sets. About to be submitted to PLoS ONE.
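As a hedged illustration of the non-sparse variant discussed here, the closed-form kernel-weight update commonly used in ℓp-norm MKL (in the style of Kloft et al.) can be sketched as follows. The input norms and the choice of p below are made-up example values, not results from the papers:

```python
import numpy as np

def lp_mkl_weights(norms, p=2.0):
    """Closed-form kernel weight update for non-sparse lp-norm MKL
    (following the update rule popularized by Kloft et al.):
    beta_m is proportional to ||w_m||^(2/(p+1)), rescaled so that
    ||beta||_p = 1. `norms` holds the per-kernel block norms ||w_m||."""
    t = np.asarray(norms, dtype=float)
    beta = t ** (2.0 / (p + 1.0))
    return beta / (np.sum(t ** (2.0 * p / (p + 1.0))) ** (1.0 / p))

# Hypothetical block norms for three kernels; larger norm -> larger weight.
beta = lp_mkl_weights([0.5, 1.0, 2.0], p=2.0)
```

For p > 1 all weights stay strictly positive (non-sparse), while letting p approach 1 recovers the sparsity-promoting behavior of ℓ1-MKL.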
Lecture Notes in Computer Science, 2010
In order to achieve good performance in image annotation tasks, it is necessary to combine information from various image features. In recent competitions on photo annotation, many groups employed bag-of-words (BoW) representations based on SIFT descriptors over various color channels. In fact, it has been observed that adding other, less informative features to the standard BoW degrades recognition performance. In this contribution, we show that even primitive color histograms can enhance the standard classifiers in the ImageCLEF 2009 photo annotation task if the feature weights are tuned optimally by the non-sparse multiple kernel learning (MKL) proposed by Kloft et al. Additionally, we propose a sorting scheme for image subregions to deal with spatial variability within each visual concept.
Procedings of the British Machine Vision Conference 2011, 2011
Augmented Kernel Matrix (AKM) has recently been proposed to accommodate the fact that a single training example may have different importance in different feature spaces, in contrast to Multiple Kernel Learning (MKL), which assigns the same weight to all examples in one feature space. However, the AKM approach is limited to small datasets due to its memory requirements. An alternative way to fuse information from different feature channels is classifier fusion (ensemble methods). There is a significant amount of work on linear programming formulations of classifier fusion (CF) in the case of binary classification. In this paper we derive the primal and dual of AKM to draw its correspondence with CF. We propose a multiclass extension of binary ν-LPBoost, which learns the contribution of each class in each feature channel. Existing approaches to CF promote sparse feature combinations, due to regularization based on the ℓ1-norm, and lead to the selection of a subset of feature channels, which is undesirable when all channels are informative. We also generalize existing CF formulations to arbitrary ℓp-norms for binary and multiclass problems, which results in more effective use of complementary information. We carry out an extensive comparison and show that the proposed nonlinear CF schemes outperform their sparse counterparts as well as state-of-the-art MKL approaches.
2015 International Joint Conference on Neural Networks (IJCNN), 2015
Classification of large amounts of images calls for diverse types of features, but employing all possible feature types creates unnecessary computational burden and may result in reduced classification accuracy. Selecting feature vectors individually is not feasible in this scenario due to the large number of feature vectors needed for reasonable performance. Instead, this paper proposes a measure that effectively evaluates the relative significance of a feature group, employing minimum redundancy maximum relevance (mRMR) feature selection. Multiple kernel learning (MKL) is used for combining different feature types in classification, which implicitly also serves as an alternative way of weighing the feature groups' importance. Results show the proposed group feature selection better reflects a feature type's importance and improves upon MKL performance. This study also finds that convolutional neural network (CNN) features have the best discriminative power among all features, but it is still possible to improve classification accuracy with other well-designed features.
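A group-level mRMR score of the kind described above can be sketched with plain mutual information over discretized features. This is a minimal sketch under our own assumptions; the function names, the group-averaging scheme, and the toy data are hypothetical, not the paper's exact formulation:

```python
import numpy as np

def mutual_info(x, y):
    # Discrete mutual information (in nats) from empirical joint frequencies.
    mi = 0.0
    for a in np.unique(x):
        for b in np.unique(y):
            p_ab = np.mean((x == a) & (y == b))
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (np.mean(x == a) * np.mean(y == b)))
    return mi

def mrmr_group_score(group, labels, selected):
    """Hypothetical group-level mRMR score: mean relevance of the group's
    features to the labels, minus mean redundancy against already-selected
    features. `group` and `selected` are lists of discrete feature columns."""
    relevance = np.mean([mutual_info(f, labels) for f in group])
    if not selected:
        return relevance
    redundancy = np.mean([mutual_info(f, g) for f in group for g in selected])
    return relevance - redundancy

labels = np.array([0, 0, 1, 1])
informative = np.array([0, 0, 1, 1])  # mirrors the labels exactly
noise = np.array([0, 1, 0, 1])        # statistically independent of the labels
```

Under this score, a feature group that tracks the labels should rank above one that is independent of them, which is the ranking behavior the group selection relies on.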
Neurocomputing, 2018
In this paper, we propose a new classifier named kernel group sparse representation via structural and non-convex constraints (KGSRSN) for image recognition. The new approach integrates both group sparsity and structural locality in the kernel feature space and then applies a non-convex penalty function to the representation coefficients. On the one hand, by mapping the training samples into the kernel space, the so-called norm normalization problem is naturally alleviated. On the other hand, an interval for the parameter of the penalty function is provided to promote more sparsity without sacrificing the uniqueness of the solution and the robustness of convex optimization. Our method is computationally efficient due to the utilization of the Alternating Direction Method of Multipliers (ADMM) and Majorization-Minimization (MM). Experimental results on three real-world benchmark datasets, i.e., the AR face database, the PIE face database and the MNIST handwritten digits database, demonstrate that KGSRSN achieves more discriminative sparse coefficients and outperforms many state-of-the-art approaches for classification with respect to both recognition rates and running time.
The sparse representation-based classification algorithm has been used for human face recognition, but the image databases used were restricted to frontal faces with only slight illumination and expression changes, and cropping and normalization of the faces had to be done beforehand. This paper uses a sparse representation-based algorithm for generic image classification with intra-class variations and background clutter. A hierarchical framework based on sparse representation is developed which flexibly combines different global and local features. Experiments with the hierarchical framework on 25 object categories selected from the Caltech101 dataset show that exploiting the advantage of local features within the hierarchical framework improves classification performance and that the framework is robust to image occlusions, background clutter, and viewpoint changes.
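The sparse representation-based classification idea (represent a query with each class's training samples and pick the class with the smallest reconstruction residual) can be sketched as follows. Note the hedge: real SRC uses ℓ1-regularized coding over a single concatenated dictionary; this sketch substitutes per-class least squares to keep the example self-contained, and all dictionaries and the query are toy data:

```python
import numpy as np

def src_classify(D_per_class, y):
    """Simplified sparse-representation-style classifier: code the query y
    against each class's dictionary (least squares here, standing in for
    the l1-regularized coding of real SRC) and return the index of the
    class with the smallest reconstruction residual."""
    residuals = []
    for D in D_per_class:  # D: columns are training samples of one class
        coef, *_ = np.linalg.lstsq(D, y, rcond=None)
        residuals.append(np.linalg.norm(y - D @ coef))
    return int(np.argmin(residuals))

# Toy dictionaries: class 0 spans the first two axes, class 1 the last two.
D0 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]])
D1 = np.array([[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
query = np.array([0.1, 0.0, 1.0, 0.2])  # mostly in class 1's span
```

The query above lies almost entirely in class 1's subspace, so its residual against D1 is small while its residual against D0 is large.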
Procedings of the British Machine Vision Conference 2011, 2011
Sparse representation was originally used in signal processing as a powerful tool for acquiring, representing and compressing high-dimensional signals. Recently, motivated by the great successes it has achieved, it has become a hot research topic in the domain of computer vision and pattern recognition. In this paper, we propose to adapt sparse representation to the problem of Visual Object Categorization, which aims at predicting whether one or several objects of some given categories are present in an image. To this end, we have elaborated a reconstructive and discriminative sparse representation of images, which incorporates a discriminative term, such as a Fisher discriminative measure or the output of an SVM classifier, into the standard sparse representation objective function in order to learn a reconstructive and discriminative dictionary. Experiments carried out on the SIMPLIcity image dataset show that our reconstructive and discriminative approach yields a clear improvement in classification accuracy compared to a standard SVM using image features as input. Moreover, the results show that our approach is more effective than a purely reconstructive sparse representation, which indicates that adding a discriminative term when constructing the sparse representation is better suited to the categorization purpose.
SPIE Proceedings, 2013
Sparse coding techniques are commonly applied for feature representation. To learn discriminative features for visual recognition, a dictionary learning method called Paired Discriminative K-SVD (PD-KSVD) is presented in this paper. First, to reduce the reconstruction error of the positive class while increasing the errors of the negative classes, an inverted-signal scheme is applied to the negative training samples. Then, class-specific sub-dictionaries are learned from pairs of positive and negative classes to jointly achieve high discrimination and low reconstruction errors for sparse coding. Multiple sub-dictionaries are concatenated with respect to the same negative class so that the non-zero sparse coefficients can be discriminatively distributed to improve classification accuracy. Finally, sparse coefficients are solved via the concatenated sub-dictionaries and used to train the classifier. Compared to existing dictionary learning methods, PD-KSVD achieves superior performance in a variety of visual recognition tasks on several publicly available datasets.
2000
In order to achieve good performance in image annotation tasks, it is necessary to combine information from various image features. In our submission, we applied the non-sparse multiple kernel learning for feature combination proposed by Kloft et al. (2009) to the ImageCLEF2009 photo annotation data. Since some of the concepts of the ImageCLEF task are rather abstract, we
2010 IEEE International Conference on Image Processing, 2010
Recently, increasing interest has been devoted to improving image categorization performance by combining multiple descriptors. However, very few approaches have been proposed for combining features based on complementary aspects and evaluating the performance on realistic databases. In this paper, we tackle the problem of combining different feature types (edge and color) and evaluate the performance gain on the very challenging VOC 2009 benchmark. Our contribution is threefold. First, we propose new local color descriptors, unifying edge and color feature extraction in the "Bag of Words" model. Second, we improve the Spatial Pyramid Matching (SPM) scheme to better incorporate spatial information into the similarity measurement. Last but not least, we propose a new combination strategy based on ℓ1 Multiple Kernel Learning (MKL) that simultaneously learns individual kernel parameters and the kernel combination. Experiments demonstrate the relevance of the proposed approach, which outperforms baseline combination methods while being computationally efficient.
IEEE Transactions on Image Processing, 2014
In complex visual recognition tasks it is typical to adopt multiple descriptors that describe different aspects of the images, in order to obtain improved recognition performance. Descriptors of diverse forms can be fused into a unified feature space in a principled manner using kernel methods. Sparse models that generalize well to the test data can be learned in the unified kernel space, and appropriate constraints can be incorporated for application in supervised and unsupervised learning. In this paper, we propose to perform sparse coding and dictionary learning in the multiple kernel space, where the weights of the ensemble kernel are tuned based on graph-embedding principles such that class discrimination is maximized. In our proposed algorithm, dictionaries are inferred using multiple levels of 1-D subspace clustering in the kernel space, and the sparse codes are obtained using a simple levelwise pursuit scheme. Empirical results for object recognition and image clustering show that our algorithm outperforms existing sparse coding based approaches and compares favorably to other state-of-the-art methods.
IEEE transactions on pattern analysis and machine intelligence, 2014
Multiple kernel learning (MKL) is a principled approach for selecting and combining kernels for a given recognition task. A number of studies have shown that MKL is a useful tool for object recognition, where each image is represented by multiple sets of features and MKL is applied to combine different feature sets. We review the state-of-the-art for MKL, including different formulations and algorithms for solving the related optimization problems, with the focus on their applications to object recognition. One dilemma faced by practitioners interested in using MKL for object recognition is that different studies often provide conflicting results about the effectiveness and efficiency of MKL. To resolve this, we conduct extensive experiments on standard datasets to evaluate various approaches to MKL for object recognition. We argue that the seemingly contradictory conclusions offered by studies are due to different experimental setups. The conclusions of our study are: (i) given a s...
2012 IEEE 12th International Conference on Data Mining, 2012
Sparse representation involves two related procedures: sparse coding and dictionary learning. Learning a dictionary from data provides a concise knowledge representation, and learning a dictionary in a higher-dimensional feature space may allow a better representation of a signal. However, with existing algorithms it is usually computationally expensive to learn a dictionary when the number of training samples and/or dimensions is very large. In this paper, we propose a kernel dictionary learning framework for three models. We reveal that the optimization has dimension-free and parallel properties, and we devise fast active-set algorithms for this framework. We investigate their performance on classification. Experimental results show that our kernel sparse representation approaches obtain better accuracy than their linear counterparts. Furthermore, our active-set algorithms are faster than the existing interior-point and proximal algorithms.
Lecture Notes in Computer Science, 2010
The Support Kernel Machine (SKM) and the Relevance Kernel Machine (RKM) are two principles for selectively combining object-representation modalities of different kinds by incorporating supervised selectivity into the classical kernel-based SVM. The former principle consists in rigidly selecting a subset of presumably informative support kernels and excluding the others, whereas the latter assigns positive weights to all of them. The RKM algorithm was fully elaborated in previous publications; however, the previous algorithm implementing the SKM principle of selectivity supervision is applicable only to real-valued features. The present paper fills this gap by harnessing the framework of subdifferential calculus to computationally solve the problem of constrained non-differentiable convex optimization that occurs in the SKM training criterion, making it applicable to arbitrary kernel-based modalities of object representation.
This paper presents novel algorithms and applications for a particular class of mixed-norm regularization based Multiple Kernel Learning (MKL) formulations. The formulations assume that the given kernels are grouped and employ ℓ1-norm regularization for promoting sparsity within the RKHS norms of each group and ℓs-norm regularization, s ≥ 2, for promoting non-sparse combinations across groups. Various sparsity levels in combining the kernels can be achieved by varying the grouping of kernels; hence we name the formulations Variable Sparsity Kernel Learning (VSKL) formulations. While previous attempts have a non-convex formulation, here we present a convex formulation which admits efficient Mirror-Descent (MD) based solving techniques. The proposed MD based algorithm optimizes over a product of simplices and has a computational complexity of O(m^2 n_tot log(n_max) / ε^2), where m is the number of training data points, n_max and n_tot are the maximum number of kernels in any group and the total number of kernels, respectively, and ε is the error in approximating the objective. A detailed proof of convergence of the algorithm is also presented. Experimental results show that the VSKL formulations are well suited for multi-modal learning tasks like object categorization. Results also show that the MD based algorithm outperforms state-of-the-art MKL solvers in terms of computational efficiency.
Lecture Notes in Computer Science, 2011
In this paper, we propose an efficient sparse feature online learning approach for image classification. A large-margin formulation solved by linear programming is adopted to learn sparse features on the max-similarity based image representation. The margins between the training images and the query images can be directly utilized for classification by a Naive Bayes or K-Nearest-Neighbor category classifier. Balancing efficiency against classification accuracy is the most attractive characteristic of our approach: efficiency lies in its online sparsity learning algorithm and the direct use of margins, while accuracy depends on the discriminative power of the selected sparse features and their weights. We test our approach using far fewer features on the Caltech-101 and Scene-15 datasets, and our classification results are comparable to the state of the art.
Bag-of-words-based image classification approaches mostly rely on low-level local shape features. However, it has been shown that combining multiple cues such as color, texture, or shape is a challenging and promising task which can improve classification accuracy. Most of the state-of-the-art feature fusion methods usually aim to weight the cues without considering their statistical dependence in the application at hand. In this paper, we present a new logistic regression-based fusion method, called LRFF, which takes advantage of the different cues without being tied to any of them. We also design a new marginalized kernel by making use of the output of the regression model. We show that such kernels, surprisingly ignored so far by the computer vision community, are particularly well suited to image classification tasks. We compare our approach with existing methods that combine color and shape on three datasets. The proposed learning-based feature fusion process clearly outperforms the state-of-the-art fusion methods for image classification.
EURASIP Journal on Image and Video Processing, 2017
Real-world image classification, which aims to determine the semantic class of unlabeled images, is a challenging task. In this paper, we focus on two challenges of image classification and propose a method to address both of them simultaneously. The first challenge is that representing images by heterogeneous features, such as color, shape and texture, helps to provide better classification accuracy. The second challenge comes from dissimilarities in the visual appearance of images from the same class (intra-class variance) and similarities between images from different classes (inter-class relationships). In addition to these two challenges, we should note that the feature space of real-world images is highly complex, so they cannot be linearly classified; the kernel trick is effective for classifying them. This paper proposes a feature fusion based multiple kernel learning (MKL) model for image classification. By using multiple kernels extracted from multiple features, we address the first challenge. To provide a solution for the second challenge, we use the idea of localized MKL, assigning separate local weights to each kernel. We employ the spatial pyramid match (SPM) representation of images and compute kernel weights based on the χ² kernel. Experimental results demonstrate that our proposed model achieves promising results.
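The χ² kernel mentioned above is a standard choice for histogram features such as SPM bag-of-words vectors. A minimal numpy sketch of its exponential form, k(x, y) = exp(-γ Σᵢ (xᵢ - yᵢ)² / (xᵢ + yᵢ)), follows; the toy histograms and γ = 1 are example values, not the paper's settings:

```python
import numpy as np

def chi2_kernel(X, Y, gamma=1.0):
    """Exponential chi-squared kernel between rows of X and Y (histograms).
    Empty bins (0/0) contribute zero to the distance, by convention."""
    Xb = X[:, None, :]
    Yb = Y[None, :, :]
    num = (Xb - Yb) ** 2
    den = Xb + Yb
    # Guard the division so bins where both histograms are zero yield 0.
    d = np.where(den > 0, num / np.where(den > 0, den, 1.0), 0.0).sum(-1)
    return np.exp(-gamma * d)

# Two toy normalized histograms sharing one bin.
hists = np.array([[0.5, 0.5, 0.0],
                  [0.0, 0.5, 0.5]])
K = chi2_kernel(hists, hists)
```

The resulting Gram matrix has ones on the diagonal (identical histograms) and off-diagonal entries that shrink as histograms diverge, which is the similarity structure the MKL weighting operates on.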