2002, Systems and Computers in Japan
A new pattern classification method called the Kernel-based Nonlinear Subspace (KNS) method is proposed. It implements a subspace method in a high-dimensional nonlinear space obtained through a nonlinear transformation defined by kernel functions. The Support Vector Machine, currently a popular research topic, is a nonlinear classification method that employs kernel functions and offers high classification performance. However, as the number of patterns and the number of classes increase, it suffers an explosive increase in the computational complexity required for learning. Conventional subspace methods, on the other hand, are fast and effective classifiers for multiclass problems, but they do not achieve satisfactory classification performance when the pattern distribution is nonlinear or when the dimensionality of the feature space is small compared to the number of classes. The proposed method combines the advantages of both techniques, compensating for each other's deficiencies, to realize nonlinear classification of multiple classes with high classification performance and low computational complexity. In this paper, we show that nonlinear transforms defined by kernel functions can be used to formulate the nonlinear subspace method; evaluate the proposed method in terms of classification performance on nonlinear and multiclass distributions, stability of the classification performance with respect to parameter variations, and the computational costs of learning and classification; and verify the superiority of the proposed method over conventional methods.
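Since the abstract only outlines the idea, a minimal numerical illustration may help. The snippet below is an assumed reconstruction of the general kernel-subspace recipe (a kernel-PCA subspace fitted per class, classification by projection energy), not the paper's exact algorithm; the names rbf_kernel and KernelSubspaceClassifier and all parameter values are illustrative.

```python
# Minimal sketch of a kernel (nonlinear) subspace classifier: one kernel-PCA
# subspace per class, classify by the projection energy of the mapped test
# point onto each class subspace.
import numpy as np

def rbf_kernel(A, B, gamma=0.5):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * d2)

class KernelSubspaceClassifier:
    def __init__(self, n_components=5, gamma=0.5):
        self.r, self.gamma = n_components, gamma

    def fit(self, X, y):
        self.models = []
        for c in np.unique(y):
            Xc = X[y == c]
            n = len(Xc)
            K = rbf_kernel(Xc, Xc, self.gamma)
            H = np.eye(n) - np.ones((n, n)) / n
            w, V = np.linalg.eigh(H @ K @ H)           # kernel PCA on this class
            idx = np.argsort(w)[::-1][: self.r]
            A = V[:, idx] / np.sqrt(np.maximum(w[idx], 1e-12))
            self.models.append((c, Xc, K, A))
        return self

    def predict(self, X):
        scores = []
        for c, Xc, K, A in self.models:
            Kx = rbf_kernel(Xc, X, self.gamma)          # shape (n_class, n_test)
            # centre the test kernel columns in the class feature space
            Kx_c = Kx - Kx.mean(0, keepdims=True) - K.mean(1, keepdims=True) + K.mean()
            P = A.T @ Kx_c                              # coordinates in the class subspace
            scores.append((P**2).sum(0))                # projection energy per test point
        labels = np.array([m[0] for m in self.models])
        return labels[np.argmax(np.vstack(scores), axis=0)]
```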
IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), 2000
The common vector (CV) method is a linear subspace classifier method which allows one to discriminate between classes of data sets, such as those arising in image and word recognition. This method utilizes subspaces that represent classes during classification. Each subspace is modeled such that common features of all samples in the corresponding class are extracted. To accomplish this goal, the method eliminates features that are in the direction of the eigenvectors corresponding to the nonzero eigenvalues of the covariance matrix of each class. In this paper, we introduce a variation of the CV method, which will be referred to as the modified CV (MCV) method. Then, a novel approach is proposed to apply the MCV method in a nonlinearly mapped higher-dimensional feature space. In this approach, all samples are mapped into a higher-dimensional feature space using a kernel mapping function, and then the MCV method is applied in the mapped space. Under certain conditions, each class gives rise to a unique CV, and the method guarantees a 100% recognition rate on the training set data. Moreover, experiments with several test cases also show that the generalization performance of the proposed kernel method is comparable to the generalization performances of other linear subspace classifier methods as well as the kernel-based nonlinear subspace method. While neither the MCV method nor its kernel counterpart outperformed the support vector machine (SVM) classifier in most of the reported experiments, the application of our proposed methods is simpler than that of the multiclass SVM classifier. In addition, it is not necessary to adjust any parameters in our approach. Index Terms—Common vector (CV), kernel-based subspace method, pattern recognition, subspace classifier. I. INTRODUCTION: The linear subspace classifiers are pattern recognition methods which use a linear subspace for each class [1]. The motivation behind the subspace classifiers is the optimal reconstruction of multidimensional data with linear principal components that carry the most significant representative features. Therefore, the most conspicuous features are extracted from each class by using the corresponding training samples, in the hope that those features also carry the most important discriminatory information. Although this assumption is seldom …
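The following is a hedged sketch of the linear common vector idea described in this abstract: for each class, samples are projected onto the orthogonal complement of the class difference (scatter) subspace, so that all training samples of a class collapse to (nearly) one "common vector", and a test sample is assigned to the class whose common vector is closest after the same projection. It is illustrative only, assumes the small-sample-size case where the class scatter is rank deficient, and all function names are hypothetical.

```python
import numpy as np

def class_common_model(Xc, tol=1e-10):
    """Return (basis U of the class scatter range, common vector) for class samples Xc."""
    mu = Xc.mean(axis=0)
    D = Xc - mu                                   # difference vectors
    U, s, _ = np.linalg.svd(D.T, full_matrices=False)
    U = U[:, s > tol]                             # directions with nonzero eigenvalues
    x0 = Xc[0]
    common = x0 - U @ (U.T @ x0)                  # remove the "indifference" directions
    return U, common

def predict_cv(X_test, models, labels):
    """Nearest-common-vector classification after the class-specific projection."""
    preds = []
    for x in X_test:
        d = [np.linalg.norm((x - U @ (U.T @ x)) - cv) for U, cv in models]
        preds.append(labels[int(np.argmin(d))])
    return np.array(preds)

# usage sketch:
# models = [class_common_model(X[y == c]) for c in np.unique(y)]
# y_hat  = predict_cv(X_test, models, np.unique(y))
```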
Lecture Notes in Computer Science
Subspace classifiers are well known in pattern recognition; they represent pattern classes by linear subspaces spanned by class-specific basis vectors obtained through simple mathematical operations such as the SVD. Recently, kernel-based subspace methods have been proposed to extend this functionality by directly applying Kernel Principal Component Analysis (KPCA). The projection variance in kernel space used as the criterion in these earlier kernel subspace methods, however, is not a trustworthy criterion for class discrimination, and such methods simply fail in many recognition problems, as we encountered in biometrics research. We address this issue by proposing a learning kernel subspace classifier that attempts to reconstruct data in the input space through the kernel subspace projection. Whereas pre-image methods aim at finding an approximate pre-image for each input by minimizing the reconstruction error in kernel space, we emphasize the problem of how to estimate a kernel subspace as a model for a specific class. Using occluded face recognition as an example, our experimental results demonstrate the efficiency of the proposed method.
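For context, the two criteria contrasted here can be written down explicitly using generic kernel-PCA notation (not the paper's own symbols). For a mapped sample x and a class subspace spanned by r kernel principal components with dual coefficient vectors alpha^(v), the squared kernel-space residual decomposes as

$$
\bigl\|\phi(x) - P\,\phi(x)\bigr\|^{2} \;=\; \tilde{k}(x,x)\;-\;\sum_{v=1}^{r}\Bigl(\sum_{i=1}^{n}\alpha_{i}^{(v)}\,\tilde{k}(x_{i},x)\Bigr)^{2},
$$

where \(\tilde{k}\) denotes the kernel after centering in feature space. The second term alone is the projection variance scored by the earlier kernel subspace methods, while the full residual is the reconstruction-oriented quantity that pre-image methods and the reconstruction view discussed in this abstract start from.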
Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining - KDD 08, 2008
Kernel methods have been applied successfully in many data mining tasks. Subspace kernel learning was recently proposed to discover an effective low-dimensional subspace of a kernel feature space for improved classification. In this paper, we propose to construct a subspace kernel using the Hilbert-Schmidt Independence Criterion (HSIC). We show that the optimal subspace kernel can be obtained efficiently by solving an eigenvalue problem. One limitation of the existing subspace kernel learning formulations is that the kernel learning and classification are independent and the subspace kernel may not be optimally adapted for classification. To overcome this limitation, we propose a joint optimization framework, in which we learn the subspace kernel and subsequent classifiers simultaneously. In addition, we propose a novel learning formulation that extracts an uncorrelated subspace kernel to reduce the redundant information in a subspace kernel. Following the idea from multiple kernel learning, we extend the proposed formulations to the case when multiple kernels are available and need to be combined. We show that the integration of subspace kernels can be formulated as a semidefinite program (SDP) which is computationally expensive. To improve the efficiency of the SDP formulation, we propose an equivalent semi-infinite linear program (SILP) formulation which can be solved efficiently by the column generation technique. Experimental results on a collection of benchmark data sets demonstrate the effectiveness of the proposed algorithms.
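A hedged sketch of the empirical Hilbert-Schmidt Independence Criterion (HSIC) mentioned above follows; it measures the dependence between a data kernel K and a label kernel L. The paper's subspace-kernel construction maximizes such a dependence, but that optimization is not reproduced here; only the standard biased HSIC estimator is shown.

```python
import numpy as np

def empirical_hsic(K, L):
    """(Biased) empirical HSIC between two n x n kernel matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n           # centering matrix
    return np.trace(K @ H @ L @ H) / (n - 1) ** 2

# usage sketch with a linear label kernel L = y y^T for labels y in {-1, +1}:
# L = np.outer(y, y)
# score = empirical_hsic(K, L)
```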
IEEE Transactions on Neural Networks, 2001
The eigenstructure of the second-order statistics of a multivariate random population can be inferred from the matrix of pairwise combinations of inner products of the samples. Therefore, it can also be efficiently obtained in the implicit, high-dimensional feature spaces defined by kernel functions. We elaborate on this property to obtain general expressions for immediate derivation of nonlinear counterparts of a number of standard pattern analysis algorithms, including principal component analysis, data compression and denoising, and Fisher's discriminant. The connection between kernel methods and nonparametric density estimation is also illustrated. Using these results we introduce the kernel version of the Mahalanobis distance, which gives rise to nonparametric models with unexpected and interesting properties, and also propose a kernel version of the minimum squared error (MSE) linear discriminant function. This learning machine is particularly simple and includes a number of generalized linear models such as the potential functions method or the radial basis function (RBF) network. Our results shed some light on the relative merit of feature spaces and inductive bias in the remarkable generalization properties of the support vector machine (SVM). Although in most situations the SVM obtains the lowest error rates, exhaustive experiments with synthetic and natural data show that simple kernel machines based on pseudoinversion are competitive in problems with appreciable class overlapping.
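A minimal sketch of a kernel minimum-squared-error (MSE) discriminant of the kind described above follows: the dual weights are obtained by pseudo-inverting the training kernel matrix, so training reduces to a single linear-algebra step. The kernel choice, the small ridge term, and the function names are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def fit_kernel_mse(K, y, ridge=1e-6):
    """Dual coefficients alpha solving K alpha ~= y by (regularized) pseudoinversion."""
    n = K.shape[0]
    return np.linalg.pinv(K + ridge * np.eye(n)) @ y

def decision_kernel_mse(K_test_train, alpha):
    """Decision values for test points; K_test_train[i, j] = k(x_test_i, x_train_j)."""
    return K_test_train @ alpha

# usage sketch with labels y in {-1, +1}:
# alpha = fit_kernel_mse(K_train, y)
# y_hat = np.sign(decision_kernel_mse(K_test_train, alpha))
```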
International Conference on Acoustics, Speech, and Signal Processing, 2003
Support vector machines (SVMs) are among the best-known nonlinear classifiers based on the Mercer kernel trick. They generally lead to very sparse solutions that ensure good generalization performance. Recently, S. Mika et al. (see Advances in Neural Networks for Signal Processing, p. 41-8, 1999) proposed a new nonlinear technique based on the kernel trick and the Fisher criterion: the nonlinear kernel Fisher discriminant.
Pattern Recognition, 2006
Nonlinear discriminant analysis may be transformed into the form of kernel-based discriminant analysis. Thus, the corresponding discriminant direction can be obtained by solving linear equations. From the viewpoint of the feature space, the nonlinear discriminant analysis is still a linear method, and it is provable that in feature space the method is equivalent to Fisher discriminant analysis. We show that a linear combination of a subset of the training samples, called "significant nodes", can, to some extent, replace the full training set in expressing the corresponding discriminant vector in feature space. In this paper, an efficient algorithm is proposed to determine the "significant nodes" one by one. The principle of determining "significant nodes" is simple and reasonable, and the resulting algorithm can be carried out with acceptable computational cost. Classification is then carried out using only the kernel functions between test samples and the "significant nodes"; a decision-rule sketch is given below. The proposed method is called the fast kernel-based nonlinear method (FKNM). It is notable that the number of "significant nodes" may be much smaller than the number of training samples. As a result, for two-class classification problems, the FKNM is much more efficient than the naive kernel-based nonlinear method (NKNM). The FKNM can also be applied to multi-class problems via two approaches: one-against-the-rest and one-against-one. Although there is a view that one-against-one is superior to one-against-the-rest in classification efficiency, it seems that for the FKNM one-against-the-rest is more efficient than one-against-one. Experiments on benchmark and real datasets illustrate that, for two-class and multi-class classification, the FKNM is effective, feasible, and highly efficient.
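A heavily hedged sketch of the decision stage just described: once a set of "significant nodes" s_1..s_m and their combination weights beta have been selected (the paper's node-selection algorithm itself is not reproduced here), a test sample is classified from only m kernel evaluations. All names below are hypothetical.

```python
import numpy as np

def fknm_decision(x, nodes, beta, bias, kernel):
    """Discriminant value computed from the selected significant nodes only."""
    return sum(b * kernel(x, s) for b, s in zip(beta, nodes)) + bias

# usage sketch with an RBF kernel:
# rbf = lambda a, b, g=0.5: np.exp(-g * np.sum((a - b) ** 2))
# label = 1 if fknm_decision(x, nodes, beta, bias, rbf) >= 0 else -1
```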
Springer Proceedings in Mathematics & Statistics, 2016
The support vector machine for linear and nonlinear classification of data is studied. The notion of a generalized support vector machine for data classification is used. The problem of the generalized support vector machine is shown to be equivalent to a generalized variational inequality problem, and various results on the existence of solutions are established. Moreover, examples supporting the results are provided. Keywords: Linear and nonlinear classification • Support vector machine • Generalized support vector machine • Kernel function. 1 Support Vector Machine. Support vector machines (SVMs) [2, 3, 13, 14, 18] were developed by Vapnik et al. (1995) and have gained popularity due to their many attractive features. As a very powerful tool for data classification and regression, the SVM has been used in many fields, such as text classification [5], facial expression recognition [9], gene analysis [4], and many others [1, 6-8, 10-12, 17, 19-22]. Recently, it has been used for fault classification in a water level control system [15], and an SVM-based fault classifier has been used to diagnose faults in a water level control process [16].
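For orientation, the standard soft-margin (kernel) support vector machine that the generalized formulation builds on can be written in its textbook form (this is not the paper's generalized variational-inequality formulation):

$$
\min_{w,\,b,\,\xi}\;\; \frac{1}{2}\|w\|^{2} + C\sum_{i=1}^{n}\xi_{i}
\quad\text{subject to}\quad
y_{i}\bigl(w^{\top}\phi(x_{i}) + b\bigr) \ge 1 - \xi_{i},\;\; \xi_{i}\ge 0,
$$

with the resulting kernel decision function \(f(x)=\operatorname{sgn}\bigl(\sum_{i=1}^{n}\alpha_{i}y_{i}k(x_{i},x)+b\bigr)\), where \(k(x_{i},x)=\phi(x_{i})^{\top}\phi(x)\) is the kernel function and the \(\alpha_{i}\) are the dual variables.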
2004
This paper investigates the effect of Kernel Principal Component Analysis (KPCA) within the classification framework, essentially the regularization properties of this dimensionality reduction method. KPCA has previously been used as a pre-processing step before applying an SVM, but we point out that this combination is somewhat redundant from a regularization point of view. We therefore propose a new algorithm, called the Kernel Projection Machine, that avoids this redundancy, based on an analogy with the statistical framework of regression for a Gaussian white noise model. Preliminary experimental results show that this algorithm reaches the same performance as an SVM.
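A hedged sketch of the KPCA-then-linear-classifier pipeline discussed above, in which the number of retained kernel principal components plays the role of the regularization parameter; this mirrors the idea of the Kernel Projection Machine but is not the authors' exact estimator, and the scikit-learn components and parameter grid are illustrative choices.

```python
from sklearn.pipeline import Pipeline
from sklearn.decomposition import KernelPCA
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import GridSearchCV

pipe = Pipeline([
    ("kpca", KernelPCA(kernel="rbf", gamma=0.5)),
    ("clf", RidgeClassifier(alpha=1e-6)),          # nearly unregularized linear step
])
# model selection over the dimension of the kernel subspace
search = GridSearchCV(pipe, {"kpca__n_components": [2, 5, 10, 20, 50]}, cv=5)
# search.fit(X_train, y_train); search.score(X_test, y_test)
```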
Proceedings of 2004 International Conference on Machine Learning and Cybernetics (IEEE Cat. No.04EX826), 2004
Support vector machines (SVMs) are powerful tools for solving classification and function approximation problems. In this paper, a comparison among four SVM classification methods is conducted. The four methods are the Lagrangian Support Vector Machine (LSVM), the Finite Newton Lagrangian Support Vector Machine (NLSVM), the Smooth Support Vector Machine (SSVM), and the Finite Newton Support Vector Machine (NSVM). Their algorithms for generating a linear or nonlinear kernel classifier, their accuracy, and their computational complexity are also compared. The study provides some guidelines for choosing an appropriate one of the four SVM classification methods for a given classification problem.
… International Workshop …, 2003
A new method for classification is proposed. This is based on kernel orthonormalized partial least squares (PLS) dimensionality reduction of the original data space followed by a support vector classifier. Unlike principal component analysis (PCA), which has previously served as a dimension reduction step for discrimination problems, orthonormalized PLS is closely related to Fisher's approach to linear discrimination or equivalently to canonical correlation analysis. For this reason orthonormalized PLS is preferable to PCA for discrimination. Good behavior of the proposed method is demonstrated on 13 different benchmark data sets and on the real world problem of classifying finger movement periods from non-movement periods based on electroencephalograms.
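A hedged analogue of the pipeline described above (label-driven kernel dimensionality reduction followed by a support vector classifier) is sketched below. scikit-learn has no kernel orthonormalized PLS, so this sketch approximates the kernel map with a Nystroem feature map and uses ordinary PLS for the supervised reduction; it is only a stand-in for the paper's method, and the function names and parameter values are illustrative.

```python
import numpy as np
from sklearn.kernel_approximation import Nystroem
from sklearn.cross_decomposition import PLSRegression
from sklearn.svm import SVC

def fit_kpls_svm(X, y, n_kernel_feats=200, n_pls=5, gamma=0.5):
    feat = Nystroem(kernel="rbf", gamma=gamma, n_components=n_kernel_feats).fit(X)
    Z = feat.transform(X)                               # approximate kernel features
    pls = PLSRegression(n_components=n_pls).fit(Z, y.astype(float))
    svc = SVC(kernel="linear").fit(pls.transform(Z), y)  # SVM on the reduced scores
    return feat, pls, svc

def predict_kpls_svm(model, X):
    feat, pls, svc = model
    return svc.predict(pls.transform(feat.transform(X)))
```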
18th International Conference on Pattern Recognition (ICPR'06), 2006
Computer Science and Information Technologies, 2023
Data Mining and Knowledge Discovery, 1998
Bulletin of Electrical Engineering and Informatics, 2021
2008 International Conference on Information and Automation, 2008
International Journal of Advanced Computer Science and Applications, 2020
International Multiconference on Computer Science and Information Technology (IMCSIT 2008), 2008
IEEE Transactions on Systems, Man and Cybernetics, Part B (Cybernetics), 2005
Journal of Signal Processing Systems, 2011
International Journal of Information Technology, 2018
The First Asian Conference on Pattern Recognition, 2011
Wiley Encyclopedia of Operations Research and Management Science, 2010
Journal of Science and Arts
IEEE Transactions on Neural Networks, 2006
Proceedings of the International Conference on Fuzzy Computation and 2nd International Conference on Neural Computation, 2010