2003, Proceedings of the Third IEEE International …
Discriminant analysis is known to learn discriminative feature transformations. This paper studies its use in multi-class classification problems. The performance is tested on a large collection of benchmark datasets.
2006
Abstract Many supervised machine learning tasks can be cast as multi-class classification problems. Support vector machines (SVMs) excel at binary classification problems, but the elegant theory behind the large-margin hyperplane does not extend easily to their multi-class counterparts. On the other hand, it has been shown that the decision hyperplanes obtained by SVMs for binary classification are equivalent to the solutions obtained by Fisher's linear discriminant on the set of support vectors.
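A minimal empirical check of the stated equivalence, assuming a toy two-class dataset (the data, the linear kernel, and the C value below are arbitrary choices, not from the paper): fit a linear SVM, refit Fisher's linear discriminant on the support vectors only, and compare the two normal vectors.

```python
# Hypothetical illustration: compare the linear-SVM hyperplane with Fisher's
# linear discriminant fitted only on the support vectors.
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = make_blobs(n_samples=200, centers=2, cluster_std=2.0, random_state=0)

svm = SVC(kernel="linear", C=1.0).fit(X, y)
w_svm = svm.coef_.ravel()

# Fisher's linear discriminant trained on the support vectors only
sv_idx = svm.support_
lda = LinearDiscriminantAnalysis().fit(X[sv_idx], y[sv_idx])
w_lda = lda.coef_.ravel()

cosine = np.dot(w_svm, w_lda) / (np.linalg.norm(w_svm) * np.linalg.norm(w_lda))
print(f"cosine similarity between the two normal vectors: {cosine:.3f}")
```

On a separable toy problem the two directions come out nearly parallel, which is the empirical counterpart of the equivalence the abstract refers to.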
International Journal of Strategic Decision Sciences, 2010
New linear programming approaches are proposed as nonparametric procedures for multiple-class discriminant and classification analysis. A new MSD model minimizing the sum of the classification errors is formulated to construct discriminant functions. This model has desirable properties because it is versatile and is immune to the pathologies of some of the earlier mathematical programming models for two-class classification. It is also purely systematic and algorithmic, and no ad hoc user judgment or trial and error is required. Furthermore, it can be used as the basis to develop other models, such as a multiple-class support vector machine and a mixed integer programming model, for discrimination and classification. An MMD model minimizing the maximum of the classification errors, although of very limited use, is also studied. These models may also be considered as generalizations of mathematical programming formulations for two-class classification. By the same approach, other mathematical...
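A minimal two-class sketch of the minimize-sum-of-deviations (MSD) idea using scipy.optimize.linprog; the multiple-class MSD model in the paper is more general, so this is only an illustration of the principle on hypothetical toy data.

```python
# Two-class MSD sketch: minimize the sum of classification deviations d_i
# subject to y_i (w.x_i + b) >= 1 - d_i, solved as a linear program.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
X1 = rng.normal(loc=[2.0, 2.0], size=(30, 2))    # class +1
X2 = rng.normal(loc=[-2.0, -2.0], size=(30, 2))  # class -1
X = np.vstack([X1, X2])
y = np.r_[np.ones(30), -np.ones(30)]
n, p = X.shape

# Variables: [w (p), b (1), d (n)]; objective: minimize sum(d)
c = np.r_[np.zeros(p + 1), np.ones(n)]
# y_i (w.x_i + b) >= 1 - d_i  <=>  -y_i (w.x_i + b) - d_i <= -1
A_ub = np.hstack([-(y[:, None] * X), -y[:, None], -np.eye(n)])
b_ub = -np.ones(n)
bounds = [(None, None)] * (p + 1) + [(0, None)] * n

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
w, b = res.x[:p], res.x[p]
print("discriminant function f(x) = w.x + b with", w, b)
```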
Pattern Recognition, 2006
Nonlinear discriminant analysis may be transformed into the form of kernel-based discriminant analysis. Thus, the corresponding discriminant direction can be solved for by linear equations. From the view of feature space, the nonlinear discriminant analysis is still a linear method, and it is provable that in feature space the method is equivalent to Fisher discriminant analysis. We consider that one linear combination of parts of the training samples, called "significant nodes", can replace the total training samples to express the corresponding discriminant vector in feature space to some extent. In this paper, an efficient algorithm is proposed to determine "significant nodes" one by one. The principle of determining "significant nodes" is simple and reasonable, and the consequent algorithm can be carried out with acceptable computation cost. Depending on the kernel functions between test samples and all "significant nodes", classification can be implemented. The proposed method is called the fast kernel-based nonlinear method (FKNM). It is notable that the number of "significant nodes" may be much smaller than the number of total training samples. As a result, for two-class classification problems, the FKNM will be much more efficient than the naive kernel-based nonlinear method (NKNM). The FKNM can also be applied to multi-class problems via two approaches: one-against-the-rest and one-against-one. Although there is a view that one-against-one is superior to one-against-the-rest in classification efficiency, it seems that for the FKNM one-against-the-rest is more efficient than one-against-one. Experiments on benchmark and real datasets illustrate that, for two-class and multi-class classifications, the FKNM is effective, feasible and highly efficient.
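A compact sketch of the naive two-class kernel Fisher discriminant (the NKNM baseline the paper compares against), using all training samples as expansion points; the "significant node" selection that defines the FKNM is not reproduced here, and the RBF kernel width and regularization below are placeholder assumptions.

```python
# Naive kernel Fisher discriminant for two classes: solve
# alpha = (N + reg*I)^{-1} (M_0 - M_1) and classify by the projected value.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def kernel_fda_fit(X, y, gamma=0.5, reg=1e-3):
    K = rbf_kernel(X, X, gamma=gamma)               # n x n kernel matrix
    classes = np.unique(y)                          # assumes exactly two classes
    n = len(y)
    M = [K[:, y == c].mean(axis=1) for c in classes]
    N = np.zeros((n, n))
    for c in classes:
        Kc = K[:, y == c]
        nc = Kc.shape[1]
        N += Kc @ (np.eye(nc) - np.ones((nc, nc)) / nc) @ Kc.T
    alpha = np.linalg.solve(N + reg * np.eye(n), M[0] - M[1])
    thresh = 0.5 * (alpha @ M[0] + alpha @ M[1])    # midpoint of projected class means
    return alpha, thresh, classes

def kernel_fda_predict(alpha, thresh, classes, X_train, X_test, gamma=0.5):
    proj = rbf_kernel(X_test, X_train, gamma=gamma) @ alpha
    return np.where(proj > thresh, classes[0], classes[1])
```

The expansion uses every training sample, which is exactly the cost the FKNM's "significant nodes" are meant to reduce.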
IEEE Signal Processing Letters, 2000
An alternative nonlinear multiclass discriminant algorithm is presented. This algorithm is based on the use of kernel functions and is designed to optimize a general linear discriminant analysis criterion based on scatter matrices. By reformulating these matrices in a specific form, a straightforward derivation allows the kernel function to be introduced in a simple and direct way. Moreover, we propose a method to determine the value of the regularization parameter, based on this derivation.
Linear Discriminant Analysis (LDA) is a very common technique for dimensionality reduction problems, used as a pre-processing step for machine learning and pattern classification applications. At the same time, it is usually used as a black box and is sometimes not well understood. The aim of this paper is to build a solid intuition for what LDA is and how LDA works, thus enabling readers of all levels to gain a better understanding of LDA and to know how to apply this technique in different applications. The paper first gives the basic definitions and steps of how the LDA technique works, supported with visual explanations of these steps. Moreover, the two methods of computing the LDA space, i.e. the class-dependent and class-independent methods, are explained in detail. Then, in a step-by-step approach, two numerical examples are demonstrated to show how the LDA space can be calculated for the class-dependent and class-independent methods. Furthermore, two of the most common LDA problems (i.e. the Small Sample Size (SSS) and non-linearity problems) are highlighted and illustrated, and state-of-the-art solutions to these problems are investigated and explained. Finally, a number of experiments are conducted with different datasets to (1) investigate the effect of the eigenvectors used to form the LDA space on the robustness of the extracted features and the classification accuracy, and (2) show when the SSS problem occurs and how it can be addressed.
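A compact numpy sketch of the class-independent LDA steps the tutorial describes: build the within- and between-class scatter matrices and take the leading eigenvectors of inv(S_W) @ S_B as the projection. The function and variable names are illustrative only.

```python
# Class-independent LDA: scatter matrices, eigendecomposition, projection.
import numpy as np

def lda_fit(X, y, n_components):
    classes = np.unique(y)
    mean_all = X.mean(axis=0)
    p = X.shape[1]
    S_W = np.zeros((p, p))     # within-class scatter
    S_B = np.zeros((p, p))     # between-class scatter
    for c in classes:
        Xc = X[y == c]
        mean_c = Xc.mean(axis=0)
        S_W += (Xc - mean_c).T @ (Xc - mean_c)
        diff = (mean_c - mean_all)[:, None]
        S_B += len(Xc) * (diff @ diff.T)
    eigvals, eigvecs = np.linalg.eig(np.linalg.inv(S_W) @ S_B)
    order = np.argsort(eigvals.real)[::-1]
    W = eigvecs[:, order[:n_components]].real
    return W                   # project samples with X @ W
```

At most C-1 eigenvalues are non-zero for C classes, which is why the LDA space has at most C-1 dimensions.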
Chemometrics and intelligent laboratory systems, 1992
A classification method, the linear discriminant classification tree (LDCT), has been developed with particular attention to problem-driven solutions. It consists of the joint application of linear discriminant analysis (LDA) and classification tree methods. The population of each node is partitioned into two groups and classified using LDA, which allows the introduction of multivariate binary classifiers. Thus the resulting classification trees are usually characterized by low complexity and ready interpretability. Several different trees can be obtained from the same data set: ...
International Journal of Information Technology & Decision Making, 2011
A mixed integer programming model is proposed for multiple-class discriminant and classification analysis. When multiple discriminant functions, one for each class, are constructed with the mixed integer programming model, the number of misclassified observations in the sample is minimized. This model is an extension of the linear programming models for multiple-class discriminant analysis but may also be considered a generalization of mixed integer programming formulations for two-class classification analysis. Properties of the model are studied. The model is immune to the difficulties of many mathematical programming formulations for two-class classification analysis, such as nonexistence of optimal solutions, improper solutions, and instability under linear data transformation. In addition, meaningful discriminant functions can be generated under conditions where other techniques fail. Examples are provided. Results on publicly accessible datasets show that this model is very ...
The aim of this paper is to collect in one place the basic background needed to understand the discriminant analysis (DA) classifier, so that readers of all levels can gain a better understanding of DA and know how to apply this classifier in different applications. The paper starts with basic mathematical definitions of the DA steps, with visual explanations of these steps. Moreover, in a step-by-step approach, a number of numerical examples are worked through to show how to calculate the discriminant functions and decision boundaries when the covariance matrices of all classes are common or class-specific. The singularity problem of DA is explained and some of the state-of-the-art solutions to this problem are highlighted with numerical illustrations. An experiment is conducted to compare the linear and quadratic classifiers and to show how to solve the singularity problem when high-dimensional datasets are used.
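A short comparison of the linear and quadratic discriminant classifiers on a synthetic dataset, using scikit-learn estimators on an arbitrary toy problem (the tutorial itself derives the discriminant functions by hand rather than calling a library).

```python
# Compare LDA (shared covariance) with QDA (class-specific covariances).
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.discriminant_analysis import (LinearDiscriminantAnalysis,
                                            QuadraticDiscriminantAnalysis)

X, y = make_classification(n_samples=400, n_features=10, n_informative=5,
                           n_classes=3, n_clusters_per_class=1, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=1)

for clf in (LinearDiscriminantAnalysis(), QuadraticDiscriminantAnalysis()):
    clf.fit(X_tr, y_tr)
    print(type(clf).__name__, "test accuracy:", clf.score(X_te, y_te))
```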
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006
Over the years, many Discriminant Analysis (DA) algorithms have been proposed for the study of high-dimensional data in a large variety of problems. Each of these algorithms is tuned to a specific type of data distribution (that which best models the problem at hand). Unfortunately, in most problems the form of each class pdf is a priori unknown, and the selection of the DA algorithm that best fits our data is done by trial and error. Ideally, one would like to have a single formulation which can be used for most distribution types. This can be achieved by approximating the underlying distribution of each class with a mixture of Gaussians. In this approach, the major problem to be addressed is that of determining the optimal number of Gaussians per class, i.e., the number of subclasses. In this paper, two criteria able to find the most convenient division of each class into a set of subclasses are derived. Extensive experimental results are shown using five databases. Comparisons are given against Linear Discriminant Analysis (LDA), Direct LDA (DLDA), Heteroscedastic LDA (HLDA), Nonparametric DA (NDA), and Kernel-Based LDA (K-LDA). We show that our method is always the best or comparable to the best.
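A minimal sketch of the underlying mixture-of-Gaussians idea: fit a small Gaussian mixture per class and classify by the largest class-conditional likelihood times the class prior. The paper's criteria for choosing the number of subclasses per class are not reproduced; the fixed n_subclasses below is a placeholder assumption.

```python
# Mixture-discriminant sketch: one GaussianMixture per class, Bayes rule on top.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_class_mixtures(X, y, n_subclasses=2):
    models, priors, classes = {}, {}, np.unique(y)
    for c in classes:
        models[c] = GaussianMixture(n_components=n_subclasses,
                                    covariance_type="full",
                                    random_state=0).fit(X[y == c])
        priors[c] = np.mean(y == c)
    return models, priors, classes

def predict_class_mixtures(models, priors, classes, X_new):
    # score_samples returns per-sample log-likelihoods; add log prior, take argmax
    scores = np.column_stack([models[c].score_samples(X_new) + np.log(priors[c])
                              for c in classes])
    return classes[np.argmax(scores, axis=1)]
```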
Journal of Information Science and Engineering, 2005
In this study, an approach involving new types of cost functions is given for the construction of discriminant functions. Centers of mass around the feature vectors, which are not specified a priori, are obtained by clustering with the cost function. Thus, the algorithms yield both the centers of mass and the distinct classes.
2004 Conference on Computer Vision and Pattern Recognition Workshop
Discriminant Analysis (DA) has had a big influence in many scientific disciplines. Unfortunately, DA algorithms need to make assumptions on the type of data available and, therefore, are not applicable everywhere. For example, when the data of each class can be represented by a single Gaussian and these share a common covariance matrix, Linear Discriminant Analysis (LDA) is a good option. In other cases, other DA approaches may be preferred. And, unfortunately, there still exist applications where no DA algorithm will correctly represent reality and, therefore, unsupervised techniques, such as Principal Components Analysis (PCA), may perform better. This paper first presents a theoretical study to define when and (most importantly) why DA techniques fail (Section 2). This is then used to create a new DA algorithm that can adapt to the training data available (Sections 2 and 3). The first main component of our solution is to design a method to automatically discover the optimal set of subclasses in each class. We will show that when this is achieved, optimal results can be obtained. The second main component of our algorithm is given by our theoretical study which defines a way to rapidly select the optimal number of subclasses. We present experimental results on two applications (object categorization and face recognition) and show that our method is always comparable or superior to LDA, Direct LDA (DLDA), Nonparametric DA (NDA) and PCA.
Neural Networks, IEEE …, 2010
In this brief, we propose multiclass data classification by computationally inexpensive discriminant analysis through vector-valued regularized kernel function approximation (VVRKFA). VVRKFA, an extension of fast regularized kernel function approximation (FRKFA), provides the vector-valued response in a single step. VVRKFA finds a linear operator and a bias vector by using a reduced kernel that maps a pattern from feature space into the low-dimensional label space. The classification of patterns is carried out in this low-dimensional label subspace. A test pattern is classified depending on its proximity to the class centroids. The effectiveness of the proposed method is experimentally verified and compared with the multiclass support vector machine (SVM) on several benchmark data sets as well as on gene microarray data for multi-category cancer classification. The results indicate a significant improvement in both training and testing time compared to multiclass SVM, with comparable testing accuracy, principally on large data sets. Experiments in this brief also compare the performance of VVRKFA under stratified random sampling and sub-sampling.
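A rough sketch of the reduced-kernel idea of regressing onto the label space and then classifying by proximity to class centroids; the exact VVRKFA formulation, its regularization, and its distance measure are not taken from the paper, and the kernel width and subset size below are placeholder assumptions.

```python
# Reduced-kernel ridge regression onto one-hot labels, then nearest-centroid
# classification in the resulting label-space embedding.
import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

def reduced_kernel_label_fit(X, y, n_reduced=100, gamma=0.1, reg=1e-2, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(n_reduced, len(X)), replace=False)
    X_red = X[idx]                                       # reduced kernel basis
    K = rbf_kernel(X, X_red, gamma=gamma)
    K1 = np.hstack([K, np.ones((len(X), 1))])            # absorb the bias term
    classes = np.unique(y)
    Y = (y[:, None] == classes[None, :]).astype(float)   # one-hot labels
    # Regularized least squares: (K1^T K1 + reg I) Theta = K1^T Y
    Theta = np.linalg.solve(K1.T @ K1 + reg * np.eye(K1.shape[1]), K1.T @ Y)
    Z = K1 @ Theta                                        # label-space embedding
    centroids = np.vstack([Z[y == c].mean(axis=0) for c in classes])
    return X_red, Theta, centroids, classes

def reduced_kernel_label_predict(X_new, X_red, Theta, centroids, classes, gamma=0.1):
    K = rbf_kernel(X_new, X_red, gamma=gamma)
    Z = np.hstack([K, np.ones((len(X_new), 1))]) @ Theta
    d = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return classes[np.argmin(d, axis=1)]
```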
Pattern Recognition, 2007
Linear discriminant analysis (LDA) has been widely used for dimension reduction of data sets with multiple classes. LDA has recently been extended to various generalized LDA methods that are applicable regardless of the relative sizes of the data dimension and the number of data items. In this paper, we propose several multiclass classifiers based on generalized LDA algorithms, taking advantage of the dimension-reducing transformation matrix without requiring additional training or any parameter optimization. A marginal linear discriminant classifier, a Bayesian linear discriminant classifier, and a one-dimensional Bayesian linear discriminant classifier are introduced for multiclass classification. Our experimental results illustrate that these classifiers produce higher tenfold cross-validation accuracy than kNN and centroid-based classification in the reduced-dimensional space, providing efficient general multiclass classifiers.
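As a brief illustration, one baseline in this family, a centroid rule applied in the LDA-reduced space, can be assembled in a few lines; the marginal and Bayesian linear discriminant classifiers proposed in the paper are not reproduced here, and the Iris dataset is only a convenient example.

```python
# Nearest-centroid classification in the LDA-reduced space, with 10-fold CV.
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neighbors import NearestCentroid
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)
pipe = make_pipeline(LinearDiscriminantAnalysis(n_components=2), NearestCentroid())
print("10-fold CV accuracy:", cross_val_score(pipe, X, y, cv=10).mean())
```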
2009
Linear discriminant analysis (LDA) is one of the well-known methods for extracting the best features for multiclass discrimination. Otsu derived the optimal nonlinear discriminant analysis (ONDA) by assuming the underlying probabilities and showed that the ONDA is closely related to Bayesian decision theory (the posterior probabilities). Otsu also pointed out that LDA can be regarded as a linear approximation of the ONDA through linear approximations of the Bayesian posterior probabilities. Based on this theory, we propose a novel nonlinear discriminant analysis named logistic discriminant analysis (LgDA), in which the posterior probabilities are estimated by multinomial logistic regression (MLR). Experimental results are presented by comparing the discriminant spaces constructed by LgDA and LDA on standard repository datasets.
Proceedings of the XXXVIII Iberian Latin American Congress on Computational Methods in Engineering, 2017
The problem of ranking features in N-class problems has been addressed by multi-class discriminant principal component analysis (MDPCA) for texture and face image classification. In this paper we present a nonlinear version of the MDPCA, named multi-class nonlinear discriminant feature analysis (MNDFA), that is based on kernel support vector machines (KSVM) and AdaBoost techniques. Specifically, the problem of ranking features, computed from multi-class databases, is addressed by applying the AdaBoost procedure in a nested loop: each iteration of the inner loop boosts weak classifiers into a moderate one, while the outer loop combines the moderate classifiers to build the global discriminant vector. The inner and outer loop procedures use AdaBoost techniques to combine learners. In the proposed MNDFA, each weak learner is a linear classifier computed through a separating hyperplane, defined by a KSVM decision boundary, in the feature space. In the computational experiments we analyse the proposed approach using a five-class granite image database. Our experimental results show that the features selected by the proposed technique allow competitive recognition rates when compared with related methods.
Machine Learning: ECML 2004, 2004
2010 20th International Conference on Pattern Recognition, 2010
This paper presents a novel discriminative feature transformation, named full-rank generalized likelihood ratio discriminant analysis (fGLRDA), on the grounds of the likelihood ratio test (LRT). fGLRDA attempts to seek a feature space, which is linearly isomorphic to the original n-dimensional feature space and is characterized by a full-rank (n × n) transformation matrix, under the assumption that all the class-discrimination information resides in a d-dimensional subspace (d < n), through making the most confusing situation, described by the null hypothesis, as unlikely as possible to happen, without the homoscedastic assumption on class distributions. Our experimental results demonstrate that fGLRDA can yield moderate performance improvements over other existing methods, such as linear discriminant analysis (LDA), for the speaker identification task.
2008
Linear discriminant analysis (LDA) is designed to seek a linear transformation that projects a data set into a lower-dimensional feature space for maximum class geometrical separability. LDA cannot always guarantee better classification accuracy, since its formulation does not take into account the properties of the classifier, such as the automatic speech recognizer (ASR). In this paper, the relationship between the empirical classification error rates and the Mahalanobis distances of the respective class pairs of speech features is investigated, and based on this, a novel reformulation of the LDA criterion, distance-error coupled LDA (DE-LDA), is proposed. One notable characteristic of DE-LDA is that it can modulate the contribution to the between-class scatter from each class pair through the use of an empirical error function, while preserving the lightweight solvability of LDA. Experimental results demonstrate that DE-LDA yields moderate improvements over LDA on the LVCSR task.
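A hedged sketch of the general idea of weighting each class pair's contribution to the between-class scatter by a function of its pairwise Mahalanobis separation; the specific empirical error function used by DE-LDA is not reproduced, and the exponential weight below is only a placeholder assumption.

```python
# Pairwise-weighted between-class scatter: close class pairs (small Mahalanobis
# distance) receive larger weight via a placeholder error-like function.
import numpy as np

def weighted_between_scatter(X, y):
    classes = np.unique(y)
    means = {c: X[y == c].mean(axis=0) for c in classes}
    # pooled within-class covariance, used as the Mahalanobis metric
    Sw = sum(np.cov(X[y == c], rowvar=False) * (np.sum(y == c) - 1)
             for c in classes) / (len(y) - len(classes))
    Sw_inv = np.linalg.inv(Sw)
    p = X.shape[1]
    Sb = np.zeros((p, p))
    for i, ci in enumerate(classes):
        for cj in classes[i + 1:]:
            diff = means[ci] - means[cj]
            dist2 = diff @ Sw_inv @ diff          # squared Mahalanobis distance
            weight = np.exp(-dist2 / 2.0)         # placeholder error-like weight
            Sb += weight * np.outer(diff, diff)
    return Sb, Sw
```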
Analytica Chimica Acta, 2010
This work describes multi-classification based on binary probabilistic discriminant partial least squares (p-DPLS) models, developed with the one-against-one strategy and the winner-takes-all principle. The multi-classification problem is split into binary classification problems solved with p-DPLS models, and the results of these models are combined to obtain the final classification result. The classification criterion uses the specific characteristics of an object (position in the multivariate space and prediction uncertainty) to estimate the reliability of the classification, so that the object is assigned to the class with the highest reliability. This new methodology is tested with the well-known Iris data set and a data set of Italian olive oils. When compared with CART and SIMCA, the proposed method has better average classification performance, in addition to providing a statistic that evaluates the reliability of the classification. For the olive oil set the average percentage of correct classification for the training set was close to 84% with p-DPLS against 75% with CART and 100% with SIMCA, while for the test set the average was close to 94% with p-DPLS compared with 50% with CART and 62% with SIMCA.
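A plain one-against-one PLS-DA voting sketch (winner takes all); the probabilistic p-DPLS models and the reliability estimate described in the paper are not reproduced here, and the number of latent variables is an arbitrary assumption.

```python
# One-against-one PLS-DA: one PLS regression on +/-1 targets per class pair,
# with winner-takes-all voting over the pairwise decisions.
import numpy as np
from itertools import combinations
from sklearn.cross_decomposition import PLSRegression

def ovo_plsda_fit(X, y, n_components=2):
    models = {}
    for ci, cj in combinations(np.unique(y), 2):
        mask = (y == ci) | (y == cj)
        target = np.where(y[mask] == ci, 1.0, -1.0)
        models[(ci, cj)] = PLSRegression(n_components=n_components).fit(X[mask], target)
    return models

def ovo_plsda_predict(models, X_new, classes):
    votes = np.zeros((len(X_new), len(classes)), dtype=int)
    index = {c: k for k, c in enumerate(classes)}
    for (ci, cj), m in models.items():
        pred = m.predict(X_new).ravel()
        votes[pred >= 0, index[ci]] += 1
        votes[pred < 0, index[cj]] += 1
    return np.asarray(classes)[np.argmax(votes, axis=1)]
```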