1999
A central problem in pattern recognition is the classification of a feature vector into one of several possible classes. In polychotomous classification the number of possible classes is higher than two. Given a set of training vectors with known classes, a classifier can be constructed using for example some density estimation or regression method. In a typical approach one estimates, using all training data from all classes, the posterior probability functions of each class and then classifies future feature vectors according to the highest estimated posterior probability. In some recent proposals a different approach is suggested where one first estimates classifiers only for each pair of classes and then combines these into a final polychotomous classifier (Friedman 1996, Hastie and Tibshirani 1998). The advantages of the pairwise approach are hoped to come from both increased classification accuracy and reduction in computational complexity. We report experimental results using various cla...
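To make the pairwise setup concrete, here is a minimal sketch of the "max-wins" voting rule often associated with Friedman (1996): every pair of classes gets its own binary classifier, and a new feature vector is assigned to the class that wins the most pairwise contests. The `pairwise_classifiers` dictionary and its interface are hypothetical, chosen only for illustration.

```python
import numpy as np

def max_wins(x, pairwise_classifiers, n_classes):
    """Return the class with the most pairwise wins.

    pairwise_classifiers[(i, j)](x) is assumed to return either i or j
    (a hypothetical interface, for illustration only).
    """
    votes = np.zeros(n_classes, dtype=int)
    for i in range(n_classes):
        for j in range(i + 1, n_classes):
            votes[pairwise_classifiers[(i, j)](x)] += 1
    return int(np.argmax(votes))
```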
Estimating class membership probabilities is an important step in many automated speech recognition systems. Since binary classifiers are usually easier to train, one common approach to this problem is to construct pairwise binary classifiers. Pairwise models yield an overdetermined system of equations for the class membership probabilities. Motivated by probabilistic arguments, we propose a new way of estimating the individual class membership probabilities, which reduces to solving a linear system of equations. A solution of this system is obtained by finding the unique non-zero eigenvector of total probability one, corresponding to eigenvalue one of a positive Markov matrix. This is a property shared by another algorithm previously proposed by Wu, Lin, and Weng. We compare the properties of these methods in two settings: a theoretical three-way classification problem, and the classification of English monophthongs from the TIMIT corpus. Index Terms: binary classifiers; multiclass classification; phoneme recognition; English vowels; TIMIT
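A sketch of the eigenvector idea described above, assuming the pairwise estimates are arranged in a matrix R with R[i, j] ≈ P(class i | class i or j, x). The particular construction of the stochastic matrix below is one common variant and not necessarily the exact formulation of the paper; the function name and toy values are invented.

```python
import numpy as np

def pairwise_to_probabilities(R):
    """Recover class probabilities from pairwise estimates R.

    R[i, j] ~ P(class i | class i or j, x) for i != j, so
    R[i, j] + R[j, i] == 1; the diagonal is ignored. The class-probability
    vector is taken as the stationary distribution of a row-stochastic
    matrix built from R (one common construction, shown for illustration).
    """
    k = R.shape[0]
    M = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            if i != j:
                # move from class i to class j with probability
                # proportional to the pairwise evidence for j over i
                M[i, j] = R[j, i] / (k - 1)
        M[i, i] = 1.0 - M[i].sum()

    # stationary distribution: left eigenvector of M for eigenvalue 1
    vals, vecs = np.linalg.eig(M.T)
    p = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    p = np.abs(p)
    return p / p.sum()

# toy example with three classes
R = np.array([[0.0, 0.7, 0.6],
              [0.3, 0.0, 0.4],
              [0.4, 0.6, 0.0]])
print(pairwise_to_probabilities(R))  # sums to 1
```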
2004
Pairwise coupling is a popular multi-class classification method that combines all comparisons for each pair of classes. This paper presents two approaches for obtaining class probabilities. Both methods can be reduced to linear systems and are easy to implement. We show conceptually and experimentally that the proposed approaches are more stable than the two existing popular methods: voting and the method by Hastie and Tibshirani (1998).
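For illustration, here is a sketch of the quadratic pairwise-coupling formulation commonly attributed to Wu, Lin and Weng (the variant implemented, for example, in LIBSVM), which reduces to a single linear system; the presentation below may differ in details from the paper itself, and the R convention matches the sketch above.

```python
import numpy as np

def coupled_probabilities(R):
    """Pairwise coupling via one linear system (commonly attributed to
    Wu, Lin and Weng): minimise
        sum_i sum_{j != i} (R[j, i] * p_i - R[i, j] * p_j)^2
    subject to sum_i p_i = 1.
    R[i, j] ~ P(class i | class i or j, x); the diagonal is ignored.
    """
    k = R.shape[0]
    Q = np.zeros((k, k))
    for i in range(k):
        for j in range(k):
            if i == j:
                # sum of squared "losses" of class i against all others
                Q[i, i] = np.sum(np.delete(R[:, i], i) ** 2)
            else:
                Q[i, j] = -R[j, i] * R[i, j]

    # optimality conditions: [Q  e; e^T  0] [p; b] = [0; 1]
    A = np.zeros((k + 1, k + 1))
    A[:k, :k] = Q
    A[:k, k] = 1.0
    A[k, :k] = 1.0
    rhs = np.zeros(k + 1)
    rhs[k] = 1.0
    return np.linalg.solve(A, rhs)[:k]
```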
2002
In this paper, we present a new approach to the design of probabilistic classifiers. Rather than working with a common high-dimensional feature vector, the classifier is written in terms of separate feature vectors chosen specifically for each class and their low-dimensional PDFs. While sufficiency is not a requirement, if the feature vectors are sufficient to distinguish the corresponding class from a common (null) hypothesis, the method is equivalent to the maximum a posteriori probability (MAP) classifier. The method has applications to speech, image, and general pattern recognition problems.
IEEE Transactions on Signal Processing, 1999
In this correspondence, we present a new approach to the design of probabilistic classifiers that circumvents the dimensionality problem. Rather than working with a common high-dimensional feature set, the classifier is written in terms of likelihood ratios with respect to a common class using sufficient statistics chosen specifically for each class.
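A toy sketch of the class-specific-feature idea described in the two abstracts above: each class gets its own low-dimensional statistic, and classes are scored by a prior-weighted likelihood ratio of that statistic against a common null hypothesis. The statistics, densities, and function names below are invented for illustration and are not the papers' exact construction.

```python
import numpy as np
from scipy.stats import norm

def classify(x, feature_fns, class_pdfs, null_pdfs, priors):
    """Score each class by prior * p(z_c | class c) / p(z_c | H0)."""
    scores = []
    for z_fn, p_c, p_0, prior in zip(feature_fns, class_pdfs, null_pdfs, priors):
        z = z_fn(x)                                # class-specific statistic
        scores.append(prior * p_c(z) / p_0(z))     # prior-weighted likelihood ratio
    return int(np.argmax(scores))

# Hypothetical two-class example with scalar statistics: class 0 is
# characterised by the sample mean, class 1 by the sample variance.
feature_fns = [np.mean, np.var]
class_pdfs  = [norm(loc=1.0, scale=0.3).pdf, norm(loc=2.0, scale=0.5).pdf]
null_pdfs   = [norm(loc=0.0, scale=1.0).pdf, norm(loc=1.0, scale=0.5).pdf]
priors      = [0.5, 0.5]

x = np.random.default_rng(0).normal(1.0, 1.0, size=100)
print(classify(x, feature_fns, class_pdfs, null_pdfs, priors))
```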
Journal of Computational Science, 2018
The improvement in the performance of classifiers has been the focus of attention of many researchers over the last few decades. Obtaining accurate predictions becomes more complicated as the number of classes increases. Most families of classification techniques generate models that define decision boundaries trying to separate the classes as well as possible. As an alternative, in this paper we propose to hierarchically decompose the original multiclass problem, reducing the number of classes involved in each local subproblem. This is done by deriving a similarity matrix from the misclassification errors of a first classifier learned for this purpose, and then using the similarity matrix to build a tree-like hierarchy of specialized classifiers. We then present two approaches to solve the multiclass problem: the first traverses the tree of classifiers in a top-down manner, similar to the way some hierarchical classification methods deal with hierarchical domains; the second is inspired by the way probabilistic decision trees compute class membership probabilities. To improve the efficiency of our methods, we propose a criterion to reduce the size of the hierarchy. We experimentally evaluate all of the proposals on a collection of multiclass datasets, showing that, in general, the generated classifier hierarchies outperform the original (flat) multiclass classification.
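An illustrative sketch (not the paper's exact procedure) of the first step: derive a class similarity matrix from the misclassification pattern of an initial flat classifier and cluster it into a class hierarchy; specialized classifiers would then be trained at the internal nodes. All names below are assumptions.

```python
import numpy as np
from sklearn.metrics import confusion_matrix
from scipy.cluster.hierarchy import linkage

def class_hierarchy(y_true, y_pred, n_classes):
    """Build a class tree from the confusion pattern of a first classifier."""
    C = confusion_matrix(y_true, y_pred, labels=list(range(n_classes))).astype(float)
    C = C / C.sum(axis=1, keepdims=True)      # row-normalise to error rates
    similarity = (C + C.T) / 2.0              # classes often confused are "similar"
    distance = 1.0 - similarity               # turn similarity into a distance
    iu = np.triu_indices(n_classes, k=1)      # condensed form for scipy
    return linkage(distance[iu], method="average")
```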
Artificial Intelligence Review, 2008
Several real problems involve the classification of data into categories or classes. Given a data set containing data whose classes are known, Machine Learning algorithms can be employed for the induction of a classifier able to predict the class of new data from the same domain, performing the desired discrimination. Some learning techniques are originally conceived for the solution of problems with only two classes, also known as binary classification problems. However, many problems require the discrimination of examples into more than two categories or classes. This paper presents a survey on the main strategies for the generalization of binary classifiers to problems with more than two classes, known as multiclass classification problems. The focus is on strategies that decompose the original multiclass problem into multiple binary subtasks, whose outputs are combined to obtain the final prediction.
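The two most common decomposition strategies covered by such surveys are one-vs-rest and one-vs-one; a minimal illustration using scikit-learn's generic wrappers around a binary base learner (the dataset and base model here are arbitrary choices, not taken from the paper):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsOneClassifier, OneVsRestClassifier

X, y = load_iris(return_X_y=True)
base = LogisticRegression(max_iter=1000)

ovr = OneVsRestClassifier(base).fit(X, y)   # one binary task per class
ovo = OneVsOneClassifier(base).fit(X, y)    # one binary task per pair of classes
print(ovr.score(X, y), ovo.score(X, y))
```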
Pattern Recognition Letters, 2007
The use of Receiver Operating Characteristic (ROC) analysis for model selection and threshold optimisation has become standard practice in the design of two-class pattern recognition systems. Advantages include decision boundary adaptation to imbalanced misallocation costs, the ability to fix some classification errors, and performance evaluation in imprecise, ill-defined conditions where costs or prior probabilities may vary. Extending this to the multiclass case has recently become a topic of interest. The primary challenge is the computational complexity, which grows exponentially with the number of classes, rendering many problems intractable. In this paper the multiclass ROC is formalised and the computational complexities exposed. A pairwise approach is proposed that approximates the multidimensional operating characteristic by discounting some interactions, resulting in an algorithm that is tractable and extensible to large numbers of classes. Two additional multiclass optimisation techniques are also proposed that provide a benchmark for the pairwise algorithm. Experiments compare the various approaches in a variety of practical situations, demonstrating the efficacy of the pairwise approach.
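The pairwise idea also carries over to ROC summary statistics; as a rough illustration only (this is not the paper's threshold-optimisation algorithm), scikit-learn's multi_class="ovo" option averages the AUC over all pairs of classes, sidestepping the full multidimensional operating characteristic:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)
probs = LogisticRegression(max_iter=1000).fit(Xtr, ytr).predict_proba(Xte)
print(roc_auc_score(yte, probs, multi_class="ovo"))  # pairwise-averaged AUC
```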
The extension of the Dezert-Smarandache theory (DSmT) to the multi-class framework has a feasible computational complexity for various applications when the number of classes is small, typically two. In contrast, when the number of classes is large, the DSmT incurs a high computational complexity. This paper investigates the effective use of the DSmT for multi-class classification in conjunction with Support Vector Machines using the One-Against-All (OAA) implementation, which offers two advantages: first, it allows modeling partial ignorance by including the complementary classes in the set of focal elements during the combination process; second, it drastically reduces the number of focal elements using a supervised model, by introducing exclusive constraints when classes are naturally and mutually exclusive. To illustrate the effective use of the DSmT for multi-class classification, two SVM-OAA implementations are...
Pattern Recognition Letters, 2016
We consider multi-class classification models built from complete sets of pairwise binary classifiers. The Bradley-Terry model is often used to estimate posterior distributions in this setting. We introduce the notion of Bayes covariance, which holds if the multi-class classifier respects the multiplicative group action on class priors. As a consequence, a Bayes covariant method yields the same result whether new priors are applied before or after the combination of the individual classifiers, which has several practical advantages for systems with feedback. In the paper, we construct a Bayes covariant combining method and compare it with previously published methods, both in Monte Carlo simulations and on a practical speech frame recognition task.
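The group action the abstract refers to is the usual Bayes adjustment of a posterior for new class priors; a combining method is Bayes covariant when this re-weighting commutes with the combination step. A minimal sketch of the re-weighting itself (the function name and values are illustrative):

```python
import numpy as np

def reweight(posterior, old_priors, new_priors):
    """Adjust a posterior computed under old_priors to new_priors."""
    p = np.asarray(posterior) * (np.asarray(new_priors) / np.asarray(old_priors))
    return p / p.sum()

# e.g. shift a uniform-prior posterior towards a dominant first class
print(reweight([0.5, 0.3, 0.2], [1/3, 1/3, 1/3], [0.6, 0.2, 0.2]))
```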
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2006
We present a new method of multiclass classification based on the combination of the one-vs-all method and a modification of the one-vs-one method. The proposed combination reinforces the strengths of both methods. A study of the behavior of the two methods identifies some of the sources of their failure, and the performance of a classifier can be improved when the two methods are combined in such a way that the main sources of their failure are partially avoided.
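One plausible way to combine the two schemes, shown purely for illustration and not necessarily the authors' exact rule: trust one-vs-all when it is confident, and fall back on the relevant pairwise classifier when the top two one-vs-all scores are too close to call. The interface and margin parameter are assumptions.

```python
import numpy as np

def combined_predict(ova_scores, ovo_predict, margin=0.1):
    """ova_scores: array of k one-vs-all scores for one sample.
    ovo_predict(i, j): hypothetical pairwise classifier returning i or j."""
    order = np.argsort(ova_scores)[::-1]
    best, second = order[0], order[1]
    if ova_scores[best] - ova_scores[second] >= margin:
        return int(best)                 # one-vs-all is confident enough
    return int(ovo_predict(best, second))  # resolve the ambiguity pairwise
```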