In many supervised learning applications, additional information about the training data is available. Recently, Vapnik introduced Learning Using Privileged Information (LUPI), a learning paradigm that exploits such privileged (or additional) information, and described the SVM+ technique for processing it in batch mode. Following this method, we show how conformal predictors can be used to handle additional information. An application to a medical diagnostic problem is considered and the results are reported.
2021
Conformal Predictors (CP) are wrappers around ML models, providing error guarantees under weak assumptions on the data distribution. They are suitable for a wide range of problems, from classification and regression to anomaly detection. Unfortunately, their very high computational complexity limits their applicability to large datasets. In this work, we show that it is possible to speed up a CP classifier considerably, by studying it in conjunction with the underlying ML method, and by exploiting incremental and decremental learning. For methods such as k-NN, KDE, and kernel LSSVM, our approach reduces the running time by one order of magnitude, whilst producing exact solutions. With similar ideas, we also achieve a linear speed up for the harder case of bootstrapping. Finally, we extend these techniques to improve upon an optimization of k-NN CP for regression. We evaluate our findings empirically, and discuss when methods are suitable for CP optimization.
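One way to picture the incremental idea described above, for a k-NN-based conformal predictor: distances between examples do not depend on labels, so the pairwise distance matrix can be computed once and merely extended with one row and column per test point, instead of being rebuilt for every candidate label. The following pure-Python toy (1-D points; function names are our own, not the paper's) sketches only this caching step, not the full exact algorithm:

```python
# Toy sketch: cache pairwise distances once, then extend the matrix
# for a new test point instead of recomputing everything per label.

def pairwise(points):
    """Full pairwise distance matrix for 1-D points."""
    return [[abs(a - b) for b in points] for a in points]

def extend(dist, points, x_new):
    """Extend a cached distance matrix with one new point.

    Only len(points) new distances are computed; the candidate label
    of x_new is irrelevant, so all labels reuse this same matrix.
    """
    row = [abs(x_new - p) for p in points]
    return [d + [row[i]] for i, d in enumerate(dist)] + [row + [0.0]]

pts = [0.0, 1.0, 2.0]
cache = pairwise(pts)          # computed once for the training set
full = extend(cache, pts, 0.5) # cheap update per test point
```

The saving is that trying every candidate label for a test point costs one matrix extension rather than a full recomputation of all leave-one-out distances.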
ArXiv, 2021
The property of conformal predictors to guarantee the required accuracy rate makes this framework attractive in various practical applications. However, this property is achieved at the price of a reduction in precision. In the case of conformal classification, the systems can output multiple class labels instead of one. It is also known from the literature that the choice of nonconformity function has a major impact on the efficiency of conformal classifiers. Recently, it was shown that different model-agnostic nonconformity functions result in conformal classifiers with different characteristics. For a Neural Network-based conformal classifier, the inverse probability (or hinge loss) allows minimizing the average number of predicted labels, and margin results in a larger fraction of singleton predictions. In this work, we aim to further extend this study. We perform an experimental evaluation using 8 different classification algorithms and discuss when the previously observed relationship holds or not.
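The two model-agnostic nonconformity functions the abstract contrasts can be written down directly from a classifier's predicted class probabilities. A minimal sketch (the probability values below are made up for illustration, not the output of any actual model):

```python
# Two model-agnostic nonconformity functions over predicted class
# probabilities: inverse probability (hinge) and margin.

def inverse_probability(probs, true_label):
    """1 - P(true label): low when the model is confident in the truth."""
    return 1.0 - probs[true_label]

def margin(probs, true_label):
    """Highest competing probability minus the true-label probability.

    Negative values mean the true label dominates; margin tends to
    produce more singleton prediction sets.
    """
    other = max(p for lab, p in probs.items() if lab != true_label)
    return other - probs[true_label]

probs = {'cat': 0.7, 'dog': 0.2, 'bird': 0.1}
print(inverse_probability(probs, 'cat'))  # ≈ 0.3
print(margin(probs, 'cat'))               # ≈ -0.5 (true label dominates)
```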
IFIP Advances in Information and Communication Technology, 2011
HAL is a multidisciplinary open access archive for the deposit and dissemination of scientific research documents, whether they are published or not. The documents may come from teaching and research institutions in France or abroad, or from public or private research centers.
2004
Class imbalance is a widespread problem in many classification tasks such as medical diagnosis and text categorization. To overcome this problem, we investigate one-class SVMs, which can be trained to differentiate two classes on the basis of examples from a single class. We propose an improvement of one-class SVMs via a conformal kernel transformation, as described in the context of binary SVM classifiers by [2,3]. We tested this improved one-class SVM on a health care problem that involves discriminating the 11% of nosocomially infected patients from the 89% of non-infected patients. The results obtained are encouraging: compared with three other SVM-based approaches to coping with class imbalance, one-class SVMs achieved the highest sensitivity recorded so far on the nosocomial infection dataset. However, the price to pay is a concomitant decrease in specificity, and it is for domain experts to decide the proportion of false positive cases they are willing to accept in order to ensure treatment of all infected patients.
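The conformal kernel transformation mentioned above rescales a base kernel k(x, z) as c(x)·k(x, z)·c(z), where the conformal factor c magnifies the induced metric near selected points (e.g., near the approximate decision boundary). A minimal 1-D sketch under our own illustrative parameter choices, not the paper's exact construction:

```python
# Conformal transformation of an RBF kernel: k~(x, z) = c(x) k(x, z) c(z),
# where c(x) is large near chosen "boundary" points. All values are
# illustrative; real use would pick the points from a trained SVM.

import math

def rbf(x, z, gamma=1.0):
    """Base RBF kernel on 1-D inputs."""
    return math.exp(-gamma * (x - z) ** 2)

def conformal_factor(x, boundary_points, tau=2.0):
    """Conformal factor: peaks near the given boundary points."""
    return sum(math.exp(-tau * (x - b) ** 2) for b in boundary_points)

def conformal_kernel(x, z, boundary_points):
    """Transformed kernel; still symmetric and positive semi-definite."""
    return (conformal_factor(x, boundary_points)
            * rbf(x, z)
            * conformal_factor(z, boundary_points))
```

The effect is to spread apart points near the boundary in feature space, which is the mechanism [2,3] use to improve the resolution of the SVM decision surface where it matters most.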
Pattern Recognition, 2022
2021
We use influence functions from robust statistics to speed up full conformal prediction. Traditionally, conformal prediction requires retraining multiple leave-one-out classifiers to calculate p-values for each test point. By using influence functions, we are able to approximate this procedure and reduce its running time considerably.
2010
The Conformal Predictions framework is a recent development in machine learning to associate reliable measures of confidence with results in classification and regression. This framework is founded on the principles of algorithmic randomness (Kolmogorov complexity), transductive inference and hypothesis testing. While the formulation of the framework guarantees validity, the efficiency of the framework depends greatly on the choice of the classifier and appropriate kernel functions or parameters. While this framework has ...
2015
The report summarises some preliminary findings of WP1.4: Confidence Estimation and Feature Significance. It presents an application of conformal predictors, in transductive and inductive modes, to the large, high-dimensional, sparse and imbalanced data sets found in Compound Activity Prediction from the PubChem public repository. The report describes a version of conformal predictors called the Mondrian Predictor that keeps validity guarantees for each class. The experiments were conducted using several non-conformity measures extracted from underlying algorithms such as SVM, Nearest Neighbours and Naïve Bayes. The results show (1) that the Inductive Conformal Mondrian Prediction framework is quick and effective for large imbalanced data and (2) that its less strict i.i.d. requirements combine well with training set editing algorithms such as Cascade SVM. Among the algorithms tested with the Mondrian ICP framework, Cascade SVM with a Tanimoto+RBF kernel appeared to be the best-performing one, if the q...
2021
The property of conformal predictors to guarantee the required accuracy rate makes this framework attractive in various practical applications. However, this property is achieved at the price of a reduction in precision. In the case of conformal classification, the system can output multiple class labels instead of one. It is also known that the choice of nonconformity function has a major impact on the efficiency of conformal classifiers. Recently, it was shown that different model-agnostic nonconformity functions result in conformal classifiers with different characteristics. For a Neural Network-based conformal classifier, the inverse probability (or hinge loss) allows minimizing the average number of predicted labels, and margin results in a larger fraction of singleton predictions. In this work, we aim to further extend this study. We perform an experimental evaluation using 8 different classification algorithms and discuss when the previously observed relationship holds or not. A...
2011
Computer-aided decision support systems enable physicians to make more accurate clinical decisions and can significantly improve the quality of care provided to patients. However, predicting classification confidence, i.e., the degree of reliability of the resulting predictions, is a much-needed step in clinical decision making. A recently developed technique called conformal prediction utilizes the similarity between a new sample and the training samples in order to form confidence measures for predictions. However, the conventional conformal prediction method suffers from shortcomings, such as high computational complexity, that prevent its use in real-time applications. This paper introduces an alternative approach to conventional confidence prediction that addresses these and other disadvantages. Both real clinical and non-clinical datasets are employed to test and validate the capabilities of the proposed approach.
2019
In real-world scenarios, interpretable models are often required to explain predictions, and to allow for inspection and analysis of the model. The overall purpose of oracle coaching is to produce highly accurate, but interpretable, models optimized for a specific test set. Oracle coaching is applicable to the very common scenario where explanations and insights are needed for a specific batch of predictions, and the input vectors for this test set are available when building the predictive model. In this paper, oracle coaching is used for generating underlying classifiers for conformal prediction. The resulting conformal classifiers output valid label sets, i.e., the error rate on the test data is bounded by a preset significance level, as long as the labeled data used for calibration is exchangeable with the test set. Since validity is guaranteed for all conformal predictors, the key performance metric is efficiency, i.e., the size of the label sets, where smaller sets are more in...
This paper proposes a new method of probabilistic prediction, which is based on conformal prediction. The method is applied to the standard USPS data set and gives encouraging results.
Journal of Machine Learning Research, 2007
"The practical conclusions of the theory of probability can be substantiated as implications of hypotheses about the limiting, under the given constraints, complexity of the phenomena under study." (Epigraph, translated from the Russian.) Abstract Conformal prediction uses past experience to determine precise levels of confidence in new predictions. Given an error probability ε, together with a method that makes a prediction ŷ of a label y, it produces a set of labels, typically containing ŷ, that also contains y with probability 1 − ε. Conformal prediction can be applied to any method for producing ŷ: a nearest-neighbor method, a support-vector machine, ridge regression, etc.
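The procedure described above can be made concrete with a toy transductive conformal predictor: each candidate label for a test point is scored by a p-value, and the prediction set at significance ε keeps the labels whose p-value exceeds ε. This sketch uses a standard 1-nearest-neighbour nonconformity score on made-up 1-D data; it illustrates the general mechanism, not any particular paper's implementation:

```python
# Toy transductive conformal predictor with a 1-NN nonconformity score.

def knn_score(x, y, others):
    """Distance to nearest same-label example over distance to nearest
    other-label example; large values mean (x, y) looks strange."""
    same = [abs(x - xi) for xi, yi in others if yi == y]
    diff = [abs(x - xi) for xi, yi in others if yi != y]
    if not same or not diff or min(diff) == 0:
        return float('inf')
    return min(same) / min(diff)

def conformal_p_values(train, x_new, labels):
    """p-value per candidate label: the fraction of examples (test point
    included) at least as nonconforming as the test point."""
    pvals = {}
    for y in labels:
        aug = train + [(x_new, y)]
        scores = [knn_score(xi, yi, aug[:i] + aug[i + 1:])
                  for i, (xi, yi) in enumerate(aug)]
        pvals[y] = sum(1 for a in scores if a >= scores[-1]) / len(aug)
    return pvals

train = [(0.0, 'a'), (0.1, 'a'), (0.2, 'a'),
         (1.0, 'b'), (1.1, 'b'), (1.2, 'b')]
p = conformal_p_values(train, 0.05, ['a', 'b'])
pred_set = {y for y, v in p.items() if v > 0.2}  # significance 0.2
```

Under exchangeability, the true label lands outside the prediction set with probability at most the chosen significance level, which is exactly the 1 − ε guarantee stated in the abstract.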
Journal of Artificial Intelligence Research, 2011
In this paper we apply Conformal Prediction (CP) to the k-Nearest Neighbours Regression (k-NNR) algorithm and propose ways of extending the typical nonconformity measure used for regression so far. Unlike traditional regression methods which produce point predictions, Conformal Predictors output predictive regions that satisfy a given confidence level. The regions produced by any Conformal Predictor are automatically valid, however their tightness and therefore usefulness depends on the nonconformity measure used by each CP. In effect a nonconformity measure evaluates how strange a given example is compared to a set of other examples based on some traditional machine learning algorithm. We define six novel nonconformity measures based on the k-Nearest Neighbours Regression algorithm and develop the corresponding CPs following both the original (transductive) and the inductive CP approaches. A comparison of the predictive regions produced by our measures with those of the typical regression measure suggests that a major improvement in terms of predictive region tightness is achieved by the new measures.
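The "typical regression measure" the abstract refers to is the absolute residual |y − ŷ|. A minimal inductive CP sketch with a k-NN underlying model shows how such a measure turns into a prediction interval (1-D toy data and function names are our own assumptions, and the paper's six novel measures are not reproduced here):

```python
# Inductive conformal regression with a k-NN model and the standard
# absolute-residual nonconformity |y - y_hat|.

import math

def knn_predict(x, train, k=3):
    """Mean label of the k nearest training points (1-D inputs)."""
    neigh = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in neigh) / k

def icp_interval(x, proper, calibration, significance=0.1, k=3):
    """Prediction interval y_hat +/- q, where q is the calibration
    quantile of the nonconformity scores."""
    scores = sorted(abs(y - knn_predict(xi, proper, k))
                    for xi, y in calibration)
    n = len(scores)
    # index of the (1 - significance) quantile, counting the test point
    idx = math.ceil((n + 1) * (1 - significance)) - 1
    q = scores[min(idx, n - 1)]
    y_hat = knn_predict(x, proper, k)
    return (y_hat - q, y_hat + q)
```

The tightness of the interval is entirely governed by the nonconformity scores, which is why the choice of measure, the subject of the paper, matters so much.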
IFIP Advances in Information and Communication Technology, 2012
Current classification algorithms focus on vectorial data given in Euclidean or kernel spaces. Many real-world data, like biological sequences, are not vectorial and often non-Euclidean, given by (dis-)similarities only, calling for efficient and interpretable models. Current classifiers for such data require complex transformations and provide only crisp classifications without any measure of confidence, which is a standard requirement in the life sciences. In this paper we propose a prototype-based conformal classifier for dissimilarity data. Its model complexity is automatically adjusted and confidence measures are provided. In experiments on dissimilarity data we investigate its effectiveness with respect to accuracy and model complexity in comparison to different state-of-the-art classifiers.
Annals of Mathematics and Artificial Intelligence, 2014
Existing classification algorithms focus on vectorial data given in Euclidean space or on representations by means of positive semi-definite kernel matrices. Many real-world data, like biological sequences, are not vectorial, often non-Euclidean, and given only in the form of (dis-)similarities between examples, calling for efficient and interpretable models. Vectorial embeddings or transformations to obtain a valid kernel are limited, and current dissimilarity classifiers often lead to dense, complex models which are hard for domain experts to interpret. They also fail to provide additional information about the confidence of the classification. In this paper we propose a prototype-based conformal classifier for dissimilarity data. It is based on a prototype dissimilarity learner and extended by the conformal prediction methodology. It (i) can deal with dissimilarity data characterized by an arbitrary symmetric dissimilarity matrix, (ii) offers intuitive classification in terms of sparse prototypical class representatives, (iii) leads to state-of-the-art classification results supported by a confidence measure, and (iv) automatically adjusts its model complexity. In experiments on dissimilarity data we investigate its effectiveness with respect to accuracy and model complexity in comparison to different state-of-the-art classifiers.
2004
The Support Vector Machine (SVM) has been extended to build nonlinear classifiers using the kernel trick [1–3]. As a learning model, it offers among the best recognition performance of currently known methods because it is designed to achieve high performance on unlearned data. The SVM uses linear threshold elements to build a two-class classifier, learning the parameters of the linear threshold element from training samples based on "margin maximization". This paper reviews how to enhance generalization in learning classifiers. The SVM is introduced, then multiple regression analysis (MRA) and logistic regression analysis (LRA) are explained as statistical methods for building a classifier with a structure similar to that of the SVM. The same method as used for the SVM can be introduced in both MRA and LRA to enhance performance for unlearned samples. The paper then compares the SVM with these methods at the criter...
2020
In this paper we introduce a nearest-neighbor-based estimate of the prediction interval with prescribed conditional coverage probability and small length. In the special case when there is no feature vector, the problem reduces to estimating a confidence interval. For the confidence interval estimate, we show distribution-free strong consistency of the conditional coverage probability and of the excess length of the interval. For the prediction interval, the conditional coverage probability has the distribution-free strong consistency property, and, under weak conditions on the underlying distributions, strong consistency and a fast rate of convergence of the excess length are shown. As a consequence, we construct a confidence set estimate for classification. AMS Classification: 62G08, 62G20.
IFIP Advances in Information and Communication Technology, 2014
Unlike the typical classification setting, where each instance is associated with a single class, in multi-label learning each instance is associated with multiple classes simultaneously. Therefore the learning task in this setting is to predict the subset of classes to which each instance belongs. This work examines the application of a recently developed framework called Conformal Prediction (CP) to the multi-label learning setting. CP complements the predictions of machine learning algorithms with reliable measures of confidence. As a result, the proposed approach, instead of just predicting the most likely subset of classes for a new unseen instance, also indicates the likelihood of each predicted subset being correct. This additional information is especially valuable in the multi-label setting, where the overall uncertainty is extremely high.
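One common way to frame CP over subsets of classes, which may or may not match the paper's exact construction, is the label-powerset view: every subset of the label set is treated as a single candidate "class", and a p-value would be computed per subset. A toy enumeration (label names are made up for illustration):

```python
# Label-powerset framing for multi-label CP: enumerate every candidate
# subset of labels; a conformal predictor would attach a p-value to each.

from itertools import combinations

labels = ['sports', 'politics', 'tech']
candidates = [frozenset(c)
              for r in range(len(labels) + 1)
              for c in combinations(labels, r)]
print(len(candidates))  # 2**3 = 8 candidate subsets, incl. the empty set
```

The exponential number of candidate subsets is one source of the "extremely high" overall uncertainty the abstract mentions.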
IEEE Access, 2018
Proper tuning of hyper-parameters is essential to the successful application of SVM classifiers. Several methods have been used for this problem: grid search, random search, Estimation of Distribution Algorithms (EDAs), and bio-inspired metaheuristics, among others. The objective of this paper is to determine the optimal method among those that have recently reported good results: the Bat Algorithm, the Firefly Algorithm, the Fruit-fly Optimization Algorithm, Particle Swarm Optimization, the Univariate Marginal Distribution Algorithm (UMDA), and Boltzmann-UMDA. The criteria for optimality include measures of effectiveness, generalization, efficiency, and complexity. Experimental results on 15 medical diagnosis problems reveal that EDAs are the optimal strategy under such criteria. Finally, a novel performance index to guide the optimization process, which improves the generalization of the solutions while maintaining their effectiveness, is presented.