Papers by Ilia Nouretdinov
Theoretical Computer Science, Nov 1, 2006
This paper is concerned with the problem of on-line prediction in the situation where some data is unlabelled and can never be used for prediction, and even when data is labelled, the labels may arrive with a delay. We construct a modification of the randomised Transductive Confidence Machine for this case and prove a necessary and sufficient condition for its predictions to be calibrated, in the sense that in the long run they are wrong with a prespecified probability under the assumption that the data is generated independently from the same distribution. The condition for calibration turns out to be very weak: feedback should be given on more than a logarithmic fraction of steps.
IFIP Advances in Information and Communication Technology, 2010
Conformal predictors represent a new flexible framework that outputs region predictions with a guaranteed error rate. The efficiency of such predictions depends on the nonconformity measure that underlies the predictor. In this work we designed new nonconformity measures based on a random forest classifier. Experiments demonstrate that the proposed conformal predictors are more efficient than current benchmarks on noisy mass spectrometry data (and at least as efficient on other types of data) while maintaining the property of validity: they output fewer multiple predictions, and the ratio of mistakes does not exceed the preset level. When forced to produce singleton predictions, the designed conformal predictors are at least as accurate as the benchmarks and sometimes significantly outperform them.
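To make the construction concrete, here is a minimal sketch (in Python, with scikit-learn) of one plausible random-forest nonconformity measure — one minus the forest's vote share for a label — plugged into a split-conformal predictor. The specific score and the split-conformal setup are illustrative assumptions, not necessarily the exact design evaluated in the paper.

```python
# Minimal sketch: a split-conformal predictor whose nonconformity score is
# one minus the random-forest vote share for a label (illustrative choice).
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def rf_nonconformity(model, X, y):
    """Nonconformity = 1 - fraction of trees voting for the given label."""
    proba = model.predict_proba(X)              # shape (n_examples, n_classes)
    cols = np.searchsorted(model.classes_, y)   # column index of each example's label
    return 1.0 - proba[np.arange(len(y)), cols]

def conformal_region(X_train, y_train, X_cal, y_cal, x_new, epsilon=0.05):
    """Prediction region = labels whose conformal p-value exceeds epsilon."""
    rf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)
    cal_scores = rf_nonconformity(rf, X_cal, y_cal)
    region = []
    for label in rf.classes_:
        score = rf_nonconformity(rf, np.atleast_2d(x_new), np.array([label]))[0]
        p_value = (np.sum(cal_scores >= score) + 1) / (len(cal_scores) + 1)
        if p_value > epsilon:
            region.append(label)
    return region
```

Validity can then be checked empirically by confirming that the fraction of regions missing the true label stays below epsilon, and efficiency by counting how often a region contains more than one label.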

Lecture Notes in Computer Science, 2001
When correct priors are known, Bayesian algorithms give optimal decisions, and accurate confidence values for predictions can be obtained. If the prior is incorrect, however, these confidence values have no theoretical basis, even though the algorithms' predictive performance may be good. There also exist many successful learning algorithms which only depend on the iid assumption. Often, however, they produce no confidence values for their predictions. Bayesian frameworks are often applied to these algorithms in order to obtain such values; however, they can rely on unjustified priors. In this paper we outline the typicalness framework, which can be used in conjunction with many other machine learning algorithms. The framework provides confidence information based only on the standard iid assumption and so is much more robust to different underlying data distributions. We show how the framework can be applied to existing algorithms. We also present experimental results which show that the typicalness approach performs close to Bayes when the prior is known to be correct. Unlike Bayes, however, the method still gives accurate confidence values even when different data distributions are considered.
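As an illustration of the typicalness idea, the sketch below computes the p-value of a candidate label as the fraction of examples in the extended bag whose nonconformity score is at least as large as the new example's. The nearest-neighbour ratio score is just one illustrative choice; the framework works with many underlying algorithms.

```python
# Minimal sketch of the typicalness (p-value) computation under the iid assumption.
# Assumes both classes are represented in the extended bag.
import numpy as np

def nn_nonconformity(X, y, i):
    """Distance to the nearest same-class example divided by the distance to the
    nearest other-class example, for example i within the bag (X, y)."""
    d = np.linalg.norm(X - X[i], axis=1)
    d[i] = np.inf
    return d[y == y[i]].min() / d[y != y[i]].min()

def typicalness(X_train, y_train, x_new, y_candidate):
    """Fraction of examples at least as nonconforming as the new one."""
    X = np.vstack([X_train, x_new])
    y = np.append(y_train, y_candidate)
    scores = np.array([nn_nonconformity(X, y, i) for i in range(len(y))])
    return float(np.mean(scores >= scores[-1]))
```

A label is then considered plausible at significance level epsilon whenever its typicalness exceeds epsilon.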
International Conference on Machine Learning, Jun 28, 2001

Lecture Notes in Computer Science, 2005
We consider a general class of forecasting protocols, called "linear protocols", and discuss several important special cases, including multi-class forecasting. Forecasting is formalized as a game between three players: Reality, whose role is to generate observations; Forecaster, whose goal is to predict the observations; and Skeptic, who tries to make money on any lack of agreement between Forecaster's predictions and the actual observations. Our main mathematical result is that for any continuous strategy for Skeptic in a linear protocol there exists a strategy for Forecaster that does not allow Skeptic's capital to grow. This result is a meta-theorem that allows one to transform any continuous law of probability in a linear protocol into a forecasting strategy whose predictions are guaranteed to satisfy this law. We apply this meta-theorem to a weak law of large numbers in Hilbert spaces to obtain a version of the K29 prediction algorithm for linear protocols and show that this version also satisfies the attractive properties of proper calibration and resolution under a suitable choice of its kernel parameter, with no assumptions about the way the data is generated.
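The following sketch illustrates the defensive-forecasting idea behind such results in the simplest binary-outcome setting: the forecast is chosen so that one particular continuous Skeptic strategy (a kernel-weighted sum of past residuals) cannot increase its capital. The restriction to binary outcomes, the RBF kernel, and this specific Skeptic strategy are simplifying assumptions for illustration, not the paper's linear-protocol construction.

```python
# Illustrative defensive forecasting for binary outcomes: neutralise a
# kernel-weighted Skeptic whose one-step gain is s(p_n) * (y_n - p_n).
import numpy as np
from scipy.optimize import brentq

def rbf(u, v, gamma=1.0):
    """Gaussian kernel on concatenated (forecast, features) points."""
    u, v = np.asarray(u, dtype=float), np.asarray(v, dtype=float)
    return float(np.exp(-gamma * np.sum((u - v) ** 2)))

def defensive_forecast(history, x_new, gamma=1.0):
    """history: list of (x, p, y) triples with y in {0, 1}; returns the next forecast."""
    if not history:
        return 0.5
    def s(p):  # kernel-weighted sum of past residuals as a function of the candidate forecast
        return sum(rbf((p_i, *np.atleast_1d(x_i)), (p, *np.atleast_1d(x_new)), gamma) * (y_i - p_i)
                   for x_i, p_i, y_i in history)
    if s(1.0) > 0:              # forecasting 1 makes Skeptic's gain s(1) * (y - 1) <= 0
        return 1.0
    if s(0.0) < 0:              # forecasting 0 makes Skeptic's gain s(0) * y <= 0
        return 0.0
    return brentq(s, 0.0, 1.0)  # otherwise s changes sign on [0, 1]: pick a root
```

In each case the chosen forecast makes Skeptic's one-step gain non-positive, which is the mechanism the meta-theorem generalizes to arbitrary continuous strategies in linear protocols.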

Venn Machine is a recently developed machine learning framework for reliable probabilistic prediction of the labels of new examples. This work proposes a way to extend the Venn Machine to the framework known as Learning Under Privileged Information: some additional features are available for a part of the training set, but are missing for the example being predicted. We suggest making use of this information by performing a taxonomy transfer, where the taxonomy is the core component of the Venn Machine framework, so that the transfer is done from the examples with additional information to the examples without it.
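For context, here is a minimal sketch of a basic Venn predictor — not the privileged-information taxonomy transfer proposed in this work. The taxonomy here simply assigns each example the label of its nearest neighbour, and the multi-probability prediction is the set of label-frequency distributions obtained in the new example's category, one per candidate label.

```python
# Minimal sketch of a basic Venn predictor with a nearest-neighbour taxonomy
# (illustrative only; the paper's taxonomy-transfer method is not shown here).
import numpy as np

def nn_taxonomy(X, y, i):
    """Category of example i = label of its nearest neighbour within the bag."""
    d = np.linalg.norm(X - X[i], axis=1)
    d[i] = np.inf
    return y[np.argmin(d)]

def venn_predict(X_train, y_train, x_new):
    """Return one empirical label distribution per candidate label
    (the Venn 'multi-probability' prediction)."""
    labels = np.unique(y_train)
    distributions = {}
    for candidate in labels:
        X = np.vstack([X_train, x_new])
        y = np.append(y_train, candidate)
        cats = np.array([nn_taxonomy(X, y, i) for i in range(len(y))])
        in_cat = cats == cats[-1]          # category containing the new example
        distributions[candidate] = {lab: float(np.mean(y[in_cat] == lab))
                                    for lab in labels}
    return distributions
```

The lower and upper probabilities reported for each label are then the minimum and maximum of that label's frequency across the returned distributions.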
Lecture Notes in Computer Science, 2018

Security and Privacy Trends in the Industrial Internet of Things, 2019
Some type of privacy-preserving transformation must be applied to any data record from the Industrial Internet of Things (IIoT) before it is disclosed to researchers or analysts. Based on existing privacy models such as Differential Privacy (DP) and k-anonymity, we extend the DP model to explicitly incorporate feature dependencies and to produce guarantees of privacy in a probabilistic form that generalizes k-anonymity. We assume that additional (external) knowledge of these relations and models can be represented in the form of joint probability distributions or dependence measures such as Mutual Information (MI). We propose an enhanced definition of DP in conjunction with a realisation for non-randomizing anonymization strategies such as binning, reducing the extent of binning required and preserving more valuable information for researchers. This allows the formulation of privacy conditions over the evolving set of features, such that each feature can be assigned its own privacy-budget allowance. As a case study, we consider an example from the Industrial Medical Internet of Things (IMIoT). We have identified some challenges that are not completely addressed by existing privacy models. Unlike physiological measurements in conventional medical environments, IMIoT is likely to result in duplicate and overlapping measurements, which can be associated with different personally identifiable items of information. As an example, we present a model of sequential feature collection.
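One illustrative way to operationalise a per-feature privacy budget — not the paper's exact formulation — is to pick, for each feature, the finest binning whose mutual information with a sensitive attribute stays within that feature's allowance:

```python
# Illustrative sketch: choose the finest quantile binning of a numeric feature
# whose mutual information (in nats) with a sensitive attribute stays within
# that feature's privacy budget.  Budgets and the MI criterion are assumptions
# made for this example, not the paper's definition.
import numpy as np
from sklearn.metrics import mutual_info_score

def bin_feature_under_budget(values, sensitive, budget, candidate_bins=(32, 16, 8, 4, 2)):
    """Return (binned_feature, n_bins) for the most fine-grained admissible binning."""
    for n_bins in candidate_bins:                    # from finest to coarsest
        edges = np.quantile(values, np.linspace(0, 1, n_bins + 1))
        binned = np.digitize(values, edges[1:-1])    # bin index per record
        if mutual_info_score(sensitive, binned) <= budget:
            return binned, n_bins
    return np.zeros_like(values, dtype=int), 1       # fall back to a single bin
```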
The aim of this work is to discuss abnormality detection and explanation challenges motivated by the Medical Internet of Things. First, each feature is a measurement taken by a sensor at a particular moment in time, so abnormality detection also becomes a sequential process. Second, the anomaly detection process cannot rely on having a large collection of data records; instead, it must draw on knowledge provided by experts.

Procedia Computer Science, 2017
The Medical Internet of Things (MIoT) has applications beyond clinical settings, including outpatient and care environments where monitoring occurs over public networks and may involve non-dedicated devices. This poses a number of security and privacy challenges, exacerbated by a heterogeneous and dynamic environment, while standards for handling the personally identifiable and medical information of patients, and in some cases caregivers, must still be maintained. Whilst risk and threat assessments generally assume a stable and well-defined environment, this assumption cannot be made in MIoT environments, where devices may be added, removed, or changed in their configuration, including their connectivity to server back ends. Conducting a complete threat assessment for each such configuration change is infeasible. In this paper, we seek to define a mechanism for prioritising MIoT threats and aspects of the analysis that are likely to be affected by composition and related alterations. We propose a mechanism based on the UK HMG IS1 approach and provide a case study in the form of the Technology Integrated Health Management (TIHM) test bed.
Advances in Science, Technology and Engineering Systems Journal, 2017
Conformal Prediction is a recently developed framework for reliable confident predictions. In this work we discuss its possible application to big data coming from different, possibly heterogeneous data sources. Using the anomaly detection problem as an example, we study the question of preserving the validity of Conformal Prediction in this setting. We show that the straightforward averaging approach is invalid, while the easy alternative of taking the maximum is not very efficient because of its conservativeness. We propose a third, compromise approach that is valid but much less conservative. It is supported by both theoretical justification and experimental results in the area of energy engineering.
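The sketch below contrasts the combination rules discussed here for p-values coming from several sources. The "compromise" shown — twice the arithmetic mean, which is known to be a valid p-value under arbitrary dependence — is a standard rule used purely for illustration; it is not the specific method proposed in the paper.

```python
# Contrasting ways of combining p-values from several data sources.
import numpy as np

def combine_average(p):
    """Plain average: not valid in general (can understate the combined p-value)."""
    return float(np.mean(p))

def combine_maximum(p):
    """Maximum: valid under arbitrary dependence, but often very conservative."""
    return float(np.max(p))

def combine_twice_average(p):
    """Twice the average, capped at 1: valid under arbitrary dependence and
    often less conservative than the maximum (illustrative compromise only)."""
    return float(min(1.0, 2.0 * np.mean(p)))
```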

arXiv (Cornell University), Apr 9, 2009
In this paper we apply computer learning methods to diagnosing ovarian cancer using the level of the standard biomarker CA125 in conjunction with information provided by mass-spectrometry. We are working with a new data set collected over a period of 7 years. Using the level of CA125 and mass-spectrometry peaks, our algorithm gives probability predictions for the disease. To estimate classification accuracy we convert probability predictions into strict predictions. Our algorithm makes fewer errors than almost any linear combination of the CA125 level and one peak's intensity (taken on the log scale). To check the power of our algorithm we use it to test the hypothesis that CA125 and the peaks do not contain useful information for the prediction of the disease at a particular time before the diagnosis. Our algorithm produces p-values that are better than those produced by the algorithm that has been previously applied to this data set. Our conclusion is that the proposed algorithm is more reliable for prediction on new data.

arXiv (Cornell University), Apr 15, 2012
A standard assumption in machine learning is the exchangeability of data, which is equivalent to assuming that the examples are generated from the same probability distribution independently. This paper is devoted to testing the assumption of exchangeability on-line: the examples arrive one by one, and after receiving each example we would like to have a valid measure of the degree to which the assumption of exchangeability has been falsified. Such measures are provided by exchangeability martingales. We extend known techniques for constructing exchangeability martingales and show that our new method is competitive with the martingales introduced before. Finally we investigate the performance of our testing method on two benchmark datasets, USPS and Statlog Satellite data; for the former, the known techniques give satisfactory results, but for the latter our new, more flexible method becomes necessary.
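As a concrete example of an exchangeability martingale, the sketch below implements the classical power martingale driven by smoothed conformal p-values: under exchangeability these p-values are independent and uniform on (0, 1], so the product below is a non-negative martingale with initial value 1, and by Ville's inequality the probability that it ever exceeds C is at most 1/C. The paper's more flexible constructions go beyond this fixed-parameter rule; it is shown only to illustrate the mechanism.

```python
# Power martingale for on-line testing of exchangeability, computed in log space
# for numerical stability.  Input: smoothed conformal p-values in (0, 1].
import numpy as np

def power_martingale(p_values, eps=0.92):
    """Running values of the power martingale with betting factor eps * p**(eps - 1)."""
    p = np.asarray(p_values, dtype=float)
    log_m = np.cumsum(np.log(eps) + (eps - 1.0) * np.log(p))
    return np.exp(log_m)
```

A large final (or running maximum) value of the martingale is then interpreted as evidence that the exchangeability assumption has been falsified.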
Chapman and Hall/CRC eBooks, Dec 19, 2011