Probability Machines

2012, Methods of Information in Medicine

Abstract

Background: Most machine learning approaches only provide a classification for binary responses. However, probabilities are required for risk estimation using individual patient characteristics. It has been shown recently that every statistical learning machine known to be consistent for a nonparametric regression problem is a probability machine that is provably consistent for this estimation problem.

Objectives: The aim of this paper is to show how random forests and nearest neighbors can be used for consistent estimation of individual probabilities.

Methods: Two random forest algorithms and two nearest neighbor algorithms are described in detail for estimation of individual probabilities. We discuss the consistency of random forests, nearest neighbors and other learning machines in detail. We conduct a simulation study to illustrate the validity of the methods. We exemplify the algorithms by analyzing two well-known data sets on the diagnosis of appendicitis and the diagnosi...
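
The idea of using a regression machine as a probability machine can be illustrated with a minimal sketch. It is not one of the paper's algorithms: it assumes a plain k-nearest-neighbor regression on a 0/1 response, where the average response among the neighbors serves as the estimate of the individual probability. The function name `knn_probability`, the synthetic data, and the choice of `k` are all hypothetical and chosen only for illustration.

```python
import math
import random


def knn_probability(train_x, train_y, query, k):
    """Estimate P(y = 1 | x = query) by regressing on the 0/1 response.

    The estimate is the mean response among the k training points
    closest to the query in Euclidean distance.
    """
    # Rank training points by distance to the query (ties keep input order).
    order = sorted(range(len(train_x)),
                   key=lambda i: math.dist(train_x[i], query))
    nearest = order[:k]
    return sum(train_y[i] for i in nearest) / k


def true_probability(x):
    """Hypothetical data-generating model used only for this illustration."""
    return 1.0 / (1.0 + math.exp(-3.0 * x))


# Synthetic one-dimensional data: the response is Bernoulli with a
# probability that depends on x, so a class label alone would hide the risk.
rng = random.Random(0)
train_x = [(rng.uniform(-2.0, 2.0),) for _ in range(2000)]
train_y = [1 if rng.random() < true_probability(x[0]) else 0
           for x in train_x]

# Estimated individual probability at x = 0.5, to be compared with
# true_probability(0.5).
estimate = knn_probability(train_x, train_y, (0.5,), k=100)
print(f"estimated: {estimate:.3f}, true: {true_probability(0.5):.3f}")
```

The sketch returns a proportion rather than a majority vote, which is the distinction the abstract draws between classification and probability estimation. Consistency results of the kind the abstract mentions typically require `k` to grow with the sample size while `k/n` shrinks, so a fixed `k` here is a simplification.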
