2006
…
10 pages
1 file
Accurate probability estimation by learning models is desirable in some practical applications, such as medical diagnosis. In this paper, we empirically study traditional decision-tree learning models and their variants in terms of probability estimation, measured by Conditional Log Likelihood (CLL). We also compare decision-tree learning with other representative learning models with respect to probability estimation: Naïve Bayes, Naïve Bayes Tree, Bayesian Network, K-Nearest Neighbors and Support Vector Machine. Our experiments yield several interesting observations. First, among the decision-tree learning models, C4.4 yields the most precise probability estimates as measured by CLL, although it does not perform well on other evaluation criteria, such as accuracy and ranking. We provide an explanation for this and reveal the nature of CLL. Second, compared with other popular models, C4.4 achieves the best CLL. Finally, CLL does not dominate another well-established relevant measure, AUC (the Area Under the Curve of Receiver Operating Characteristics), which suggests that different decision-tree learning models should be used for different objectives. Our experiments are conducted on 36 UCI sample sets that cover a wide range of domains and data characteristics. We run all the models within the Weka machine learning platform.
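To make the CLL measure concrete, the following is a minimal sketch, not the paper's implementation. It assumes CLL is the sum of the log probabilities a model assigns to each instance's true class (natural log here; the base is an assumption), and it assumes the common description of C4.4 as an unpruned C4.5 tree whose leaf frequencies are Laplace-smoothed. The function names and the leaf counts are hypothetical.

```python
import math


def laplace_estimate(class_counts, label):
    """Laplace-corrected leaf probability: (n_label + 1) / (N + number_of_classes)."""
    total = sum(class_counts.values())
    return (class_counts.get(label, 0) + 1) / (total + len(class_counts))


def conditional_log_likelihood(predicted, true_labels):
    """Sum of log P(true class | instance) over all instances.

    `predicted` is a list of dicts mapping class label -> estimated probability.
    """
    return sum(math.log(p[y]) for p, y in zip(predicted, true_labels))


# Hypothetical pure leaf: 3 positive training instances, 0 negative.
leaf = {"pos": 3, "neg": 0}
smoothed = {c: laplace_estimate(leaf, c) for c in leaf}

# A negative test instance falling in this leaf still gets a finite score,
# whereas the raw frequency 0/3 would make log(0) undefined.
cll = conditional_log_likelihood([smoothed], ["neg"])
print(smoothed, cll)
```

The sketch illustrates one reason smoothed estimates can matter for CLL: a single zero-probability prediction on the true class would otherwise make the whole sum undefined, regardless of how accurate the classifier is.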
Journal of medical systems, 2002
In medical decision making (classification, diagnosing, etc.) there are many situations where decisions must be made effectively and reliably. Conceptually simple decision making models with the possibility of automatic learning are the most appropriate for performing such tasks. Decision trees are a reliable and effective decision making technique that provides high classification accuracy with a simple representation of gathered knowledge, and they have been used in different areas of medical decision making. In the paper we present the basic characteristics of decision trees and the successful alternatives to the traditional induction approach, with an emphasis on existing and possible future applications in medicine.
International Journal of Trend in Scientific Research and Development, 2019
Data mining techniques are being rapidly developed for many applications. In recent years, data mining in healthcare has become an emerging field of research and development of intelligent medical diagnosis systems. Classification is a major research topic in data mining, and decision trees are popular methods for classification. In this paper, several decision tree classifiers are used for the diagnosis of medical datasets. The AD Tree, J48, NB Tree, Random Tree and Random Forest algorithms are used for the analysis. A heart disease dataset, a diabetes dataset and a hepatitis disorder dataset are used to test the decision tree models.
Neural Computing and Applications, 2012
Decision support systems help physicians and also play an important role in medical decision-making. They are based on different models, and the best of them provide an explanation together with an accurate, reliable and quick response. This paper presents a decision support tool for the detection of breast cancer based on three types of decision tree classifiers: single decision tree (SDT), boosted decision tree (BDT) and decision tree forest (DTF). Decision tree classification provides a rapid and effective method of categorizing data sets. Decision-making is performed in two stages: training the classifiers with features from the Wisconsin breast cancer data set, and then testing. The performance of the proposed structure is evaluated in terms of accuracy, sensitivity, specificity, confusion matrix and receiver operating characteristic (ROC) curves. The results showed that the overall accuracies of SDT and BDT in the training phase reached 97.07 % with 429 correct classifications and 98.83 % with 437 correct classifications, respectively. BDT performed better than SDT on all performance indices. The ROC value and Matthews correlation coefficient (MCC) for BDT in the training phase reached 0.99971 and 0.9746, respectively, which was superior to the SDT classifier. During the validation phase, DTF achieved 97.51 %, which was superior to the SDT (95.75 %) and BDT (97.07 %) classifiers. The ROC value and MCC for DTF reached 0.99382 and 0.9462, respectively. BDT showed the best performance in terms of sensitivity, and SDT was the best only when considering speed. Keywords: Computer-aided diagnosis (CAD) · Decision support systems (DSS) · Decision tree classification · Single decision tree · Boosted decision tree · Decision tree forest · k-fold cross-validation
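The evaluation criteria named in this abstract (accuracy, sensitivity, specificity, MCC) all derive from the four cells of a binary confusion matrix. The sketch below shows the standard formulas; the function name and the example counts are hypothetical and are not taken from the paper's results.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Standard binary-classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    # MCC denominator is zero when any row or column of the matrix is empty;
    # returning 0.0 in that case is a common convention, assumed here.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }


# Hypothetical counts for illustration only.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Unlike accuracy, MCC uses all four cells symmetrically, so it stays informative when the two classes are unbalanced.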
International Journal of Computer …, 2011
In data mining, classification is one of the most significant techniques, with applications in fraud detection, artificial intelligence, medical diagnosis and many other fields. Classification of objects into predefined categories based on their features is a widely studied problem. Decision trees are very useful to physicians for diagnosing a patient's problem. Decision tree classifiers are used extensively for the diagnosis of breast tumours in ultrasonic images, ovarian cancer and heart sounds. In this paper, the performance of decision tree induction classifiers on various medical data sets is analysed in terms of accuracy and time complexity.
2019
Data mining and machine learning (ML) are increasingly at the core of many aspects of modern life. With growing concerns about the impact of relying on predictions we cannot understand, there is widespread agreement regarding the need for reliable interpretable models. One of the areas where this is particularly important is clinical decision-making. Specifically, explainable models have the potential to facilitate the elaboration of clinical guidelines and related decision-support tools. The presented research focuses on the improvement of decision tree (DT) learning, one of the most popular interpretable models, motivated by the challenges posed by clinical data. One of the limitations of interpretable DT algorithms is that they involve decisions based on strict thresholds, which can impair performance in the presence of noisy measurements. In this regard, we proposed a probabilistic method that takes into account a model of the noise in the distinct learning phases. When considering...
The decision tree is one of the classification techniques for sequential decision problems such as those in the medical domain. This paper discusses an evaluation study of different single decision tree classifiers. Various single decision tree classifiers have been extensively applied in medical decision making, and each classifies the data with a different accuracy rate. Since accuracy is crucial in medical decision making, it is important to identify the classifier with the best accuracy. The study examines the performance of fourteen single decision tree classifiers on three medical data sets, i.e. the Wisconsin breast cancer, Pima Indian diabetes and hepatitis data sets. All classifiers were trained and tested using WEKA and cross-validation. The results revealed that FT, LMT, NB tree, Random Forest and Random Tree are the five best single classifiers, as they consistently provide better classification accuracy.
IJCSE) International Journal on …, 2010
Classification is one of the most efficient and widely used data mining techniques. In classification, decision trees can handle high-dimensional data, and their representation is intuitive and generally easy for humans to assimilate. The area under the receiver operating characteristic curve (AUC) is one of the more recently adopted measures of classifier performance. In this paper, we present two novel decision tree algorithms, C4.45 and C4.55, aimed at improving the AUC value over C4.5, which is a state-of-the-art decision tree algorithm. Empirical experiments conducted on 42 benchmark datasets strongly indicate that C4.45 and C4.55 significantly outperform C4.5 on the AUC value.
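As a reminder of what the AUC value measures, the sketch below computes it through the standard rank interpretation: the probability that a randomly chosen positive instance is scored above a randomly chosen negative one, with ties counted as one half. This is a generic illustration, not the C4.45 or C4.55 algorithm; the function name and the sample scores are hypothetical.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    `labels` uses 1 for the positive class and 0 for the negative class.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative instance")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical class-membership scores from some classifier.
print(auc([0.9, 0.8, 0.7, 0.6], [1, 0, 1, 0]))
```

Because only the ordering of the scores matters, a tree can improve AUC by producing better-ranked leaf probabilities without changing any predicted class label.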
IEEE Transactions on Knowledge and Data Engineering, 2020
Clinical decision-making requires reasoning in the presence of imperfect data. DTs are a well-known decision support tool, owing to their interpretability, which is fundamental in safety-critical contexts such as medical diagnosis. However, learning DTs from uncertain data leads to poor generalization, and generating predictions for uncertain data hinders prediction accuracy. Several methods have suggested the potential of probabilistic decisions at the internal nodes in making DTs robust to uncertainty. Some approaches only employ probabilistic thresholds during evaluation. Others also consider the uncertainty in the learning phase, at the expense of increased computational complexity or reduced interpretability. The existing methods have not clarified the merit of a probabilistic approach in the distinct phases of DT learning, nor when the uncertainty is present in the training or the test data. We present a probabilistic DT approach that models measurement uncertainty as a noise distribution, independently realized: (1) when searching for the split thresholds, (2) when splitting the training instances, and (3) when generating predictions for unseen data. The soft training approaches (1, 2) achieved a regularizing effect, leading to significant reductions in DT size, while maintaining accuracy, for increased noise. Soft evaluation (3) showed no benefit in handling noise.
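The idea of a probabilistic decision at an internal node can be sketched as follows. This is an illustration only, not the authors' method: it assumes the measurement noise is zero-mean Gaussian with a known standard deviation `sigma` (the abstract only says "a noise distribution"), and the function names are hypothetical. Instead of sending an instance wholly left or right, the node splits its weight according to the probability that the true value lies on each side of the threshold.

```python
import math


def prob_left(observed, threshold, sigma):
    """P(true value <= threshold) given an observed value with Gaussian noise.

    With sigma == 0 this reduces to the usual hard split.
    """
    if sigma <= 0:
        return 1.0 if observed <= threshold else 0.0
    z = (threshold - observed) / (sigma * math.sqrt(2.0))
    return 0.5 * (1.0 + math.erf(z))  # Gaussian CDF evaluated at the threshold


def soft_split(observed, threshold, sigma, weight=1.0):
    """Distribute an instance's weight between the left and right branches."""
    p = prob_left(observed, threshold, sigma)
    return weight * p, weight * (1.0 - p)


# An observation one noise standard deviation below the threshold.
left, right = soft_split(observed=4.0, threshold=5.0, sigma=1.0)
print(left, right)
```

Observations far from the threshold behave almost like a hard split, while observations close to it contribute to both branches, which is one way such a scheme could smooth the effect of noisy measurements.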
Journal of medical systems, 2000
Decision support systems that help physicians are becoming a very important part of medical decision making. They are based on different models, and the best of them provide an explanation together with an accurate, reliable, and quick response. Among the most viable models are decision trees, which have already been used successfully for many medical decision-making purposes. Although effective and reliable, the traditional decision tree construction approach still has several deficiencies. We therefore decided to develop and compare several decision support models using four different approaches: statistical analysis; MtDeciT, a tool developed in our laboratory for building decision trees with a classical method; the well-known C5.0 tool; and a self-adapting evolutionary decision support model that uses evolutionary principles for the induction of decision trees. Several solutions were evolved for the classification of metabolic and respiratory acidosis (MRA). A comparison...
Proceedings of the 12th International Joint Conference on Biomedical Engineering Systems and Technologies, 2019
Uncertainty is an intrinsic component of clinical practice, and it manifests itself in a variety of forms. Despite the growing popularity of Machine Learning-based Decision Support Systems (ML-DSS) in the clinical domain, the effects of the uncertainty that is inherent in the medical data used to train and optimize these systems remain largely under-considered in the Machine Learning community, as well as in the health informatics community. A particularly common type of uncertainty arising in the clinical decision-making process is related to the ambiguity resulting from either a lack of decisive information (lack of evidence) or an excess of discordant information (lack of consensus). Both types of uncertainty create the opportunity for clinicians to abstain from making a clear-cut classification of the phenomenon under observation. In this work, we study a Machine Learning model endowed with the ability to work directly with both sources of imperfect information mentioned above. To investigate the possible trade-off between accuracy and uncertainty given by the possibility of abstention, we evaluated the considered model against a variety of standard Machine Learning algorithms on a real-world clinical classification problem. We report promising results in terms of commonly used performance metrics.
2010 IEEE International Symposium on Biomedical Imaging: From Nano to Macro, 2010
Fifth IEEE International Conference on Data Mining (ICDM'05)
2013
International Journal of Engineering & Technology, 2018
Machine Learning: ECML …, 2005
IEEE Transactions on Information Technology in Biomedicine, 2007
Computers and Biomedical Research, 1993
IFIP Advances in Information and Communication Technology, 2011
2006 5th International Conference on Machine Learning and Applications (ICMLA'06), 2006