Module 3: Classification
1. Describe the general approach to classification in machine learning.
2. Write the steps in decision tree construction.
3. How are decision trees used for classification? Explain with an example
4. Explain a. Gain Ratio
b. Gini Index
5. Explain common approaches to tree pruning
6. Why is tree pruning useful in decision tree induction? What is a drawback of using a separate set
of tuples to evaluate pruning?
7. Why is naïve Bayesian classification called “naïve”? Briefly outline the major steps of the naïve
Bayesian classification algorithm
8. Explain rule-based classification using IF-THEN rules
9. Describe Rule Induction using a sequential covering algorithm
10. Explain a. Rule Quality measures
b. Rule Pruning
11. Explain the different metrics for evaluating classifier performance.
12. Explain how well a classifier can recognize tuples of different classes using a confusion matrix
13. Explain the four additional aspects beyond accuracy-based measures that are used for
comparing classifiers.
14. Explain a. Holdout method and Random Subsampling
b. Cross-validation
15. Give statistical tests of significance for model selection between any two classification models
M1 and M2.
16. Compare classifiers based on cost-benefit and ROC curves
17. List the techniques to improve classification accuracy and explain ensemble methods
18. Write the bagging algorithm, which is used as a method of increasing accuracy
19. Write the AdaBoost algorithm, a boosting ensemble method
20. Explain how we can improve classification accuracy of imbalanced data
21. Explain a simple Bayesian belief network
22. Describe SVM as a method for classifying both linear and nonlinear data
23. Write the underlying equations of SVM for classifying linearly separable data
24. Construct a set of IF-THEN rules for the given decision tree [figure]
25. Compute the information gain and expected information of attribute age needed to
classify a tuple in D, using the following table
1.