Mortgage Default: Classification Trees Analysis

2005, The Journal of Real Estate Finance and Economics

Abstract

We apply the powerful, flexible, and computationally efficient nonparametric Classification and Regression Trees (CART) algorithm to analyze real estate mortgage data. CART is particularly appropriate for our data set because of its strengths in dealing with large data sets, high dimensionality, mixed data types, missing data, different relationships between variables in different parts of the measurement space, and outliers. Moreover, CART is intuitive and easy to interpret and implement. We discuss the pros and cons of CART in relation to traditional methods such as linear logistic regression, nonparametric additive logistic regression, discriminant analysis, partial least squares classification, and neural networks, with particular emphasis on real estate. We use CART to produce the first academic study of Israeli mortgage default data. We find that borrowers' features, rather than mortgage contract features, are the strongest predictors of default if accepting "bad" borrowers is more costly than rejecting "good" ones. If the costs are equal, mortgage features are used as well. The higher (lower) the ratio of misclassification costs of bad risks versus good ones, the lower (higher) are the resulting misclassification rates of bad risks and the higher (lower) are the misclassification rates of good ones. This is consistent with real-world rejection of good risks in an attempt to avoid bad ones.
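
The cost-ratio effect described in the abstract can be illustrated with a small sketch. It assumes the standard minimum-expected-cost rule for labelling an applicant, which the abstract does not spell out: reject when the expected cost of accepting a bad risk exceeds the expected cost of rejecting a good one. The function names, the default probabilities, and the six toy applicants are hypothetical and are not taken from the Israeli data.

```python
def decide(p_bad, cost_accept_bad, cost_reject_good):
    """Label one applicant by minimum expected misclassification cost.

    p_bad is an assumed estimate of the applicant's default probability.
    """
    expected_cost_accept = p_bad * cost_accept_bad
    expected_cost_reject = (1.0 - p_bad) * cost_reject_good
    return "reject" if expected_cost_accept > expected_cost_reject else "accept"


def error_rates(applicants, cost_accept_bad, cost_reject_good):
    """Return (share of bad risks accepted, share of good risks rejected)."""
    bad_total = sum(1 for _, is_bad in applicants if is_bad)
    good_total = len(applicants) - bad_total
    bad_accepted = 0
    good_rejected = 0
    for p_bad, is_bad in applicants:
        decision = decide(p_bad, cost_accept_bad, cost_reject_good)
        if is_bad and decision == "accept":
            bad_accepted += 1
        if not is_bad and decision == "reject":
            good_rejected += 1
    return bad_accepted / bad_total, good_rejected / good_total


# Hypothetical applicants: (estimated default probability, actually defaulted?)
APPLICANTS = [
    (0.10, False), (0.15, False), (0.30, True),
    (0.40, False), (0.60, True), (0.80, True),
]

equal_costs = error_rates(APPLICANTS, cost_accept_bad=1, cost_reject_good=1)
bad_costlier = error_rates(APPLICANTS, cost_accept_bad=4, cost_reject_good=1)
```

On this toy data, raising the cost of accepting a bad risk from 1 to 4 lowers the share of bad risks accepted and raises the share of good risks rejected, which is the direction of the trade-off the abstract reports.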

Key takeaways

  • As far as we know, this is the first application of CART in an academic study of real estate data and the first academic mortgage default study of Israeli data.
  • To describe this important feature of CART, we will schematically describe (in Section 2) the binary classification tree that CART produces, couching the description in our example of mortgage applicants' risk assessment when necessary.
  • At each node, CART considers all available features and all possible splits on those features to choose the best feature and the best split that will create the least internally diverse pair of daughter nodes.
  • CART grows the largest tree possible, called a maximal tree, whose leaves (terminal nodes) cannot be split any further.
  • We emphasized the process of selecting a final classification tree, which depends both on the CART method and the particular subject matter at hand.
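
The split search and maximal-tree growth described in the takeaways above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: it uses the Gini index as the measure of internal diversity (a common CART choice that the text above does not name), handles numeric features only, and stops at nodes that are pure or that no feature can separate. The two features (income, loan-to-value ratio) and all values are hypothetical.

```python
def gini(labels):
    """Gini impurity of a node with 0/1 labels (1 = default)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p_bad = sum(labels) / n
    return 2.0 * p_bad * (1.0 - p_bad)


def best_split(rows, labels):
    """Search every feature and every midpoint threshold.

    Returns (weighted impurity, feature index, threshold) for the split
    whose two daughter nodes are least diverse, or None if no feature
    takes more than one value in this node.
    """
    n = len(rows)
    best = None
    for j in range(len(rows[0])):
        values = sorted(set(row[j] for row in rows))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2.0
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, j, threshold)
    return best


def grow_maximal_tree(rows, labels):
    """Split recursively until every leaf is pure or cannot be split."""
    split = None if gini(labels) == 0.0 else best_split(rows, labels)
    if split is None:
        return {"leaf": True, "n": len(labels), "p_bad": sum(labels) / len(labels)}
    _, j, threshold = split
    left = [i for i, row in enumerate(rows) if row[j] <= threshold]
    right = [i for i, row in enumerate(rows) if row[j] > threshold]
    return {
        "leaf": False,
        "feature": j,
        "threshold": threshold,
        "left": grow_maximal_tree([rows[i] for i in left], [labels[i] for i in left]),
        "right": grow_maximal_tree([rows[i] for i in right], [labels[i] for i in right]),
    }


# Hypothetical applicants: [income, loan-to-value ratio]; label 1 = defaulted.
ROWS = [[20, 0.9], [25, 0.5], [30, 0.8], [60, 0.6], [70, 0.9], [80, 0.4]]
LABELS = [1, 1, 1, 0, 0, 0]

root_split = best_split(ROWS, LABELS)
tree = grow_maximal_tree(ROWS, LABELS)
```

Here the search picks feature 0 (income) because a single threshold on it separates the toy defaulters from the non-defaulters, so the maximal tree has one split and two pure leaves. Real data would yield a much larger maximal tree, from which a final tree is then selected.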