As described by Breiman et al. [11], classification trees are generated by making repeated binary splits in multi-class feature data such that the two resulting populations are each more homogeneous than the original. Figure 1.8 shows a classification tree after the initial split of the data.
[Figure 1.8: A classification tree after the initial split of the data.]
The splits in the data are chosen to maximize a criterion designed to measure the difference in homogeneity between a node and its children. Breiman et al. [11] define two characteristics that such a criterion should meet:
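One criterion commonly used with classification trees of this kind is the decrease in Gini impurity from a parent node to its two children. The sketch below illustrates the idea; the function names and the toy labels are illustrative, not taken from the source.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def impurity_decrease(parent, left, right):
    """Gain in homogeneity from splitting `parent` into `left` and `right`.

    The child impurities are weighted by the fraction of samples each child
    receives, so the decrease is zero for an uninformative split and largest
    when the split separates the classes cleanly.
    """
    n = len(parent)
    weighted_children = (len(left) * gini(left) + len(right) * gini(right)) / n
    return gini(parent) - weighted_children

# A split that separates the two classes perfectly recovers the full
# parent impurity of 0.5.
parent = ["a", "a", "b", "b"]
print(impurity_decrease(parent, ["a", "a"], ["b", "b"]))  # 0.5
```

In a tree-growing procedure, every candidate split of a node would be scored this way and the split with the largest decrease retained.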
Decision trees were among the first classifiers adopted because they are easy to interpret: traversing the tree from the root to a leaf yields a feature-based classification rule specifying which input values cause a sample to end up at that leaf node.
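The root-to-leaf traversal above can be sketched concretely. The tree layout, feature name, and thresholds below are hypothetical, chosen only to show how a traversal accumulates the tests that make up the leaf's classification rule.

```python
# A tiny hand-built tree over a single feature "x"; the node layout
# (dicts with "feature"/"threshold" or a terminal "label") is illustrative.
tree = {
    "feature": "x", "threshold": 2.0,
    "left": {"label": "small"},            # x <= 2.0
    "right": {
        "feature": "x", "threshold": 5.0,
        "left": {"label": "medium"},       # 2.0 < x <= 5.0
        "right": {"label": "large"},       # x > 5.0
    },
}

def classify(node, sample, rule=()):
    """Walk from the root to a leaf, collecting the tests that fire on the way.

    Returns the leaf's label together with the conjunction of feature tests,
    i.e. the classification rule that sends `sample` to that leaf.
    """
    if "label" in node:
        return node["label"], list(rule)
    goes_left = sample[node["feature"]] <= node["threshold"]
    test = (node["feature"], "<=" if goes_left else ">", node["threshold"])
    child = node["left"] if goes_left else node["right"]
    return classify(child, sample, rule + (test,))

label, rule = classify(tree, {"x": 3.5})
print(label, rule)  # medium [('x', '>', 2.0), ('x', '<=', 5.0)]
```

Reading the accumulated tests as a conjunction ("x > 2.0 and x <= 5.0 implies medium") is exactly the interpretability property the text describes.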
One drawback of decision trees is that their decision boundaries are not particularly complex unless the tree itself becomes exceedingly complex. To improve on the results obtained with a single classification tree, other classifiers were developed.