Abstract
Classification is a frequently encountered data mining
problem. Decision tree techniques are widely used
to build classification models because such models closely
resemble human reasoning and are easy to understand.
Many real-world classification problems are
cost-sensitive, meaning that different types of
misclassification errors are not equally costly. When
the costs of the different types of misclassification
errors are not precisely known, no single tree can be
singled out as best, because different decision trees
may excel under different cost settings. Instead, a set
of non-dominated decision trees should be developed and
presented to the decision maker for consideration.
This paper proposes a multi-objective genetic
programming approach to developing such alternative
Pareto-optimal decision trees. The approach also allows
the decision maker to specify partial preferences on the
conflicting objectives, such as false negatives vs.
false positives, sensitivity vs. specificity, and recall
vs. precision, to further reduce the number of
alternative solutions. A diabetes prediction problem
and a credit card application approval problem are used
to illustrate the application of the proposed
approach.
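
As an illustration of the two ideas named above (non-dominated trees and partial preferences), the following is a minimal sketch and not the paper's genetic programming procedure. It assumes each candidate tree is summarized by a hypothetical pair of error counts (false negatives, false positives), and it assumes that a partial preference takes the form of bounds `r_lo` and `r_hi` on how many times more costly a false negative is than a false positive. The function names and the sample numbers are invented for the example.

```python
from typing import List, Tuple

# Assumption: a tree is summarized as (false_negatives, false_positives).
ErrorCounts = Tuple[int, int]


def dominates(a: ErrorCounts, b: ErrorCounts) -> bool:
    """True if a is no worse than b on both error types and differs from b."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b


def pareto_front(trees: List[ErrorCounts]) -> List[ErrorCounts]:
    """Keep only the trees that no other tree dominates."""
    return [t for t in trees if not any(dominates(o, t) for o in trees)]


def preferred(a: ErrorCounts, b: ErrorCounts, r_lo: float, r_hi: float) -> bool:
    """True if a costs no more than b for every cost ratio in [r_lo, r_hi].

    Assumed cost model: cost = r * false_negatives + false_positives.
    The cost is linear in r, so checking the two endpoints is sufficient.
    """
    def cost(t: ErrorCounts, r: float) -> float:
        return r * t[0] + t[1]

    ratios = (r_lo, r_hi)
    no_worse = all(cost(a, r) <= cost(b, r) for r in ratios)
    better = any(cost(a, r) < cost(b, r) for r in ratios)
    return no_worse and better


def reduce_front(front: List[ErrorCounts], r_lo: float, r_hi: float) -> List[ErrorCounts]:
    """Drop trees that another tree beats over the whole ratio interval."""
    return [t for t in front
            if not any(preferred(o, t, r_lo, r_hi) for o in front)]


# Hypothetical error counts for five candidate trees.
trees = [(10, 2), (6, 5), (3, 9), (1, 20), (7, 6)]
front = pareto_front(trees)
# Assumed preference: a false negative costs 1 to 3 times a false positive.
reduced = reduce_front(front, r_lo=1.0, r_hi=3.0)
print(front)
print(reduced)
```

With no preference information, every non-dominated tree stays in play. Narrowing the assumed cost-ratio interval removes the trees that are never the cheaper option within it, which is the sense in which partial preferences shrink the set of alternatives.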