
Non-Disjoint Discretization for Naive-Bayes Classifiers

Proceedings of the Nineteenth International Conference on Machine Learning (ICML '02), pages 666-673. San Francisco: Morgan Kaufmann, 2002.

Abstract

Previous discretization techniques have discretized numeric attributes into disjoint intervals. We argue that this is neither necessary nor appropriate for naive-Bayes classifiers. The analysis leads to a new discretization method, Non-Disjoint Discretization (NDD). NDD forms overlapping intervals for a numeric attribute, always locating a value toward the middle of its discretized interval to obtain more reliable probability estimation. It also adjusts the number and size of discretized intervals to the number of training instances, seeking an appropriate trade-off between bias and variance of probability estimation. We justify NDD in theory and test it on a wide cross-section of datasets. Our experimental results suggest that for naive-Bayes classifiers, NDD works better than alternative discretization approaches.
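The abstract describes two key ideas: intervals may overlap so that each value sits near the middle of its own interval, and interval size grows with the number of training instances. A minimal sketch of that scheme is below; the function names, the sqrt(n) sizing rule, and the choice of building each interval from three consecutive atomic intervals are illustrative assumptions, not taken verbatim from the paper.

```python
import math

def ndd_intervals(values):
    """Sketch of overlapping (non-disjoint) discretization.

    Assumption: atomic intervals each hold roughly sqrt(n) sorted
    training values, and each discretized interval is the union of
    three consecutive atomic intervals, so a value always falls in
    the middle atomic interval of the interval assigned to it.
    """
    vals = sorted(values)
    n = len(vals)
    size = max(1, round(math.sqrt(n)))  # interval size adjusts to n
    # split the sorted values into consecutive atomic intervals
    atomic = [vals[i:i + size] for i in range(0, n, size)]
    # each discretized interval spans three consecutive atomic intervals
    intervals = []
    for i in range(len(atomic)):
        lo = atomic[max(0, i - 1)][0]
        hi = atomic[min(len(atomic) - 1, i + 1)][-1]
        intervals.append((lo, hi))
    return atomic, intervals

def interval_for(value, atomic, intervals):
    # assign the interval whose middle atomic interval contains the value,
    # so the value lies toward the centre of its discretized interval
    for i, chunk in enumerate(atomic):
        if value <= chunk[-1] or i == len(atomic) - 1:
            return intervals[i]
```

Note that consecutive intervals overlap (e.g. a value near an atomic-interval boundary is still interior to its assigned interval), which is what distinguishes this from conventional disjoint discretization; probability estimates for naive Bayes would then be computed per assigned interval.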
