
Non-Disjoint Discretization for Naive-Bayes Classifiers

Proceedings of the Nineteenth International Conference on Machine Learning (ICML '02), pages 666-673. San Francisco: Morgan Kaufmann, 2002.

Abstract

Previous discretization techniques have discretized numeric attributes into disjoint intervals. We argue that this is neither necessary nor appropriate for naive-Bayes classifiers. The analysis leads to a new discretization method, Non-Disjoint Discretization (NDD). NDD forms overlapping intervals for a numeric attribute, always locating a value toward the middle of its discretized interval to obtain more reliable probability estimation. It also adjusts the number and size of discretized intervals to the number of training instances, seeking an appropriate trade-off between bias and variance of probability estimation. We justify NDD in theory and test it on a wide cross-section of datasets. Our experimental results suggest that for naive-Bayes classifiers, NDD works better than alternative discretization approaches.
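The abstract describes two key ideas: intervals may overlap so that each value sits near the middle of its own interval, and interval size grows with the number of training instances. A minimal sketch of that scheme is below; the function names, the sqrt(n) sizing rule, and the choice of building each interval from three consecutive atomic intervals are illustrative assumptions, not taken verbatim from the paper.

```python
import math

def ndd_intervals(values):
    """Sketch of overlapping (non-disjoint) discretization.

    Assumption: atomic intervals each hold roughly sqrt(n) sorted
    training values, and each discretized interval is the union of
    three consecutive atomic intervals, so a value always falls in
    the middle atomic interval of the interval assigned to it.
    """
    vals = sorted(values)
    n = len(vals)
    size = max(1, round(math.sqrt(n)))  # interval size adjusts to n
    # split the sorted values into consecutive atomic intervals
    atomic = [vals[i:i + size] for i in range(0, n, size)]
    # each discretized interval spans three consecutive atomic intervals
    intervals = []
    for i in range(len(atomic)):
        lo = atomic[max(0, i - 1)][0]
        hi = atomic[min(len(atomic) - 1, i + 1)][-1]
        intervals.append((lo, hi))
    return atomic, intervals

def interval_for(value, atomic, intervals):
    # assign the interval whose middle atomic interval contains the value,
    # so the value lies toward the centre of its discretized interval
    for i, chunk in enumerate(atomic):
        if value <= chunk[-1] or i == len(atomic) - 1:
            return intervals[i]
```

Note that consecutive intervals overlap (e.g. a value near an atomic-interval boundary is still interior to its assigned interval), which is what distinguishes this from conventional disjoint discretization; probability estimates for naive Bayes would then be computed per assigned interval.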
