Correcting the Usage of the Hoeffding Inequality in Stream Mining

Advances in Intelligent Data Analysis XII, volume 8207 of Lecture Notes in Computer Science, Springer Berlin Heidelberg (2013)
DOI: 10.1007/978-3-642-41398-8_26

Abstract

Many stream classification algorithms use the Hoeffding Inequality to identify the best split attribute during tree induction. We show that the prerequisites of the Inequality are violated by these algorithms, and we propose corrective steps. The new stream classification core, correctedVFDT, satisfies the prerequisites of the Hoeffding Inequality and thus provides the expected performance guarantees. The goal of our work is not to improve accuracy, but to guarantee a reliable and interpretable error bound. Nonetheless, we show that our solution achieves lower error rates regarding split attributes and sooner split decisions while maintaining a similar level of accuracy.
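
For orientation, the conventional split test that the abstract refers to can be sketched as follows. This is a minimal illustration of the textbook Hoeffding bound as commonly applied in VFDT-style learners, not the corrected procedure proposed in the paper. The function names `hoeffding_bound` and `should_split` and all parameter names are hypothetical, chosen here for illustration only.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon such that, for n independent observations of a random
    variable with range `value_range`, the sample mean lies within epsilon
    of the true mean with probability at least 1 - delta."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Conventional (uncorrected) test: split when the observed gap between
    the two best attributes exceeds the Hoeffding bound.
    Assumption: the gains are treated as if they satisfied the prerequisites
    of the inequality, which is exactly the usage the paper questions."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Hypothetical numbers, for illustration only.
    eps = hoeffding_bound(value_range=1.0, delta=1e-7, n=200)
    print(f"epsilon = {eps:.4f}")
    print(should_split(0.5, 0.1, value_range=1.0, delta=1e-7, n=200))
```

The bound shrinks as more instances arrive, so a fixed gap between two attributes eventually triggers a split. Whether the quantities being compared actually meet the prerequisites of the inequality is the question the paper addresses.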
