Correcting the Usage of the Hoeffding Inequality in Stream Mining

P. Matuszyk, G. Krempl, and M. Spiliopoulou. Advances in Intelligent Data Analysis XII, volume 8207 of Lecture Notes in Computer Science, Springer Berlin Heidelberg, (2013)


Many stream classification algorithms use the Hoeffding Inequality to identify the best split attribute during tree induction. We show that the prerequisites of the Inequality are violated by these algorithms, and we propose corrective steps. The new stream classification core, correctedVFDT, satisfies the prerequisites of the Hoeffding Inequality and thus provides the expected performance guarantees. The goal of our work is not to improve accuracy, but to guarantee a reliable and interpretable error bound. Nonetheless, we show that our solution achieves lower error rates regarding split attributes and sooner split decisions while maintaining a similar level of accuracy.
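
For context, the bound that such algorithms typically rely on can be sketched as follows. This is a minimal illustration of the standard Hoeffding bound and of the conventional VFDT-style split test, not the correctedVFDT procedure from the paper. The function names `hoeffding_bound` and `should_split` and all parameter names are assumptions introduced here for illustration.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon for the standard Hoeffding bound.

    With probability at least 1 - delta, the mean of n independent
    observations of a random variable whose values span an interval of
    width value_range lies within epsilon of its expectation.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain: float, second_gain: float,
                 value_range: float, delta: float, n: int) -> bool:
    """Conventional split test (illustrative, hypothetical helper).

    Split when the observed gap between the two best attributes exceeds
    epsilon. This is the usage pattern the abstract calls into question,
    shown only to make the discussion concrete.
    """
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Epsilon shrinks as more instances are observed at a leaf.
    for n in (100, 1_000, 10_000):
        print(n, round(hoeffding_bound(1.0, 0.05, n), 4))
```

The inequality itself applies to the mean of independent, bounded random variables. Whether the quantity compared in `should_split` meets those prerequisites is exactly the question the paper raises.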
