An Efficient Algorithm for Mining Frequent Itemsets over the Entire History of Data Streams

Abstract

A data stream is a continuous, massive, rapidly changing, unbounded sequence of data elements. The nature of streaming data makes it essential to use online algorithms that require only one scan over the data for knowledge discovery. In this paper, we propose a new single-pass algorithm, called DSM-FI (Data Stream Mining for Frequent Itemsets), to mine all frequent itemsets over the entire history of data streams. DSM-FI has three major features: a single scan of the streaming data for counting itemsets’ frequency information, an extended prefix-tree-based compact pattern representation, and a top-down frequent itemset discovery scheme. Our performance study shows that DSM-FI outperforms the well-known algorithm Lossy Counting in the same streaming environment.
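
For readers unfamiliar with the baseline named in the abstract, the following is a minimal sketch of single-pass approximate frequency counting in the style of Lossy Counting. It is not DSM-FI, and it counts single items rather than itemsets to keep the example short. The function names `lossy_counting` and `frequent_items` and the parameters `epsilon` (error bound) and `support` (minimum support) are illustrative assumptions, not taken from the paper.

```python
import math


def lossy_counting(stream, epsilon):
    """One pass over `stream`; returns ({item: [count, max_error]}, n).

    Illustrative sketch for single items only (assumption: the paper's
    experiments use itemsets, which need extra machinery not shown here).
    """
    width = math.ceil(1 / epsilon)  # bucket width
    counts = {}                     # item -> [estimated count, max undercount]
    n = 0                           # number of elements seen so far
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in counts:
            counts[item][0] += 1
        else:
            # A new entry may have been missed in up to bucket - 1 earlier buckets.
            counts[item] = [1, bucket - 1]
        if n % width == 0:
            # Bucket boundary: drop entries that cannot be frequent.
            stale = [k for k, (f, d) in counts.items() if f + d <= bucket]
            for key in stale:
                del counts[key]
    return counts, n


def frequent_items(counts, n, support, epsilon):
    """Report items whose estimated count is at least (support - epsilon) * n."""
    return {k: f for k, (f, d) in counts.items() if f >= (support - epsilon) * n}


stream = ["a", "a", "a", "a", "b", "a", "c", "a", "d", "a"]
counts, n = lossy_counting(stream, epsilon=0.25)
print(frequent_items(counts, n, support=0.5, epsilon=0.25))  # {'a': 7}
```

The pruning step at each bucket boundary is what bounds memory: infrequent items are discarded as the stream advances, at the cost of undercounting any item by at most `epsilon * n`.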

BibTeX key: li:mining
entry type: inproceedings
booktitle: Proceedings of First International Workshop on Knowledge Discovery in Data Streams
year: 2004
