Active Learning From Stream Data Using Optimal Weight Classifier Ensemble.
X. Zhu, P. Zhang, X. Lin, and Y. Shi. IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics: a publication of the IEEE Systems, Man, and Cybernetics Society (April 2010)
DOI: 10.1109/TSMCB.2010.2042445
Abstract
In this paper, we propose a new research problem on active learning
from data streams, where data volumes grow continuously, and labeling
all data is considered expensive and impractical. The objective is
to label a small portion of stream data from which a model is derived
to predict future instances as accurately as possible. To tackle
the technical challenges raised by the dynamic nature of the stream
data, i.e., increasing data volumes and evolving decision concepts,
we propose a classifier-ensemble-based active learning framework
that selectively labels instances from data streams to build a classifier
ensemble. We argue that a classifier ensemble's variance directly
corresponds to its error rate, and reducing a classifier ensemble's
variance is equivalent to improving its prediction accuracy. Because
of this, one should label instances toward the minimization of the
variance of the underlying classifier ensemble. Accordingly, we introduce
a minimum-variance (MV) principle to guide the instance labeling
process for data streams. In addition, we derive an optimal-weight
calculation method to determine the weight values for the classifier
ensemble. The MV principle and the optimal weighting module are combined
to build an active learning framework for data streams. Experimental
results on synthetic and real-world data demonstrate the performance
of the proposed work in comparison with other approaches.
%0 Journal Article
%1 Zhu2010
%A Zhu, Xingquan
%A Zhang, Peng
%A Lin, Xiaodong
%A Shi, Yong
%D 2010
%J IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics: a publication of the IEEE Systems, Man, and Cybernetics Society
%K imported
%P 4--7
%R 10.1109/TSMCB.2010.2042445
%T Active Learning From Stream Data Using Optimal Weight Classifier Ensemble.
%U http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2865404&tool=pmcentrez&rendertype=abstract
%X In this paper, we propose a new research problem on active learning
from data streams, where data volumes grow continuously, and labeling
all data is considered expensive and impractical. The objective is
to label a small portion of stream data from which a model is derived
to predict future instances as accurately as possible. To tackle
the technical challenges raised by the dynamic nature of the stream
data, i.e., increasing data volumes and evolving decision concepts,
we propose a classifier-ensemble-based active learning framework
that selectively labels instances from data streams to build a classifier
ensemble. We argue that a classifier ensemble's variance directly
corresponds to its error rate, and reducing a classifier ensemble's
variance is equivalent to improving its prediction accuracy. Because
of this, one should label instances toward the minimization of the
variance of the underlying classifier ensemble. Accordingly, we introduce
a minimum-variance (MV) principle to guide the instance labeling
process for data streams. In addition, we derive an optimal-weight
calculation method to determine the weight values for the classifier
ensemble. The MV principle and the optimal weighting module are combined
to build an active learning framework for data streams. Experimental
results on synthetic and real-world data demonstrate the performance
of the proposed work in comparison with other approaches.
@article{Zhu2010,
abstract = {In this paper, we propose a new research problem on active learning
from data streams, where data volumes grow continuously, and labeling
all data is considered expensive and impractical. The objective is
to label a small portion of stream data from which a model is derived
to predict future instances as accurately as possible. To tackle
the technical challenges raised by the dynamic nature of the stream
data, i.e., increasing data volumes and evolving decision concepts,
we propose a classifier-ensemble-based active learning framework
that selectively labels instances from data streams to build a classifier
ensemble. We argue that a classifier ensemble's variance directly
corresponds to its error rate, and reducing a classifier ensemble's
variance is equivalent to improving its prediction accuracy. Because
of this, one should label instances toward the minimization of the
variance of the underlying classifier ensemble. Accordingly, we introduce
a minimum-variance (MV) principle to guide the instance labeling
process for data streams. In addition, we derive an optimal-weight
calculation method to determine the weight values for the classifier
ensemble. The MV principle and the optimal weighting module are combined
to build an active learning framework for data streams. Experimental
results on synthetic and real-world data demonstrate the performance
of the proposed work in comparison with other approaches.},
added-at = {2011-03-27T17:20:41.000+0200},
author = {Zhu, Xingquan and Zhang, Peng and Lin, Xiaodong and Shi, Yong},
biburl = {https://www.bibsonomy.org/bibtex/2a181a0b6fcac0ced77069c811e3f046d/yevb0},
doi = {10.1109/TSMCB.2010.2042445},
interhash = {f35bd5a4eb909b531f94d57757f9711f},
intrahash = {a181a0b6fcac0ced77069c811e3f046d},
issn = {1941-0492},
journal = {IEEE transactions on systems, man, and cybernetics. Part B, Cybernetics
: a publication of the IEEE Systems, Man, and Cybernetics Society},
keywords = {imported},
month = apr,
pages = {4--7},
pmid = {20363683},
timestamp = {2011-03-27T17:21:15.000+0200},
title = {Active Learning From Stream Data Using Optimal Weight Classifier
Ensemble.},
url = {http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=2865404\&tool=pmcentrez\&rendertype=abstract},
year = 2010
}