Article

On bias plus variance

D. Wolpert.
Neural Computation, 9 (6): 1211--1243 (1997)
DOI: http://dx.doi.org/10.1162/neco.1997.9.6.1211

Abstract

This article presents several additive corrections to the conventional quadratic loss bias-plus-variance formula. One of these corrections is appropriate when both the target is not fixed (as in Bayesian analysis) and training sets are averaged over (as in the conventional bias-plus-variance formula). Another additive correction casts conventional fixed-training-set Bayesian analysis directly in terms of bias plus variance. Another correction is appropriate for measuring full generalization error over a test set rather than (as with conventional bias plus variance) error at a single point. Yet another correction can help explain the recent counterintuitive bias-variance decomposition of Friedman for zero-one loss. After presenting these corrections, this article discusses some other loss function-specific aspects of supervised learning. In particular, there is a discussion of the fact that if the loss function is a metric (e.g., zero-one loss), then there is a bound on the change in generalization error accompanying changing the algorithm's guess from h1 to h2, a bound that depends only on h1 and h2 and not on the target. This article ends by presenting versions of the bias-plus-variance formula appropriate for logarithmic and quadratic scoring, and then all the additive corrections appropriate to those formulas. All the correction terms presented are covariances between the learning algorithm and the posterior distribution over targets. Accordingly, in the (very common) contexts in which those terms apply, there is not a “bias-variance trade-off” or a “bias-variance dilemma,” as one often hears. Rather there is a bias-variance-covariance trade-off.
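
For orientation, the conventional quadratic-loss decomposition to which the abstract's corrections are added, and the triangle-inequality argument behind the metric bound, can be sketched as follows. The notation is chosen here for illustration and is not the paper's: $f$ is a fixed target function, $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$ independent of the training set $D$, $h_D$ is the hypothesis learned from $D$, and $L$ is a loss function that is a metric. The covariance correction terms themselves are not reproduced, since the abstract does not state their form.

```latex
% Conventional bias-plus-variance at a single test point x, fixed target f,
% averaging over training sets D and noise \varepsilon.
\mathbb{E}_{D,\varepsilon}\big[(y - h_D(x))^2\big]
  = \underbrace{\sigma^2}_{\text{noise}}
  + \underbrace{\big(f(x) - \mathbb{E}_D[h_D(x)]\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\big[(h_D(x) - \mathbb{E}_D[h_D(x)])^2\big]}_{\text{variance}}

% Metric loss: symmetry and the triangle inequality give a pointwise bound,
% and taking the expectation over targets y preserves it.
\big|L(h_1, y) - L(h_2, y)\big| \le L(h_1, h_2)
\quad\Longrightarrow\quad
\big|\mathbb{E}_y[L(h_1, y)] - \mathbb{E}_y[L(h_2, y)]\big| \le L(h_1, h_2)
```

The right-hand side of the second line involves only the two guesses $h_1$ and $h_2$, which is the sense in which the bound does not depend on the target.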

BibTeX key: Wolpert1997a
entry type: article
address: Cambridge, MA, USA
year: 1997
journal: Neural Computation
number: 6
pages: 1211--1243
publisher: MIT Press
volume: 9
timestamp: 2009.07.07
username: jabreftest
issn: 0899-7667
file: Wolpert1997a.pdf:1997/Wolpert1997a.pdf:PDF
groups: public
DOI: http://dx.doi.org/10.1162/neco.1997.9.6.1211


Cite this publication

@article{Wolpert1997a,
  abstract = {This article presents several additive corrections to the conventional quadratic loss bias-plus-variance formula. One of these corrections is appropriate when both the target is not fixed (as in Bayesian analysis) and training sets are averaged over (as in the conventional bias-plus-variance formula). Another additive correction casts conventional fixed-training-set Bayesian analysis directly in terms of bias plus variance. Another correction is appropriate for measuring full generalization error over a test set rather than (as with conventional bias plus variance) error at a single point. Yet another correction can help explain the recent counterintuitive bias-variance decomposition of Friedman for zero-one loss. After presenting these corrections, this article discusses some other loss function-specific aspects of supervised learning. In particular, there is a discussion of the fact that if the loss function is a metric (e.g., zero-one loss), then there is a bound on the change in generalization error accompanying changing the algorithm's guess from h1 to h2, a bound that depends only on h1 and h2 and not on the target. This article ends by presenting versions of the bias-plus-variance formula appropriate for logarithmic and quadratic scoring, and then all the additive corrections appropriate to those formulas. All the correction terms presented are covariances between the learning algorithm and the posterior distribution over targets. Accordingly, in the (very common) contexts in which those terms apply, there is not a “bias-variance trade-off” or a “bias-variance dilemma,” as one often hears. Rather there is a bias-variance-covariance trade-off.},
  added-at = {2012-07-13T11:59:10.000+0200},
  address = {Cambridge, MA, USA},
  author = {Wolpert, David H.},
  biburl = {https://www.bibsonomy.org/bibtex/2dbf749a18f5507adbbd64e5906686ee0/jabreftest},
  doi = {http://dx.doi.org/10.1162/neco.1997.9.6.1211},
  file = {Wolpert1997a.pdf:1997/Wolpert1997a.pdf:PDF},
  groups = {public},
  interhash = {4755c28d93ffc29cebe5b31bc567f703},
  intrahash = {dbf749a18f5507adbbd64e5906686ee0},
  issn = {0899-7667},
  journal = {Neural Computation},
  keywords = {},
  number = 6,
  pages = {1211--1243},
  publisher = {MIT Press},
  timestamp = {2012-07-13T11:59:10.000+0200},
  title = {On bias plus variance},
  username = {jabreftest},
  volume = 9,
  year = 1997
}
