Article,

A comparison of the performance of threshold criteria for binary classification in terms of predicted prevalence and kappa

E. Freeman and G. Moisen.
Ecological Modelling, 217 (1–2): 48–58 (2008)
DOI: 10.1016/j.ecolmodel.2008.05.015

Abstract

Modelling techniques used in binary classification problems often result in a predicted probability surface, which is then translated into a presence–absence classification map. However, this translation requires a (possibly subjective) choice of threshold above which the variable of interest is predicted to be present. The selection of this threshold value can have dramatic effects on model accuracy as well as the predicted prevalence for the variable (the overall proportion of locations where the variable is predicted to be present). The traditional default is to simply use a threshold of 0.5 as the cut-off, but this does not necessarily preserve the observed prevalence or result in the highest prediction accuracy, especially for data sets with very high or very low observed prevalence. Alternatively, the thresholds can be chosen to optimize map accuracy, as judged by various criteria. Here we examine the effect of 11 of these potential criteria on predicted prevalence, prediction accuracy, and the resulting map output. Comparisons are made using output from presence–absence models developed for 13 tree species in the northern mountains of Utah. We found that species with poor model quality or low prevalence were most sensitive to the choice of threshold. For these species, a 0.5 cut-off was unreliable, sometimes resulting in substantially lower kappa and underestimated prevalence, with possible detrimental effects on a management decision. If a management objective requires a map to portray unbiased estimates of species prevalence, then the best results were obtained from thresholds deliberately chosen so that the predicted prevalence equaled the observed prevalence, followed closely by thresholds chosen to maximize kappa. These were also the two criteria with the highest mean kappa from our independent test data. For particular management applications the special cases of user specified required accuracy may be most appropriate. 
Ultimately, maps will typically have multiple and somewhat conflicting management applications. Therefore, providing users with a continuous probability surface may be the most versatile and powerful method, allowing threshold choice to be matched with each map's intended use.
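
The two best-performing criteria named in the abstract (predicted prevalence equal to observed prevalence, and maximum kappa) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the candidate threshold grid, and the toy data are assumptions made for illustration, and presence is assumed to be predicted when the probability is at or above the threshold.

```python
def confusion(observed, probs, threshold):
    """Count (tp, fp, fn, tn), predicting presence when prob >= threshold."""
    tp = fp = fn = tn = 0
    for obs, p in zip(observed, probs):
        pred = p >= threshold
        if pred and obs:
            tp += 1
        elif pred:
            fp += 1
        elif obs:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def kappa(tp, fp, fn, tn):
    """Cohen's kappa: agreement beyond what chance alone would produce."""
    n = tp + fp + fn + tn
    p_observed = (tp + tn) / n
    p_chance = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if p_chance == 1:
        return 0.0  # degenerate case: only one class observed and predicted
    return (p_observed - p_chance) / (1 - p_chance)


def predicted_prevalence(observed, probs, threshold):
    """Proportion of locations predicted present at this threshold."""
    tp, fp, fn, tn = confusion(observed, probs, threshold)
    return (tp + fp) / (tp + fp + fn + tn)


def choose_thresholds(observed, probs, candidates):
    """Pick one threshold per criterion; ties go to the lowest candidate."""
    observed_prevalence = sum(observed) / len(observed)
    by_kappa = max(
        candidates, key=lambda t: kappa(*confusion(observed, probs, t))
    )
    by_prevalence = min(
        candidates,
        key=lambda t: abs(
            predicted_prevalence(observed, probs, t) - observed_prevalence
        ),
    )
    return {"max_kappa": by_kappa, "pred_prev_eq_obs": by_prevalence}


# Hypothetical low-prevalence data: 2 presences among 10 locations.
observed = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
probs = [0.45, 0.30, 0.35, 0.10, 0.05, 0.20, 0.15, 0.02, 0.08, 0.12]
candidates = [i / 100 for i in range(1, 100)]  # assumed grid: 0.01 .. 0.99

result = choose_thresholds(observed, probs, candidates)
kappa_at_default = kappa(*confusion(observed, probs, 0.5))
```

In this toy data no predicted probability reaches 0.5, so the default cut-off predicts the species absent everywhere; both alternative criteria select lower thresholds. The numbers carry no meaning beyond showing the mechanics.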

BibTeX key: Freeman200848
entry type: article
year: 2008
journal: Ecological Modelling
number: 1–2
pages: 48–58
volume: 217
issn: 0304-3800
DOI: 10.1016/j.ecolmodel.2008.05.015
url: http://www.sciencedirect.com/science/article/pii/S0304380008002275


Cite this publication

%0 Journal Article
%1 Freeman200848
%A Freeman, Elizabeth A.
%A Moisen, Gretchen G.
%D 2008
%J Ecological Modelling
%K classification classifier predection
%N 1–2
%P 48 - 58
%R 10.1016/j.ecolmodel.2008.05.015
%T A comparison of the performance of threshold criteria for binary classification in terms of predicted prevalence and kappa
%U http://www.sciencedirect.com/science/article/pii/S0304380008002275
%V 217
%X Modelling techniques used in binary classification problems often result in a predicted probability surface, which is then translated into a presence–absence classification map. However, this translation requires a (possibly subjective) choice of threshold above which the variable of interest is predicted to be present. The selection of this threshold value can have dramatic effects on model accuracy as well as the predicted prevalence for the variable (the overall proportion of locations where the variable is predicted to be present). The traditional default is to simply use a threshold of 0.5 as the cut-off, but this does not necessarily preserve the observed prevalence or result in the highest prediction accuracy, especially for data sets with very high or very low observed prevalence. Alternatively, the thresholds can be chosen to optimize map accuracy, as judged by various criteria. Here we examine the effect of 11 of these potential criteria on predicted prevalence, prediction accuracy, and the resulting map output. Comparisons are made using output from presence–absence models developed for 13 tree species in the northern mountains of Utah. We found that species with poor model quality or low prevalence were most sensitive to the choice of threshold. For these species, a 0.5 cut-off was unreliable, sometimes resulting in substantially lower kappa and underestimated prevalence, with possible detrimental effects on a management decision. If a management objective requires a map to portray unbiased estimates of species prevalence, then the best results were obtained from thresholds deliberately chosen so that the predicted prevalence equaled the observed prevalence, followed closely by thresholds chosen to maximize kappa. These were also the two criteria with the highest mean kappa from our independent test data. For particular management applications the special cases of user specified required accuracy may be most appropriate. Ultimately, maps will typically have multiple and somewhat conflicting management applications. Therefore, providing users with a continuous probability surface may be the most versatile and powerful method, allowing threshold choice to be matched with each map's intended use.

@article{Freeman200848,
  abstract = {Modelling techniques used in binary classification problems often result in a predicted probability surface, which is then translated into a presence–absence classification map. However, this translation requires a (possibly subjective) choice of threshold above which the variable of interest is predicted to be present. The selection of this threshold value can have dramatic effects on model accuracy as well as the predicted prevalence for the variable (the overall proportion of locations where the variable is predicted to be present). The traditional default is to simply use a threshold of 0.5 as the cut-off, but this does not necessarily preserve the observed prevalence or result in the highest prediction accuracy, especially for data sets with very high or very low observed prevalence. Alternatively, the thresholds can be chosen to optimize map accuracy, as judged by various criteria. Here we examine the effect of 11 of these potential criteria on predicted prevalence, prediction accuracy, and the resulting map output. Comparisons are made using output from presence–absence models developed for 13 tree species in the northern mountains of Utah. We found that species with poor model quality or low prevalence were most sensitive to the choice of threshold. For these species, a 0.5 cut-off was unreliable, sometimes resulting in substantially lower kappa and underestimated prevalence, with possible detrimental effects on a management decision. If a management objective requires a map to portray unbiased estimates of species prevalence, then the best results were obtained from thresholds deliberately chosen so that the predicted prevalence equaled the observed prevalence, followed closely by thresholds chosen to maximize kappa. These were also the two criteria with the highest mean kappa from our independent test data. For particular management applications the special cases of user specified required accuracy may be most appropriate. Ultimately, maps will typically have multiple and somewhat conflicting management applications. Therefore, providing users with a continuous probability surface may be the most versatile and powerful method, allowing threshold choice to be matched with each map's intended use.},
  added-at = {2012-06-08T17:10:32.000+0200},
  author = {Freeman, Elizabeth A. and Moisen, Gretchen G.},
  biburl = {https://www.bibsonomy.org/bibtex/284d15078f49056cbc06b44315deb10ac/iyas_hilal},
  description = {ScienceDirect.com - Ecological Modelling - A comparison of the performance of threshold criteria for binary classification in terms of predicted prevalence and kappa},
  doi = {10.1016/j.ecolmodel.2008.05.015},
  interhash = {025e153a4065bfbd3b4c193a40b2127a},
  intrahash = {84d15078f49056cbc06b44315deb10ac},
  issn = {0304-3800},
  journal = {Ecological Modelling},
  keywords = {classification classifier predection},
  number = {1--2},
  pages = {48--58},
  timestamp = {2012-06-08T17:10:32.000+0200},
  title = {A comparison of the performance of threshold criteria for binary classification in terms of predicted prevalence and kappa},
  url = {http://www.sciencedirect.com/science/article/pii/S0304380008002275},
  volume = {217},
  year = {2008}
}
