Abstract
Faced with distribution shift between the training and test sets, we wish to
detect and quantify the shift, and to correct our classifiers without test set
labels. Motivated by medical diagnosis, where diseases (targets) cause symptoms
(observations), we focus on label shift, where the label marginal $p(y)$
changes but the conditional $p(x \mid y)$ does not. We propose Black Box Shift
Estimation (BBSE) to estimate the test distribution $p(y)$. BBSE exploits
arbitrary black box predictors to reduce dimensionality prior to shift
correction. While better predictors give tighter estimates, BBSE works even
when predictors are biased, inaccurate, or uncalibrated, so long as their
confusion matrices are invertible. We prove BBSE's consistency, bound its
error, and introduce a statistical test that uses BBSE to detect shift. We also
leverage BBSE to correct classifiers. Experiments demonstrate accurate
estimates and improved prediction, even on high-dimensional datasets of natural
images.
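
The estimation step can be illustrated with a minimal sketch. It assumes hard (argmax) predictions from a black box classifier $f$ on a labeled held-out set drawn from the training distribution and on the unlabeled test set. It estimates the joint confusion matrix $C_{ij} \approx p(f(x)=i, y=j)$ from the held-out set and the prediction frequencies $\mu_i$ on the test set, then solves $C w = \mu$ for the importance weights $w$; invertibility of $C$ is exactly the condition stated in the abstract. The function names `solve_linear` and `bbse_estimate`, and the toy numbers, are hypothetical and chosen only for illustration; this is not the authors' reference implementation.

```python
from collections import Counter


def solve_linear(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b]; copy rows so the inputs are not modified.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("confusion matrix is (numerically) singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def bbse_estimate(val_preds, val_labels, test_preds, num_classes):
    """Return (weights, estimated test label distribution).

    val_preds / val_labels: predictions and true labels on held-out
    data from the training distribution. test_preds: predictions on
    unlabeled test data. All entries are class indices in [0, num_classes).
    """
    n_val = len(val_labels)
    n_test = len(test_preds)

    # Joint confusion matrix: confusion[i][j] ~ p_train(f(x) = i, y = j).
    confusion = [[0.0] * num_classes for _ in range(num_classes)]
    for pred, label in zip(val_preds, val_labels):
        confusion[pred][label] += 1.0 / n_val

    # Distribution of predicted labels on the test set.
    counts = Counter(test_preds)
    mu = [counts[i] / n_test for i in range(num_classes)]

    # Importance weights w[j] ~ p_test(y = j) / p_train(y = j).
    weights = solve_linear(confusion, mu)

    # Training label marginal is the column sum of the joint matrix.
    p_train = [sum(confusion[i][j] for i in range(num_classes))
               for j in range(num_classes)]
    q_test = [w * p for w, p in zip(weights, p_train)]
    return weights, q_test


# Toy binary example: balanced held-out set, imperfect classifier.
val_labels = [0] * 50 + [1] * 50
val_preds = [0] * 40 + [1] * 10 + [1] * 45 + [0] * 5
test_preds = [0] * 24 + [1] * 76  # no test labels needed

weights, q_test = bbse_estimate(val_preds, val_labels, test_preds, 2)
print(weights)  # approximately [0.4, 1.6]
print(q_test)   # approximately [0.2, 0.8]
```

In this toy setup the classifier is wrong on 15% of held-out examples, yet the estimate recovers a test label distribution of roughly (0.2, 0.8) because only the confusion matrix, not the accuracy, enters the linear system. The resulting weights could then be used to reweight the training loss when correcting the classifier.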