Abstract
When training a deep neural network for image classification, one can broadly
distinguish between two types of latent features of images that will drive the
classification. We can divide latent features into (i) "core" or "conditionally
invariant" features $X^{\text{core}}$ whose distribution $X^{\text{core}} \mid Y$,
conditional on the class $Y$, does not change substantially across domains and
(ii) "style" features $X^{\text{style}}$ whose distribution
$X^{\text{style}} \mid Y$ can change substantially across domains. Examples for style features
include position, rotation, image quality or brightness, but also more complex
ones such as hair color or posture for images of persons. Our goal
is to minimize a loss that is robust under changes in the distribution of these
style features. In contrast to previous work, we assume that the domain itself
is not observed and hence a latent variable.
We do assume that we can sometimes observe a typically discrete identifier or
"$ID$ variable". In some applications we know, for example, that two
images show the same person, and $ID$ then refers to the identity of
the person. The proposed method requires only a small fraction of images to
have $ID$ information. We group observations if they share the same
class and identifier $(Y,ID)=(y,id)$ and penalize the
conditional variance of the prediction or the loss if we condition on
$(Y,ID)$. Using a causal framework, this conditional variance
regularization (CoRe) is shown to protect asymptotically against shifts in the
distribution of the style variables. Empirically, we show that the CoRe penalty
improves predictive accuracy substantially in settings where domain changes
occur in terms of image quality, brightness and color while we also look at
more complex changes such as changes in movement and posture.
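The grouping-and-penalty idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `core_penalty` and `penalized_loss`, the convention that `-1` marks an image without $ID$ information, the use of the population variance within each group, and the weight `lam` are all assumptions made for illustration.

```python
import numpy as np


def core_penalty(preds, labels, ids):
    """Mean within-group variance of predictions, grouping by (label, id).

    preds:  (n, k) array of model outputs (e.g. logits)
    labels: (n,) integer class labels Y
    ids:    (n,) integer identifiers ID; -1 means "no ID observed"
            (a convention assumed here, not taken from the paper)
    """
    preds = np.asarray(preds, dtype=float)
    groups = {}
    for i, (y, g) in enumerate(zip(labels, ids)):
        if g == -1:
            continue  # samples without ID information do not enter the penalty
        groups.setdefault((int(y), int(g)), []).append(i)

    variances = []
    for idx in groups.values():
        if len(idx) < 2:
            continue  # a single observation carries no variance information
        grp = preds[idx]
        # squared distance to the group mean, averaged over group members
        variances.append(((grp - grp.mean(axis=0)) ** 2).sum(axis=1).mean())
    return float(np.mean(variances)) if variances else 0.0


def penalized_loss(base_loss, preds, labels, ids, lam=1.0):
    """Base loss plus lam times the conditional variance penalty."""
    return base_loss + lam * core_penalty(preds, labels, ids)
```

In this sketch only samples that share both class and identifier contribute to the penalty, which matches the statement that a small fraction of images with $ID$ information suffices; all other samples affect the objective through the base loss alone.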
Description
[1710.11469] Conditional Variance Penalties and Domain Shift Robustness