Abstract
We provide excess risk guarantees for statistical learning in a setting where
the population risk with respect to which we evaluate the target model depends
on an unknown model that must be estimated from data (a "nuisance
model"). We analyze a two-stage sample splitting meta-algorithm that takes as
input two arbitrary estimation algorithms: one for the target model and one for
the nuisance model. We show that if the population risk satisfies a condition
called Neyman orthogonality, the impact of the nuisance estimation error on the
excess risk bound achieved by the meta-algorithm is of second order. Our
theorem is agnostic to the particular algorithms used for the target and
nuisance models and relies only on assumptions about their individual
performance. This enables the use of a plethora of existing results from the
statistical learning and machine learning literature to give new guarantees
for learning with a nuisance
component. Moreover, by focusing on excess risk rather than parameter
estimation, we can give guarantees under weaker assumptions than in previous
works and accommodate the case where the target parameter belongs to a complex
nonparametric class. We characterize conditions on the metric entropy such that
oracle rates---rates of the same order as if we knew the nuisance model---are
achieved. We also analyze the rates achieved by specific estimation algorithms
such as variance-penalized empirical risk minimization, neural network
estimation, and sparse high-dimensional linear model estimation. We highlight
the applicability of our results in four settings of central importance in the
literature: 1) heterogeneous treatment effect estimation, 2) offline policy
optimization, 3) domain adaptation, and 4) learning with missing data.
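
The abstract invokes Neyman orthogonality only informally; for concreteness, the standard statement in this line of work (the notation below is ours, not quoted from the paper) requires the cross directional derivative of the population risk $L(\theta, g)$ at the true target-nuisance pair $(\theta_0, g_0)$ to vanish:

\[
D_g D_\theta L(\theta_0, g_0)\big[\theta - \theta_0,\; g - g_0\big] = 0
\qquad \text{for all } \theta \in \Theta,\ g \in \mathcal{G},
\]

so that first-order errors in the estimated nuisance $\hat g$ enter the excess risk of the target only at second order.

Below is a minimal sketch of the two-stage sample-splitting meta-algorithm described above, assuming user-supplied estimation routines; the names `fit_nuisance`, `fit_target`, and all variable names are ours for illustration, since the result is agnostic to what these routines do internally:

```python
import numpy as np

def two_stage_meta_algorithm(data, fit_nuisance, fit_target, seed=None):
    """Sample-splitting meta-algorithm (sketch): estimate the nuisance on one
    fold, then estimate the target on the held-out fold with the nuisance
    plugged in.

    fit_nuisance: callable, fold -> nuisance estimate g_hat
    fit_target:   callable, (fold, g_hat) -> target estimate theta_hat,
                  e.g. empirical risk minimization of a loss ell(theta; z, g_hat)
    """
    rng = np.random.default_rng(seed)
    n = len(data)
    idx = rng.permutation(n)
    fold1, fold2 = idx[: n // 2], idx[n // 2:]

    # Stage 1: estimate the unknown nuisance model on the first fold.
    g_hat = fit_nuisance([data[i] for i in fold1])

    # Stage 2: estimate the target model on the held-out fold, treating the
    # first-stage nuisance estimate as fixed.
    theta_hat = fit_target([data[i] for i in fold2], g_hat)
    return theta_hat
```

Under Neyman orthogonality, the excess risk of `theta_hat` depends on the nuisance estimation error only through a second-order term, which is what permits the oracle rates discussed above.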