Abstract
Machine learning is currently dominated by largely experimental work focused
on improvements in a few key tasks. However, the impressive accuracy numbers of
the best-performing models are questionable because the same test sets have
been used to select these models for multiple years now. To understand the
danger of overfitting, we measure the accuracy of CIFAR-10 classifiers by
creating a new test set of truly unseen images. Although we ensure that the new
test set is as close to the original data distribution as possible, we find a
large drop in accuracy (4% to 10%) for a broad range of deep learning models.
Yet, more recent models with higher original accuracy show a smaller drop and
better overall performance, indicating that this drop is likely not due to
overfitting based on adaptivity. Instead, we view our results as evidence that
current accuracy numbers are brittle and susceptible to even minute natural
variations in the data distribution.