Аннотация
Categorical variables are a natural choice for representing discrete
structure in the world. However, stochastic neural networks rarely use
categorical latent variables due to the inability to backpropagate through
samples. In this work, we present an efficient gradient estimator that replaces
the non-differentiable sample from a categorical distribution with a
differentiable sample from a novel Gumbel-Softmax distribution. This
distribution has the essential property that it can be smoothly annealed into a
categorical distribution. We show that our Gumbel-Softmax estimator outperforms
state-of-the-art gradient estimators on structured output prediction and
unsupervised generative modeling tasks with categorical latent variables, and
enables large speedups on semi-supervised classification.
Пользователи данного ресурса
Пожалуйста,
войдите в систему, чтобы принять участие в дискуссии (добавить собственные рецензию, или комментарий)