This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural
network architecture for image generation. DRAW networks combine a novel
spatial attention mechanism that mimics the foveation of the human eye, with a
sequential variational auto-encoding framework that allows for the iterative
construction of complex images. The system substantially improves on the state
of the art for generative models on MNIST, and, when trained on the Street View
House Numbers dataset, it generates images that cannot be distinguished from
real data with the naked eye.