Deep generative models have been enjoying success in modeling continuous
data. However, it remains challenging to capture the representations for
discrete structures with formal grammars and semantics, e.g., computer programs
and molecular structures. How to generate both syntactically and semantically
correct data still remains largely an open problem. Inspired by compiler
theory, where syntax and semantics checks are done via syntax-directed
translation (SDT), we propose a novel syntax-directed variational autoencoder
(SD-VAE) by introducing stochastic lazy attributes. This approach converts the
offline SDT check into on-the-fly generated guidance for constraining the
decoder. Compared to the state-of-the-art methods, our approach enforces
constraints on the output space so that the output will be not only
syntactically valid, but also semantically reasonable. We evaluate the proposed
model with applications in programming languages and molecules, including
reconstruction and program/molecule optimization. The results demonstrate the
effectiveness of incorporating syntactic and semantic constraints in discrete
generative models, with performance significantly better than current
state-of-the-art approaches.
Description
Syntax-Directed Variational Autoencoder for Structured Data
%0 Generic
%1 dai2018syntaxdirected
%A Dai, Hanjun
%A Tian, Yingtao
%A Dai, Bo
%A Skiena, Steven
%A Song, Le
%D 2018
%K autoencoder to_read unsupervised variational-ae
%T Syntax-Directed Variational Autoencoder for Structured Data
%U http://arxiv.org/abs/1802.08786
%X Deep generative models have been enjoying success in modeling continuous
data. However, it remains challenging to capture the representations for
discrete structures with formal grammars and semantics, e.g., computer programs
and molecular structures. How to generate both syntactically and semantically
correct data still remains largely an open problem. Inspired by compiler
theory, where syntax and semantics checks are done via syntax-directed
translation (SDT), we propose a novel syntax-directed variational autoencoder
(SD-VAE) by introducing stochastic lazy attributes. This approach converts the
offline SDT check into on-the-fly generated guidance for constraining the
decoder. Compared to the state-of-the-art methods, our approach enforces
constraints on the output space so that the output will be not only
syntactically valid, but also semantically reasonable. We evaluate the proposed
model with applications in programming languages and molecules, including
reconstruction and program/molecule optimization. The results demonstrate the
effectiveness of incorporating syntactic and semantic constraints in discrete
generative models, with performance significantly better than current
state-of-the-art approaches.
@misc{dai2018syntaxdirected,
abstract = {Deep generative models have been enjoying success in modeling continuous
data. However, it remains challenging to capture the representations for
discrete structures with formal grammars and semantics, e.g., computer programs
and molecular structures. How to generate both syntactically and semantically
correct data still remains largely an open problem. Inspired by compiler
theory, where syntax and semantics checks are done via syntax-directed
translation (SDT), we propose a novel syntax-directed variational autoencoder
(SD-VAE) by introducing stochastic lazy attributes. This approach converts the
offline SDT check into on-the-fly generated guidance for constraining the
decoder. Compared to the state-of-the-art methods, our approach enforces
constraints on the output space so that the output will be not only
syntactically valid, but also semantically reasonable. We evaluate the proposed
model with applications in programming languages and molecules, including
reconstruction and program/molecule optimization. The results demonstrate the
effectiveness of incorporating syntactic and semantic constraints in discrete
generative models, with performance significantly better than current
state-of-the-art approaches.},
added-at = {2018-02-27T08:05:44.000+0100},
author = {Dai, Hanjun and Tian, Yingtao and Dai, Bo and Skiena, Steven and Song, Le},
biburl = {https://www.bibsonomy.org/bibtex/2150ecc69c46fd594f21e19c0a0ca49c0/jk_itwm},
description = {Syntax-Directed Variational Autoencoder for Structured Data},
interhash = {fd4b6fb42458517b2e7377a762307fef},
intrahash = {150ecc69c46fd594f21e19c0a0ca49c0},
keywords = {autoencoder to_read unsupervised variational-ae},
note = {cite arxiv:1802.08786. Comment: to appear in ICLR 2018},
timestamp = {2018-02-27T08:05:44.000+0100},
title = {Syntax-Directed Variational Autoencoder for Structured Data},
url = {http://arxiv.org/abs/1802.08786},
year = 2018
}