Long Short-Term Memory (LSTM) is a popular approach to boosting the ability
of Recurrent Neural Networks to store longer-term temporal information. The
capacity of an LSTM network can be increased by widening it or by adding layers.
However, the former usually introduces additional parameters, while the latter
increases the runtime. As an alternative, we propose the Tensorized LSTM, in
which the hidden states are represented by tensors and updated via a
cross-layer convolution. By increasing the tensor size, the network can be
widened efficiently without additional parameters, since the parameters are
shared across different locations in the tensor; by delaying the output, the
network can be deepened implicitly with little additional runtime, since deep
computations for each timestep are merged into the temporal computations of the
sequence. Experiments conducted on five challenging sequence learning tasks
show the potential of the proposed model.
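
To make the mechanism concrete, here is a minimal NumPy sketch (not the
authors' implementation; the tensor size P, channel count M, kernel width K,
the gate layout, and the broadcasting of a scalar input as an extra channel
are illustrative assumptions). It shows the key property claimed above: one
convolution kernel, shared across all P locations of the hidden tensor,
produces the gates everywhere, so enlarging P widens the network without
adding parameters. The second idea, deepening by delaying the output so that
depth-wise computation folds into the time dimension, is omitted for brevity.

import numpy as np

rng = np.random.default_rng(0)

P, M, K = 4, 8, 3                      # tensor size, channels, kernel width (illustrative)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One kernel yields all four gate pre-activations (i, f, o, g) at every
# location; it is shared across the P locations, so increasing P widens
# the network without adding parameters.
W = rng.standard_normal((K, M + 1, 4 * M)) * 0.1   # +1 input channel (assumption)
b = np.zeros(4 * M)

def tlstm_step(x_t, H, C):
    """One step: append the scalar input x_t as an extra channel at every
    location, convolve across the location dimension with the shared
    kernel W, then apply standard LSTM gating to the memory tensor C."""
    X = np.concatenate([H, np.full((P, 1), x_t)], axis=1)      # (P, M+1)
    pad = K // 2
    Xp = np.pad(X, ((pad, pad), (0, 0)))                       # zero-pad locations
    A = np.stack([np.einsum('km,kmn->n', Xp[p:p + K], W) + b
                  for p in range(P)])                          # (P, 4M) pre-activations
    i, f, o, g = np.split(A, 4, axis=1)
    C = sigmoid(f) * C + sigmoid(i) * np.tanh(g)               # memory tensor update
    H = sigmoid(o) * np.tanh(C)                                # hidden tensor update
    return H, C

H, C = np.zeros((P, M)), np.zeros((P, M))
for x_t in (0.5, -1.0, 2.0):                                   # toy input sequence
    H, C = tlstm_step(x_t, H, C)
print(H.shape)                                                  # (4, 8)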
Description
Wider and Deeper, Cheaper and Faster: Tensorized LSTMs for Sequence Learning
@misc{he2017wider,
author = {He, Zhen and Gao, Shaobing and Xiao, Liang and Liu, Daxue and He, Hangen and Barber, David},
keywords = {2017 LSTM arxiv deep-learning paper tensorflow},
note = {arXiv:1711.01577. Comment: Accepted by NIPS 2017},
title = {Wider and Deeper, Cheaper and Faster: Tensorized LSTMs for Sequence Learning},
url = {http://arxiv.org/abs/1711.01577},
year = 2017
}