Abstract
In this paper, we propose a novel multi-task learning architecture, which
incorporates recent advances in attention mechanisms. Our approach, the
Multi-Task Attention Network (MTAN), consists of a single shared network
containing a global feature pool, together with task-specific soft-attention
modules, which are trainable in an end-to-end manner. These attention modules
learn task-specific features from the global pool, whilst simultaneously
allowing features to be shared across different tasks. The
architecture can be built upon any feed-forward neural network, is simple to
implement, and is parameter efficient. Experiments on the CityScapes dataset
show that our method outperforms several baselines in both single-task and
multi-task learning, and is also more robust to the various weighting schemes
in the multi-task loss function. We further explore the effectiveness of our
method through experiments over a range of task complexities, and show how our
method scales well with task complexity compared to baselines.
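To make the described mechanism concrete, the following is a minimal PyTorch sketch of a shared network with per-task soft-attention modules, in the spirit of the abstract. It is an illustrative reconstruction, not the authors' released implementation: the names (SoftAttentionModule, MTANSketch, channels, num_tasks) are hypothetical, and the tiny convolutional backbone merely stands in for "any feed-forward neural network".

```python
import torch
import torch.nn as nn


class SoftAttentionModule(nn.Module):
    """Hypothetical task-specific attention module: learns a soft mask
    over the shared features and applies it element-wise, selecting
    task-relevant features from the global pool."""

    def __init__(self, channels):
        super().__init__()
        self.mask = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.BatchNorm2d(channels),
            nn.Sigmoid(),  # soft attention weights in [0, 1]
        )

    def forward(self, shared_features):
        # Element-wise gating of the shared (global) features.
        return self.mask(shared_features) * shared_features


class MTANSketch(nn.Module):
    """Single shared backbone (the 'global feature pool') plus one
    attention module per task; the whole model trains end-to-end."""

    def __init__(self, channels=64, num_tasks=2):
        super().__init__()
        # Stand-in for an arbitrary feed-forward backbone.
        self.shared = nn.Sequential(
            nn.Conv2d(3, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        self.attention = nn.ModuleList(
            SoftAttentionModule(channels) for _ in range(num_tasks)
        )

    def forward(self, x):
        pool = self.shared(x)  # global feature pool, shared by all tasks
        return [attn(pool) for attn in self.attention]  # per-task features


# Usage: two task-specific feature maps from one shared forward pass.
model = MTANSketch()
task_features = model(torch.randn(1, 3, 32, 32))
```

The attention modules add only lightweight mask layers per task on top of the single shared backbone, which is one way to read the abstract's claim of parameter efficiency.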