Language evolves over time in many ways relevant to natural language
processing tasks. For example, recent occurrences of the tokens 'BERT' and 'ELMO'
in publications refer to neural network architectures rather than persons. This
type of temporal signal is typically overlooked, but is important if one aims
to deploy a machine learning model over an extended period of time. In
particular, language evolution causes data drift between time-steps in
sequential decision-making tasks. Examples of such tasks include prediction of
paper acceptance for yearly conferences (regular intervals) or author stance
prediction for rumours on Twitter (irregular intervals). Inspired by successes
in computer vision, we tackle data drift by sequentially aligning learned
representations. We evaluate on three challenging tasks varying in terms of
time-scales, linguistic units, and domains. On these tasks, our method
outperforms several strong baselines, including one that uses all available data. We
argue that, due to its low computational expense, sequential alignment is a
practical solution for dealing with language evolution.