Engineer friends often ask me: Graph Deep Learning sounds great, but are there any big commercial success stories? Is it being deployed in practical applications? Besides the obvious ones–recommendation systems at Pinterest, Alibaba and Twitter–a slightly nuanced success story is the Transformer architecture, which has taken the NLP industry by storm. Through this post, I want to establish links between Graph Neural Networks (GNNs) and Transformers. I’ll talk about the intuitions behind model architectures in the NLP and GNN communities, make connections using equations and figures, and discuss how we could work together to drive progress.
P. Xia, S. Wu, and B. Van Durme. Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), page 7516--7533. Association for Computational Linguistics, (November 2020)
A. Kendall, and Y. Gal. Proceedings of the 31st International Conference on Neural Information Processing Systems, page 5580–5590. Red Hook, NY, USA, Curran Associates Inc., (2017)
X. Glorot, and Y. Bengio. Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, volume 9 of Proceedings of Machine Learning Research, page 249--256. Chia Laguna Resort, Sardinia, Italy, PMLR, (13--15 May 2010)