Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
N. Reimers and I. Gurevych. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 3982--3992, Hong Kong, China. Association for Computational Linguistics, November 2019.
DOI: 10.18653/v1/D19-1410
Abstract
BERT (Devlin et al., 2018) and RoBERTa (Liu et al., 2019) have set a new state-of-the-art performance on sentence-pair regression tasks like semantic textual similarity (STS). However, both require that the two sentences be fed into the network together, which causes a massive computational overhead: finding the most similar pair in a collection of 10,000 sentences requires about 50 million inference computations (~65 hours) with BERT. The construction of BERT makes it unsuitable for semantic similarity search as well as for unsupervised tasks like clustering. In this publication, we present Sentence-BERT (SBERT), a modification of the pretrained BERT network that uses siamese and triplet network structures to derive semantically meaningful sentence embeddings that can be compared using cosine similarity. This reduces the effort for finding the most similar pair from 65 hours with BERT/RoBERTa to about 5 seconds with SBERT, while maintaining the accuracy of BERT. We evaluate SBERT and SRoBERTa on common STS tasks and transfer learning tasks, where they outperform other state-of-the-art sentence embedding methods.
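The 50-million figure follows from simple combinatorics: a collection of 10,000 sentences contains 10,000 × 9,999 / 2 = 49,995,000 ≈ 50 million unique pairs, and a cross-encoder such as vanilla BERT must run one forward pass per pair. SBERT instead encodes each sentence exactly once (10,000 passes) and scores pairs with cheap cosine similarity between the cached vectors. A minimal sketch of that workflow follows, assuming the authors' sentence-transformers library; the checkpoint name and example sentences are illustrative choices, not taken from the paper.

# A minimal sketch of the SBERT workflow, assuming the sentence-transformers
# library (pip install sentence-transformers). The checkpoint below is an
# illustrative pretrained SBERT model, not the one evaluated in the paper.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")

sentences = [
    "A man is eating food.",
    "A man is eating a piece of bread.",
    "The stock market fell sharply today.",
]

# Each sentence is encoded exactly once into a fixed-size embedding.
embeddings = model.encode(sentences, convert_to_tensor=True)

# Any pair can now be scored with plain cosine similarity -- no further
# BERT forward passes are needed, which is what collapses hours of
# pairwise cross-encoder inference into seconds.
scores = util.cos_sim(embeddings, embeddings)
print(scores)  # 3x3 similarity matrix; the two related sentences score highest

The same cached embeddings also serve clustering and semantic search directly, which is why the paper highlights those use cases as out of reach for the original cross-encoder setup.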
@inproceedings{reimers-gurevych-2019-sentence,
address = {Hong Kong, China},
author = {Reimers, Nils and Gurevych, Iryna},
booktitle = {Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)},
doi = {10.18653/v1/D19-1410},
month = nov,
pages = {3982--3992},
publisher = {Association for Computational Linguistics},
title = {Sentence-{BERT}: Sentence Embeddings using {S}iamese {BERT}-Networks},
url = {https://aclanthology.org/D19-1410},
year = 2019
}