Zero-Shot Clickbait Spoiling by Rephrasing Titles as Questions
D. Wangsadirdja, J. Pfister, K. Kobs, and A. Hotho. Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023), pages 1090--1095. Toronto, Canada, Association for Computational Linguistics, (July 2023)
Abstract
In this paper, we describe our approach to the clickbait spoiling task of SemEval 2023. The core idea behind our system is to leverage pre-trained models capable of Question Answering (QA) to extract the spoiler from article texts based on the clickbait title without any task-specific training. Since oftentimes, these titles are not phrased as questions, we automatically rephrase the clickbait titles as questions in order to better suit the pretraining task of the QA-capable models. Also, to fit as much relevant context into the model's limited input size as possible, we propose to reorder the sentences by their relevance using a semantic similarity model. Finally, we evaluate QA as well as text generation models (via prompting) to extract the spoiler from the text. Based on the validation data, our final model selects each of these components depending on the spoiler type and achieves satisfactory zero-shot results. The ideas described in this paper can easily be applied in fine-tuning settings.
%0 Conference Paper
%1 wangsadirdja-etal-2023-jack
%A Wangsadirdja, Dirk
%A Pfister, Jan
%A Kobs, Konstantin
%A Hotho, Andreas
%B Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)
%C Toronto, Canada
%D 2023
%I Association for Computational Linguistics
%K app_nlp author:KOBS author:PFISTER clickbait from:janpf motiv myown research_llm zeroshot
%P 1090--1095
%T Zero-Shot Clickbait Spoiling by Rephrasing Titles as Questions
%U https://aclanthology.org/2023.semeval-1.150
%X In this paper, we describe our approach to the clickbait spoiling task of SemEval 2023. The core idea behind our system is to leverage pre-trained models capable of Question Answering (QA) to extract the spoiler from article texts based on the clickbait title without any task-specific training. Since oftentimes, these titles are not phrased as questions, we automatically rephrase the clickbait titles as questions in order to better suit the pretraining task of the QA-capable models. Also, to fit as much relevant context into the model's limited input size as possible, we propose to reorder the sentences by their relevance using a semantic similarity model. Finally, we evaluate QA as well as text generation models (via prompting) to extract the spoiler from the text. Based on the validation data, our final model selects each of these components depending on the spoiler type and achieves satisfactory zero-shot results. The ideas described in this paper can easily be applied in fine-tuning settings.
@inproceedings{wangsadirdja-etal-2023-jack,
abstract = {In this paper, we describe our approach to the clickbait spoiling task of SemEval 2023. The core idea behind our system is to leverage pre-trained models capable of Question Answering (QA) to extract the spoiler from article texts based on the clickbait title without any task-specific training. Since oftentimes, these titles are not phrased as questions, we automatically rephrase the clickbait titles as questions in order to better suit the pretraining task of the QA-capable models. Also, to fit as much relevant context into the model's limited input size as possible, we propose to reorder the sentences by their relevance using a semantic similarity model. Finally, we evaluate QA as well as text generation models (via prompting) to extract the spoiler from the text. Based on the validation data, our final model selects each of these components depending on the spoiler type and achieves satisfactory zero-shot results. The ideas described in this paper can easily be applied in fine-tuning settings.},
added-at = {2023-07-17T03:17:26.000+0200},
address = {Toronto, Canada},
author = {Wangsadirdja, Dirk and Pfister, Jan and Kobs, Konstantin and Hotho, Andreas},
biburl = {https://www.bibsonomy.org/bibtex/204474473b68fd5f99cffc411ae9aa246/dmir},
booktitle = {Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)},
interhash = {6683ecd06a359cb7aa5ec9022f677238},
intrahash = {04474473b68fd5f99cffc411ae9aa246},
keywords = {app_nlp author:KOBS author:PFISTER clickbait from:janpf motiv myown research_llm zeroshot},
month = jul,
pages = {1090--1095},
publisher = {Association for Computational Linguistics},
timestamp = {2024-01-18T10:31:52.000+0100},
title = {Zero-Shot Clickbait Spoiling by Rephrasing Titles as Questions},
url = {https://aclanthology.org/2023.semeval-1.150},
year = 2023
}