Author of the publication

The (ab)use of Open Source Code to Train Large Language Models.

NLBSE@ICSE, page 9-10. IEEE, (2023)

Other publications of authors with the same name

Predicting the Objective and Priority of Issue Reports in a Cross project Context. CoRR, (2020)

Long Code Arena: a Set of Benchmarks for Long-Context Code Models. CoRR, (2024)

An Exploratory Investigation into Code License Infringements in Large Language Model Training Datasets. FORGE, page 74-85. ACM, (2024)

Targeted Attack on GPT-Neo for the SATML Language Model Data Extraction Challenge. CoRR, (2023)

Extending Source Code Pre-Trained Language Models to Summarise Decompiled Binaries. SANER, page 260-271. IEEE, (2023)

Enriching Source Code with Contextual Data for Code Completion Models: An Empirical Study. MSR, page 170-182. IEEE, (2023)

Language Models for Code Completion: A Practical Evaluation. ICSE, page 79:1-79:13. ACM, (2024)

Traces of Memorisation in Large Language Models for Code. ICSE, page 78:1-78:12. ACM, (2024)

CatIss: An Intelligent Tool for Categorizing Issues Reports using Transformers. NLBSE, page 44-47. ACM/IEEE, (2022)

STACC: Code Comment Classification using SentenceTransformers. NLBSE@ICSE, page 28-31. IEEE, (2023)