Author of the publication

ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages.

, , , , , , , , and . ACL (1), page 2181-2211. Association for Computational Linguistics, (2024)

Other publications of authors with the same name

Beyond Boundaries: Learning a Universal Entity Taxonomy across Datasets and Languages for Open Named Entity Recognition., , , , , , , , , and 4 other author(s). CoRR, (2024)

EasyJailbreak: A Unified Framework for Jailbreaking Large Language Models., , , , , , , , , and 11 other author(s). CoRR, (2024)

CodeChameleon: Personalized Encryption Framework for Jailbreaking Large Language Models., , , , , , , , and . CoRR, (2024)

StepCoder: Improve Code Generation with Reinforcement Learning from Compiler Feedback., , , , , , , , , and 7 other author(s). CoRR, (2024)

Secrets of RLHF in Large Language Models Part II: Reward Modeling., , , , , , , , , and 17 other author(s). CoRR, (2024)

What's Wrong with Your Code Generated by Large Language Models? An Extensive Study., , , , , , , , , and 14 other author(s). CoRR, (2024)

StepCoder: Improving Code Generation with Reinforcement Learning from Compiler Feedback., , , , , , , , , and 6 other author(s). ACL (1), page 4571-4585. Association for Computational Linguistics, (2024)

SafeAligner: Safety Alignment against Jailbreak Attacks via Response Disparity Guidance., , , , , , , , , and 3 other author(s). CoRR, (2024)

ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages., , , , , , , , and . ACL (1), page 2181-2211. Association for Computational Linguistics, (2024)

TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities., , , , , , , , , and 3 other author(s). CoRR, (2024)