Author of the publication

Please choose a person to relate this publication to

To distinguish between persons with the same name, the academic degree and the title of an important publication are displayed. You can also use the button next to the name to display some publications already assigned to the person.

 

Other publications of authors with the same name

Safeguarding Vision-Language Models Against Patched Visual Prompt Injectors. CoRR, (2024)

A Critical Revisit of Adversarial Robustness in 3D Point Cloud Recognition with Diffusion-Driven Purification. ICML, volume 202 of Proceedings of Machine Learning Research, page 33100-33114. PMLR, (2023)

Test-time Backdoor Mitigation for Black-Box Large Language Models with Defensive Demonstrations. CoRR, (2023)

Fast and Reliable Evaluation of Adversarial Robustness with Minimum-Margin Attack. ICML, volume 162 of Proceedings of Machine Learning Research, page 7144-7163. PMLR, (2022)

On the Exploitability of Reinforcement Learning with Human Feedback for Large Language Models. CoRR, (2023)

RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models. ACL (1), page 2551-2570. Association for Computational Linguistics, (2024)

Defending against Adversarial Audio via Diffusion Model. ICLR, OpenReview.net, (2023)

Consistency Purification: Effective and Efficient Diffusion Purification towards Certified Robustness. CoRR, (2024)

Preference Poisoning Attacks on Reward Model Learning. CoRR, (2024)

Adversarial Demonstration Attacks on Large Language Models. CoRR, (2023)