May 10, 2024, 4:11 a.m. | Xikang Yang, Xuehai Tang, Songlin Hu, Jizhong Han

cs.CR updates on arXiv.org arxiv.org

arXiv:2405.05610v1 Announce Type: cross
Abstract: Large language models (LLMs) have achieved remarkable performance in various natural language processing tasks, especially in dialogue systems. However, LLM may also pose security and moral threats, especially in multi round conversations where large models are more easily guided by contextual content, resulting in harmful or biased responses. In this paper, we present a novel method to attack LLMs in multi-turn dialogues, called CoA (Chain of Attack). CoA is a semantic-driven contextual multi-turn attack method …
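The abstract only names the general shape of the technique: feed the model a chain of contextually escalating prompts and judge each response against an attack goal. The paper's actual prompt-generation and scoring policies are not given here, so the sketch below is a minimal, hypothetical illustration of that loop; `query_model` and `semantic_relevance` are stand-ins, not the authors' implementation.

```python
def query_model(history):
    """Stand-in for an LLM dialogue API: returns a response string given
    the conversation history (a list of (role, text) pairs). A real attack
    would call an actual chat-completion endpoint here."""
    # Mock behavior: report how many attacker turns have accumulated.
    turns = sum(1 for role, _ in history if role == "attacker")
    return f"response after {turns} attacker turn(s)"

def semantic_relevance(response, goal):
    """Stand-in scorer: fraction of goal words appearing in the response.
    A semantic-driven attack would use embeddings or a judge model."""
    goal_words = set(goal.lower().split())
    resp_words = set(response.lower().split())
    return len(goal_words & resp_words) / max(len(goal_words), 1)

def chain_of_attack(seed_prompts, goal, threshold=0.5):
    """Issue context-dependent prompts turn by turn, carrying the full
    dialogue history forward; stop when a response is judged sufficiently
    aligned with the attack goal."""
    history = []
    for prompt in seed_prompts:
        history.append(("attacker", prompt))
        reply = query_model(history)
        history.append(("model", reply))
        if semantic_relevance(reply, goal) >= threshold:
            return history, True   # attack judged successful
    return history, False          # chain exhausted without success
```

The key design point the abstract hints at is that each turn is conditioned on the accumulated context, which is why multi-round dialogue is a softer target than a single prompt.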

