Nov. 13, 2023, 2:10 a.m. | Nanna Inie, Jonathan Stray, Leon Derczynski

cs.CR updates on arXiv.org

Engaging in the deliberate generation of abnormal outputs from large language
models (LLMs) by attacking them is a novel human activity. This paper presents
a thorough exposition of how and why people perform such attacks. Using a
formal qualitative methodology, we interviewed dozens of practitioners from a
broad range of backgrounds, all contributors to this novel work of attempting
to cause LLMs to fail. We relate and connect this activity to its
practitioners' motivations and goals; the strategies and techniques …

Tags: attacks, bind, demon, human, large language models, LLMs, methodology, qualitative, red teaming, theory
