Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game. (arXiv:2311.01011v1 [cs.LG]) | allinfosecnews.com

Nov. 5, 2023, 6:10 a.m. | Sam Toyer, Olivia Watkins, Ethan Adrian Mendes, Justin Svegliato, Luke Bailey, Tiffany Wang, Isaac Ong, Karim Elmaaroufi, Pieter Abbeel, Trevor Darrel

cs.CR updates on arXiv.org arxiv.org

While Large Language Models (LLMs) are increasingly being used in real-world
applications, they remain vulnerable to prompt injection attacks: malicious
third party prompts that subvert the intent of the system designer. To help
researchers study this problem, we present a dataset of over 126,000 prompt
injection attacks and 46,000 prompt-based "defenses" against prompt injection,
all created by players of an online game called Tensor Trust. To the best of
our knowledge, this is currently the largest dataset of human-generated
adversarial …

applications attacks dataset designer game injection injection attacks intent language language models large llms malicious party problem prompt prompt injection prompt injection attacks prompts real researchers study system tensor third trust vulnerable world

More from arxiv.org / cs.CR updates on arXiv.org

Causal Inference with Differentially Private (Clustered) Outcomes 14 hours ago | arxiv.org

algorithm arxiv cs.cr cs.lg +12

An artificial neural network approach to finding the key length of the Vigen\`{e}re cipher 14 hours ago | arxiv.org

accuracy article artificial arxiv +9

Generic Selfish Mining MDP for DAG Protocols 14 hours ago | arxiv.org

analysis arxiv bitcoin breaking +15

Tight Differential Privacy Guarantees for the Shuffle Model with $k$-Randomized Response 14 hours ago | arxiv.org

algorithms arxiv cs.cr data +14

Succinct arguments for QMA from standard assumptions via compiled nonlocal games 14 hours ago | arxiv.org

argument arxiv building crypto +8

On Training a Neural Network to Explain Binaries 14 hours ago | arxiv.org

aid arxiv binary code +15

Transferring Troubles: Cross-Lingual Transferability of Backdoor Attacks in LLMs with Instruction Tuning 14 hours ago | arxiv.org

arxiv attacks backdoor backdoor attacks +14

Leveraging Label Information for Stealthy Data Stealing in Vertical Federated Learning 14 hours ago | arxiv.org

arxiv attack attacks cs.cr +16

An Extensive Survey of Digital Image Steganography: State of the Art 14 hours ago | arxiv.org

adoption art arxiv attention +21

EY- GDS- Cybersecurity- Staff

@ EY | Miguel Hidalgo, MX, 11520

View on infosec-jobs.com

Staff Security Operations Engineer

@ Workiva | Ames

View on infosec-jobs.com

Public Relations Senior Account Executive (B2B Tech/Cybersecurity/Enterprise)

@ Highwire Public Relations | Los Angeles, CA

View on infosec-jobs.com

Airbus Canada - Responsable Cyber sécurité produit / Product Cyber Security Responsible

@ Airbus | Mirabel

View on infosec-jobs.com

Investigations (OSINT) Manager

@ Logically | India

View on infosec-jobs.com

Security Engineer I, Offensive Security Penetration Testing

@ Amazon.com | US, NY, Virtual Location - New York

View on infosec-jobs.com