all InfoSec news
Tensor Trust: Interpretable Prompt Injection Attacks from an Online Game. (arXiv:2311.01011v1 [cs.LG])
cs.CR updates on arXiv.org arxiv.org
While Large Language Models (LLMs) are increasingly being used in real-world
applications, they remain vulnerable to prompt injection attacks: malicious
third party prompts that subvert the intent of the system designer. To help
researchers study this problem, we present a dataset of over 126,000 prompt
injection attacks and 46,000 prompt-based "defenses" against prompt injection,
all created by players of an online game called Tensor Trust. To the best of
our knowledge, this is currently the largest dataset of human-generated
adversarial …
applications attacks dataset designer game injection injection attacks intent language language models large llms malicious party problem prompt prompt injection prompt injection attacks prompts real researchers study system tensor third trust vulnerable world