June 27, 2024, 4:19 a.m. | Fredrik Nestaas, Edoardo Debenedetti, Florian Tramèr

cs.CR updates on arXiv.org arxiv.org

arXiv:2406.18382v1 Announce Type: new
Abstract: Large Language Models (LLMs) are increasingly used in applications where the model selects from competing third-party content, such as in LLM-powered search engines or chatbot plugins. In this paper, we introduce Preference Manipulation Attacks, a new class of attacks that manipulate an LLM's selections to favor the attacker. We demonstrate that carefully crafted website content or plugin documentation can trick an LLM into promoting the attacker's products and discrediting competitors, thereby increasing user traffic and …

