Feb. 20, 2024, 5:11 a.m. | Jonathan Hayase, Ema Borevkovic, Nicholas Carlini, Florian Tramèr, Milad Nasr

cs.CR updates on arXiv.org

arXiv:2402.12329v1 Announce Type: cross
Abstract: Recent work has shown it is possible to construct adversarial examples that cause an aligned language model to emit harmful strings or perform harmful behavior. Existing attacks work either in the white-box setting (with full access to the model weights), or through transferability: the phenomenon that adversarial examples crafted on one model often remain effective on other models. We improve on prior work with a query-based attack that leverages API access to a remote language …
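To make the "query-based attack" idea concrete, here is a minimal sketch of the general technique: iteratively mutating an adversarial suffix and keeping a mutation only if a score obtained by querying the remote model improves. This is an illustration, not the paper's actual optimizer; the function `query_target_logprob` and the random hill-climb search are assumptions introduced here, with a toy scorer standing in for a real API call so the loop runs end to end.

```python
# Sketch of a query-based adversarial prompt search (hypothetical;
# not the algorithm from arXiv:2402.12329). Assumption: the attacker
# can query a remote API for log p(target | prompt), modeled here by
# the placeholder `query_target_logprob`.
import random
import string

VOCAB = string.ascii_letters + string.digits + string.punctuation + " "

def query_target_logprob(prompt: str, target: str) -> float:
    """Placeholder for one remote query returning log p(target | prompt).
    A real attack would call the model's API here; this toy scorer just
    rewards prompts sharing characters with the target so the example
    is runnable."""
    return sum(c in target for c in prompt) / max(len(prompt), 1)

def hill_climb(target: str, suffix_len: int = 20,
               iters: int = 200, seed: int = 0) -> str:
    """Mutate one position of an adversarial suffix at a time, keeping
    a substitution only if the queried score improves. Each trial costs
    exactly one query to the remote model."""
    rng = random.Random(seed)
    suffix = [rng.choice(VOCAB) for _ in range(suffix_len)]
    best = query_target_logprob("".join(suffix), target)
    for _ in range(iters):
        i = rng.randrange(suffix_len)
        old = suffix[i]
        suffix[i] = rng.choice(VOCAB)
        score = query_target_logprob("".join(suffix), target)
        if score > best:
            best = score      # keep the improving substitution
        else:
            suffix[i] = old   # revert the mutation
    return "".join(suffix)

if __name__ == "__main__":
    print(hill_climb("harmful string"))
```

The key property this sketch shares with any query-based attack is that it needs no access to model weights or gradients: every optimization step is driven solely by scores returned from the remote API.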

