Attacking LLM Watermarks by Exploiting Their Strengths
Feb. 27, 2024, 5:11 a.m. | Qi Pang, Shengyuan Hu, Wenting Zheng, Virginia Smith
cs.CR updates on arXiv.org (arxiv.org)
Abstract: Advances in generative models have made it possible for AI-generated text, code, and images to mirror human-generated content in many applications. Watermarking, a technique that aims to embed information in a model's output to verify its source, is useful for mitigating the misuse of such AI-generated content. However, existing watermarking schemes remain surprisingly susceptible to attack. In particular, we show that desirable properties shared by existing LLM watermarking systems, such as quality preservation, robustness, …
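To make the attack surface concrete: many deployed LLM watermarks follow a "green list" design in the style of Kirchenbauer et al., where generation is biased toward a pseudorandom subset of the vocabulary and detection is a simple frequency test. The sketch below is a minimal toy version of that idea, not the authors' code and not any specific scheme evaluated in the paper; the SHA-256 seeding, the gamma split, and all names are illustrative assumptions.

```python
import hashlib
import random

VOCAB = 50_000  # illustrative vocabulary size


def green_list(prev_token: int, gamma: float = 0.5) -> set[int]:
    """Pseudorandomly mark a gamma-fraction of token ids as 'green',
    seeded by the previous token. A watermarking generator biases
    sampling toward these ids; a detector only needs the same seed."""
    seed = int(hashlib.sha256(str(prev_token).encode()).hexdigest(), 16)
    rng = random.Random(seed)
    ids = list(range(VOCAB))
    rng.shuffle(ids)
    return set(ids[: int(gamma * VOCAB)])


def detect(tokens: list[int], gamma: float = 0.5) -> float:
    """z-score against the null 'text is unwatermarked', under which
    each token is green with probability gamma, so the green count
    is approximately Binomial(T, gamma)."""
    hits = sum(
        tok in green_list(prev, gamma) for prev, tok in zip(tokens, tokens[1:])
    )
    t = len(tokens) - 1
    return (hits - gamma * t) / (gamma * (1 - gamma) * t) ** 0.5


if __name__ == "__main__":
    rng = random.Random(0)
    plain = [rng.randrange(VOCAB) for _ in range(200)]
    print(f"unwatermarked z = {detect(plain):.2f}")  # near 0 in expectation

    marked = [0]
    for _ in range(200):
        g = tuple(green_list(marked[-1]))
        # Simulate a watermarking generator: pick green ids 90% of the time.
        marked.append(rng.choice(g) if rng.random() < 0.9 else rng.randrange(VOCAB))
    print(f"watermarked z = {detect(marked):.2f}")  # well above 4
```

Robustness, one of the "desirable properties" the abstract names, means this z-score should survive paraphrasing and light edits; the paper's point, per its title, is that exactly such shared strengths can be exploited to attack the watermark.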
More from arxiv.org / cs.CR updates on arXiv.org:
IDEA: Invariant Defense for Graph Adversarial Robustness (arxiv.org)
FairCMS: Cloud Media Sharing with Fair Copyright Protection (arxiv.org)
Efficient unitary designs and pseudorandom unitaries from permutations (arxiv.org)
Jobs in InfoSec / Cybersecurity:
SOC 2 Manager, Audit and Certification
@ Deloitte | US and CA Multiple Locations
Associate Compliance Advisor
@ SAP | Budapest, HU, 1031
DevSecOps Engineer
@ Qube Research & Technologies | London
Software Engineer, Security
@ Render | San Francisco, CA or Remote (USA & Canada)
Associate Consultant
@ Control Risks | Frankfurt, Hessen, Germany
Senior Security Engineer
@ Activision Blizzard | Work from Home - CA