all InfoSec news
Watermarking Makes Language Models Radioactive
Feb. 26, 2024, 5:11 a.m. | Tom Sander, Pierre Fernandez, Alain Durmus, Matthijs Douze, Teddy Furon
cs.CR updates on arXiv.org arxiv.org
Abstract: This paper investigates the radioactivity of LLM-generated texts, i.e. whether it is possible to detect that such input was used as training data. Conventional methods like membership inference can carry out this detection with some level of accuracy. We show that watermarked training data leaves traces easier to detect and much more reliable than membership inference. We link the contamination level to the watermark robustness, its proportion in the training set, and the fine-tuning process. …
accuracy arxiv can cs.ai cs.cl cs.cr cs.lg data detect detection easier generated input language language models llm texts traces training training data watermarking
More from arxiv.org / cs.CR updates on arXiv.org
IDEA: Invariant Defense for Graph Adversarial Robustness
1 day, 15 hours ago |
arxiv.org
FairCMS: Cloud Media Sharing with Fair Copyright Protection
1 day, 15 hours ago |
arxiv.org
Efficient unitary designs and pseudorandom unitaries from permutations
1 day, 15 hours ago |
arxiv.org
Jobs in InfoSec / Cybersecurity
SOC 2 Manager, Audit and Certification
@ Deloitte | US and CA Multiple Locations
Cybersecurity Engineer
@ Booz Allen Hamilton | USA, VA, Arlington (1550 Crystal Dr Suite 300) non-client
Invoice Compliance Reviewer
@ AC Disaster Consulting | Fort Myers, Florida, United States - Remote
Technical Program Manager II - Compliance
@ Microsoft | Redmond, Washington, United States
Head of U.S. Threat Intelligence / Senior Manager for Threat Intelligence
@ Moonshot | Washington, District of Columbia, United States
Customer Engineer, Security, Public Sector
@ Google | Virginia, USA; Illinois, USA