Provably Confidential Language Modelling. (arXiv:2205.01863v2 [cs.CL] UPDATED)
June 27, 2022, 1:20 a.m. | Xuandong Zhao, Lei Li, Yu-Xiang Wang
cs.CR updates on arXiv.org arxiv.org
Large language models are shown to memorize private information such as
social security numbers in training data. Given the sheer scale of the training
corpus, it is challenging to screen and filter this private data, either
manually or automatically. In this paper, we propose Confidentially Redacted
Training (CRT), a method to train language generation models while protecting
the confidential segments. We borrow ideas from differential privacy (which
solves a related but distinct problem) and show that our method is able …
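To make the screening problem concrete, here is a minimal sketch (not the paper's CRT algorithm, just an illustration of the naive baseline the abstract contrasts against): a regex-based filter that redacts U.S. Social Security numbers from training text. The pattern, function name, and placeholder token are all assumptions for illustration; pattern-based screening like this is exactly what the authors note is hard to do reliably at corpus scale.

```python
import re

# Naive pattern-based screen for SSN-shaped strings (e.g. 123-45-6789).
# Real confidential data rarely follows one clean pattern, which is why
# the paper argues for protection at training time instead.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def redact_ssns(text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace SSN-shaped substrings with a placeholder token."""
    return SSN_PATTERN.sub(placeholder, text)

sample = "Contact John Doe, SSN 123-45-6789, for details."
print(redact_ssns(sample))
# -> Contact John Doe, SSN [REDACTED], for details.
```

A filter like this misses unformatted numbers (`123456789`), foreign ID schemes, and free-text secrets, which motivates training-time protections such as CRT.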