June 29, 2022, 1:20 a.m. | W. Ronny Huang, Steve Chien, Om Thakkar, Rajiv Mathews

cs.CR updates on arXiv.org

End-to-end (E2E) models are often accompanied by language models (LMs)
via shallow fusion to boost their overall quality as well as their recognition of
rare words. At the same time, several prior works have shown that LMs are susceptible
to unintentionally memorizing rare or unique sequences in their training data. In
this work, we design a framework for detecting memorization of random textual
sequences (which we call canaries) in the LM training data when one has only
black-box (query) access to …

asr language
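
The truncated abstract sketches two ingredients: shallow fusion, where an external LM's score is interpolated with the E2E model's score at decoding time, and a black-box memorization test that compares how the system scores inserted canaries against fresh random sequences it never trained on. The Python sketch below only illustrates that general idea under assumed function names and an assumed interpolation weight; it is not the authors' implementation.

import random
import string

def shallow_fusion_score(e2e_logprob, lm_logprob, lam=0.3):
    # Fused hypothesis score; lam is an assumed LM interpolation weight.
    return e2e_logprob + lam * lm_logprob

def random_canary(n_words=6, word_len=5):
    # A random word sequence of the kind inserted as a "canary" into LM training data.
    return " ".join(
        "".join(random.choices(string.ascii_lowercase, k=word_len)) for _ in range(n_words)
    )

def memorization_rank(query_logprob, canary, references):
    # Rank of the canary's score among reference sequences that were never trained on.
    # query_logprob is any black-box callable returning a sequence log-probability,
    # which is all the access the abstract assumes. A rank near 1 (the canary
    # outscores almost every reference) suggests the canary was memorized.
    canary_score = query_logprob(canary)
    better_refs = sum(1 for r in references if query_logprob(r) > canary_score)
    return better_refs + 1

if __name__ == "__main__":
    # Toy stand-in for black-box query access to a deployed E2E + LM system.
    memorized = {"qwert yuiop asdfg": -5.0}
    query = lambda s: memorized.get(s, -60.0)
    refs = [random_canary() for _ in range(100)]
    print(shallow_fusion_score(-12.0, -8.0))                   # fused decoding score
    print(memorization_rank(query, "qwert yuiop asdfg", refs))  # rank 1 -> likely memorized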
