Stealing Part of a Production Language Model
March 12, 2024, 4:11 a.m. | Nicholas Carlini, Daniel Paleka, Krishnamurthy Dj Dvijotham, Thomas Steinke, Jonathan Hayase, A. Feder Cooper, Katherine Lee, Matthew Jagielski, Milad
cs.CR updates on arXiv.org arxiv.org
Abstract: We introduce the first model-stealing attack that extracts precise, nontrivial information from black-box production language models like OpenAI's ChatGPT or Google's PaLM-2. Specifically, our attack recovers the embedding projection layer (up to symmetries) of a transformer model, given typical API access. For under $20 USD, our attack extracts the entire projection matrix of OpenAI's Ada and Babbage language models. We thereby confirm, for the first time, that these black-box models have a hidden dimension of …
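The hidden-dimension recovery the abstract describes rests on a linear-algebra observation: a transformer's final logits are the product of the (vocab_size × hidden_dim) embedding projection matrix and a hidden-state vector, so a matrix built by stacking logit vectors from many queries has rank at most hidden_dim. Below is a minimal simulation of that idea (not the authors' code, and with no real API calls; all sizes and names are illustrative):

```python
import numpy as np

# Illustrative simulation: logits = W @ h, where W is the secret
# (vocab_size x hidden_dim) projection matrix and h a hidden state.
# Stacking logit vectors from many queries gives a matrix of rank
# at most hidden_dim, so the hidden dimension is revealed by the
# number of non-negligible singular values.

rng = np.random.default_rng(0)
vocab_size, hidden_dim, n_queries = 1000, 64, 256

W = rng.normal(size=(vocab_size, hidden_dim))   # secret projection matrix
H = rng.normal(size=(hidden_dim, n_queries))    # one hidden state per query
logits = W @ H                                  # stands in for observed API outputs

s = np.linalg.svd(logits, compute_uv=False)
# Count singular values that are not numerically zero.
recovered_dim = int(np.sum(s > s[0] * 1e-10))
print(recovered_dim)  # recovers hidden_dim = 64
```

In the simulation the attacker never sees W or H directly, only their product, yet the rank of the stacked logits exposes the model's hidden dimension; the paper's attack additionally recovers the projection matrix itself up to symmetries.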