June 18, 2024, 4:19 a.m. | Shangqing Tu, Zhuoran Pan, Wenxuan Wang, Zhexin Zhang, Yuliang Sun, Jifan Yu, Hongning Wang, Lei Hou, Juanzi Li

cs.CR updates on arXiv.org arxiv.org

arXiv:2406.11682v1 Announce Type: cross
Abstract: Large language models (LLMs) have been increasingly applied to various domains, which triggers increasing concerns about LLMs' safety on specialized domains, e.g. medicine. However, testing the domain-specific safety of LLMs is challenging due to the lack of domain knowledge-driven attacks in existing benchmarks. To bridge this gap, we propose a new task, knowledge-to-jailbreak, which aims to generate jailbreaks from domain knowledge to evaluate the safety of LLMs when applied to those domains. We collect a …

arxiv attack cs.ai cs.cl cs.cr jailbreak knowledge point

Information Technology Specialist I: Windows Engineer

@ Los Angeles County Employees Retirement Association (LACERA) | Pasadena, California

Information Technology Specialist I, LACERA: Information Security Engineer

@ Los Angeles County Employees Retirement Association (LACERA) | Pasadena, CA

Vice President, Controls Design & Development-7

@ State Street | Quincy, Massachusetts

Vice President, Controls Design & Development-5

@ State Street | Quincy, Massachusetts

Data Scientist & AI Prompt Engineer

@ Varonis | Israel

Contractor

@ Birlasoft | INDIA - MUMBAI - BIRLASOFT OFFICE, IN