OpenAI’s flagship AI model has gotten more trustworthy but easier to trick
The Verge - All Posts www.theverge.com
OpenAI’s GPT-4 large language model may be more trustworthy than GPT-3.5 but also more vulnerable to jailbreaking and bias, according to research backed by Microsoft.
The paper (by researchers from the University of Illinois Urbana-Champaign, Stanford University, the University of California, Berkeley, the Center for AI Safety, and Microsoft Research) gave GPT-4 a higher trustworthiness score than its predecessor. That means they found it was generally better at protecting private information, avoiding toxic results like biased information, and …