April 4, 2024, 5:21 p.m.

GovInfoSecurity.com RSS Syndication www.govinfosecurity.com

'Fictitious Dialogue' About Harmful Content Subverts Defenses, Researchers Find
After testing safety features built into generative artificial intelligence tools developed by the likes of Anthropic, OpenAI and Google DeepMind, researchers have discovered that a technique called "many-shot jailbreaking" can be used to defeat safety guardrails and obtain prohibited content.