all InfoSec news
BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets
March 21, 2024, 4:11 a.m. | Chen Gong, Zhou Yang, Yunpeng Bai, Junda He, Jieke Shi, Kecen Li, Arunesh Sinha, Bowen Xu, Xinwen Hou, David Lo, Tianhao Wang
cs.CR updates on arXiv.org arxiv.org
Abstract: Reinforcement learning (RL) makes an agent learn from trial-and-error experiences gathered during the interaction with the environment. Recently, offline RL has become a popular RL paradigm because it saves the interactions with environments. In offline RL, data providers share large pre-collected datasets, and others can train high-quality agents without interacting with the environments. This paradigm has demonstrated effectiveness in critical tasks like robot control, autonomous driving, etc. However, less attention is paid to investigating the …
agent arxiv backdoors baffle can cs.ai cs.cr cs.lg data datasets environment environments error experiences high large learn offline paradigm popular quality share train trial
More from arxiv.org / cs.CR updates on arXiv.org
Jobs in InfoSec / Cybersecurity
Senior Security Specialist, Forsah Technical and Vocational Education and Training (Forsah TVET) (NEW)
@ IREX | Ramallah, West Bank, Palestinian National Authority
Consultant(e) Junior Cybersécurité
@ Sia Partners | Paris, France
Senior Network Security Engineer
@ NielsenIQ | Mexico City, Mexico
Senior Consultant, Payment Intelligence
@ Visa | Washington, DC, United States
Corporate Counsel, Compliance
@ Okta | San Francisco, CA; Bellevue, WA; Chicago, IL; New York City; Washington, DC; Austin, TX
Security Operations Engineer
@ Samsara | Remote - US