BAFFLE: Hiding Backdoors in Offline Reinforcement Learning Datasets

March 21, 2024, 4:11 a.m. | Chen Gong, Zhou Yang, Yunpeng Bai, Junda He, Jieke Shi, Kecen Li, Arunesh Sinha, Bowen Xu, Xinwen Hou, David Lo, Tianhao Wang

cs.CR updates on arXiv.org

arXiv:2210.04688v5 Announce Type: replace-cross
Abstract: Reinforcement learning (RL) trains an agent through trial-and-error experience gathered while interacting with an environment. Offline RL has recently become a popular paradigm because it removes the need for that interaction: data providers share large pre-collected datasets, and others can train high-quality agents from them without interacting with the environment. The paradigm has proven effective in critical tasks such as robot control and autonomous driving. However, less attention is paid to investigating the …
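To make the offline setting described in the abstract concrete, here is a minimal sketch of learning purely from a fixed set of logged transitions, with no environment calls during training. It uses plain tabular Q-learning, which is a stand-in chosen for brevity and is not the method or the algorithms studied in the paper. The toy dataset, the function names, and the hyperparameters are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical pre-collected dataset of (state, action, reward, next_state, done)
# transitions on a 4-state chain. It plays the role of data shared by a provider.
DATASET = [
    (0, 1, 0.0, 1, False),
    (1, 1, 0.0, 2, False),
    (2, 1, 1.0, 3, True),   # reaching state 3 ends the episode with reward 1
    (0, 0, 0.0, 0, False),
    (1, 0, 0.0, 0, False),
    (2, 0, 0.0, 1, False),
]
ACTIONS = (0, 1)


def offline_q_learning(dataset, actions, gamma=0.9, lr=0.5, epochs=200):
    """Sweep repeatedly over the fixed dataset; the environment is never queried."""
    q = defaultdict(float)
    for _ in range(epochs):
        for s, a, r, s2, done in dataset:
            # Bootstrapped target: reward plus discounted best next-state value.
            target = r if done else r + gamma * max(q[(s2, b)] for b in actions)
            q[(s, a)] += lr * (target - q[(s, a)])
    return q


def greedy_policy(q, states, actions):
    """Pick the highest-valued action in each state."""
    return {s: max(actions, key=lambda a: q[(s, a)]) for s in states}


q_table = offline_q_learning(DATASET, ACTIONS)
policy = greedy_policy(q_table, states=(0, 1, 2), actions=ACTIONS)
print(policy)
```

Because the learner sees only what is in `DATASET`, the learned policy depends entirely on the contents of the shared data, which is why the trustworthiness of such datasets matters.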
