Adaptive Discounting of Training Time Attacks. (arXiv:2401.02652v1 [cs.LG])
Source: cs.CR updates on arXiv.org
Among the most insidious attacks on Reinforcement Learning (RL) solutions are
training-time attacks (TTAs) that create loopholes and backdoors in the learned
behaviour. Not limited to a simple disruption, constructive TTAs (C-TTAs) are
now available, where the attacker forces a specific, target behaviour upon a
training RL agent (victim). However, even state-of-the-art C-TTAs focus on
target behaviours that could be naturally adopted by the victim if not for a
particular feature of the environment dynamics, which C-TTAs exploit. In this …