Dec. 12, 2022, 2:10 a.m. | Dan Qiao, Yu-Xiang Wang

cs.CR updates on arXiv.org

Motivated by personalized healthcare and other applications involving
sensitive data, we study online exploration in reinforcement learning with
differential privacy (DP) constraints. Existing work on this problem
established that no-regret learning is possible under joint differential
privacy (JDP) and local differential privacy (LDP) but did not provide an
algorithm with optimal regret. We close this gap for the JDP case by designing
an $\epsilon$-JDP algorithm with a regret of
$\widetilde{O}(\sqrt{SAH^2T}+S^2AH^3/\epsilon)$, which matches the
information-theoretic lower bound of non-private learning for …
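
As a rough reading of the bound quoted above, assume the usual tabular
episodic convention (not spelled out in this excerpt) that $S$ is the number
of states, $A$ the number of actions, $H$ the horizon, and $T$ the total
number of steps. Ignoring constants and logarithmic factors, the privacy term
is dominated by the first term once $T$ is large enough:

$$
\frac{S^2AH^3}{\epsilon} \le \sqrt{SAH^2T}
\iff \frac{S^4A^2H^6}{\epsilon^2} \le SAH^2T
\iff T \ge \frac{S^3AH^4}{\epsilon^2}.
$$

Under those assumptions, the cost of privacy shows up only as a lower-order
additive term for sufficiently large $T$.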
