March 22, 2023, 1:10 a.m. | Ruixiang Tang, Qizhang Feng, Ninghao Liu, Fan Yang, Xia Hu

cs.CR updates on arXiv.org

The vast amount of training data available on the Internet has been a key
factor in the success of deep learning models. However, this abundance of
publicly available data also raises concerns about the unauthorized exploitation
of datasets for commercial purposes, which is forbidden by dataset licenses. In
this paper, we propose a backdoor-based watermarking approach that serves as a
general framework for safeguarding publicly available data. By inserting a small
number of watermarking samples into the dataset, our approach enables the
learning model …

