Did You Train on My Dataset? Towards Public Dataset Protection with Clean-Label Backdoor Watermarking. (arXiv:2303.11470v1 [cs.CR])
cs.CR updates on arXiv.org
The vast amount of training data available on the Internet has been a key
factor in the success of deep learning models. However, this abundance of
publicly available data also raises concerns about the unauthorized exploitation
of datasets for commercial purposes, which dataset licenses forbid. In
this paper, we propose a backdoor-based watermarking approach that serves as a
general framework for safeguarding publicly available data. By inserting a small
number of watermarking samples into the dataset, our approach enables the
learning model …
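
The idea of inserting a small number of watermarking samples without changing their labels (the "clean-label" part of the title) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the function name `insert_watermark_samples`, the fixed corner patch used as a trigger, the `fraction` and `patch_size` parameters, and the representation of images as nested lists of floats. The paper's actual watermark construction may differ.

```python
import random


def insert_watermark_samples(images, labels, target_class,
                             fraction=0.01, patch_size=2,
                             patch_value=1.0, seed=0):
    """Stamp a trigger patch on a small fraction of target-class images.

    Hypothetical illustration only. Labels are never modified, which is
    what makes the watermark "clean-label". Returns the new image list
    and the sorted indices of the watermarked samples.
    """
    rng = random.Random(seed)
    # Only samples that already carry the target label are eligible.
    candidates = [i for i, y in enumerate(labels) if y == target_class]
    # Watermark a small share of the whole dataset, capped by availability.
    count = min(len(candidates), max(1, round(fraction * len(images))))
    chosen = set(rng.sample(candidates, count))

    marked = []
    for i, img in enumerate(images):
        copy = [row[:] for row in img]  # leave the original data untouched
        if i in chosen:
            # Assumed trigger: a solid square in the bottom-right corner.
            for r in range(-patch_size, 0):
                for c in range(-patch_size, 0):
                    copy[r][c] = patch_value
        marked.append(copy)
    return marked, sorted(chosen)
```

Under these assumptions, a dataset owner would publish the returned images with the original labels and later test a suspect model on triggered inputs; how that verification is done is not covered by this sketch.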