PEOPL: Characterizing Privately Encoded Open Datasets with Public Labels. (arXiv:2304.00047v1 [cs.LG])
cs.CR updates on arXiv.org arxiv.org
Allowing organizations to share their data for training machine learning
(ML) models without unintended information leakage is an open problem in
practice. A promising approach is to train models on encoded data. Our
method, called Privately Encoded Open Datasets with Public Labels (PEOPL),
uses a certain class of randomly constructed transforms to encode sensitive
data. Organizations publish their randomly encoded data and associated raw
labels for ML training, where training is done without knowledge …
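To make the setup concrete, the following is a minimal sketch of the publish-encoded-data-with-raw-labels workflow, assuming a random orthogonal matrix as the private encoding transform. This choice of transform is an illustrative assumption, not the specific transform class defined in the paper; the key point is only that the data holder keeps the random seed private while releasing the encoded features and plain labels.

```python
import numpy as np

def random_orthogonal_encoder(dim, seed):
    # Hypothetical example of a "randomly constructed transform":
    # a random orthogonal matrix obtained from the QR decomposition
    # of a Gaussian matrix. The seed acts as the private key.
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))
    return Q

# --- Data holder side ---
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 8))          # sensitive raw features (kept private)
y = (X[:, 0] > 0).astype(int)          # labels, published in the clear
Q = random_orthogonal_encoder(8, seed=1234)  # seed is never published
X_enc = X @ Q                          # encoded features, published for training

# --- Model trainer side ---
# A third party fits a model on (X_enc, y) without ever seeing X or Q.
```

Because the transform is orthogonal in this sketch, geometric structure (pairwise distances, inner products) is preserved, so standard ML training on the encoded features remains meaningful even though the raw features are not revealed.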