April 4, 2023, 1:10 a.m. | Homa Esfahanizadeh, Adam Yala, Rafael G. L. D'Oliveira, Andrea J. D. Jaba, Victor Quach, Ken R. Duffy, Tommi S. Jaakkola, Vinod Vaikuntanathan, M

cs.CR updates on arXiv.org

Allowing organizations to share their data for training machine learning
(ML) models without unintended information leakage is an open problem in
practice. A promising technique is to train models on encoded data. Our
approach, called Privately Encoded Open Datasets with Public Labels (PEOPL),
uses a certain class of randomly constructed transforms to encode sensitive
data. Organizations publish their randomly encoded data and the associated
raw labels for ML training, where training is done without knowledge …
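The excerpt doesn't specify the exact transform family, but a minimal sketch of the workflow it describes (encode with a secret random transform, publish the encoded data plus raw labels) might look like the following; the two-layer random network, its dimensions, and the secret seed are illustrative assumptions, not the paper's actual construction.

```python
import numpy as np

def make_random_encoder(input_dim, hidden_dim, output_dim, seed):
    """Return a fixed, randomly constructed transform (here: a random
    two-layer ReLU network). The seed is the organization's secret and
    is never published."""
    rng = np.random.default_rng(seed)
    W1 = rng.standard_normal((input_dim, hidden_dim)) / np.sqrt(input_dim)
    W2 = rng.standard_normal((hidden_dim, output_dim)) / np.sqrt(hidden_dim)

    def encode(X):
        # Nonlinearity keeps the encoding from being undone by
        # simple linear algebra; the paper's exact class may differ.
        return np.maximum(X @ W1, 0.0) @ W2

    return encode

# Organization side: encode sensitive records with the secret transform.
X_private = np.random.rand(1000, 64)       # stand-in for sensitive features
y_labels = np.random.randint(0, 2, 1000)   # raw labels, published unchanged
encode = make_random_encoder(64, 256, 32, seed=123456)  # seed stays private
X_public = encode(X_private)

# Training side: any ML model is fit on (X_public, y_labels) with no
# knowledge of the transform; only the encoded view is ever seen.
```

Since the seed is never released, downstream training sees only the encoded representation, consistent with the abstract's note that training proceeds without knowledge of the encoding.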

