LLM-based Privacy Data Augmentation Guided by Knowledge Distillation with a Distribution Tutor for Medical Text Classification
Feb. 27, 2024, 5:11 a.m. | Yiping Song, Juhua Zhang, Zhiliang Tian, Yuxin Yang, Minlie Huang, Dongsheng Li
cs.CR updates on arXiv.org arxiv.org
Abstract: As sufficient data are not always publicly accessible for model training, researchers exploit limited data with advanced learning algorithms or expand the dataset via data augmentation (DA). Conducting DA in a private domain requires privacy protection approaches (e.g., anonymization and perturbation), but those methods cannot provide protection guarantees. Differential privacy (DP) learning methods theoretically bound the protection but struggle to generate pseudo text samples with large models. In this paper, we transfer DP-based pseudo …
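As context for the DP guarantee the abstract refers to, here is a minimal sketch of the classic Laplace mechanism, the textbook way DP "theoretically bounds" what an observer can learn from a released value. This is an illustration only, not the paper's method; the function name and parameters are hypothetical.

```python
import numpy as np

def laplace_mechanism(true_value: float, sensitivity: float, epsilon: float,
                      rng: np.random.Generator) -> float:
    """Release true_value with Laplace noise calibrated to sensitivity/epsilon.

    Noise scale = sensitivity / epsilon gives an epsilon-DP release:
    smaller epsilon means more noise and a stronger privacy guarantee.
    """
    scale = sensitivity / epsilon
    return true_value + rng.laplace(loc=0.0, scale=scale)

rng = np.random.default_rng(0)
# e.g., releasing a count over medical records; a count query has sensitivity 1
noisy_count = laplace_mechanism(42.0, sensitivity=1.0, epsilon=0.5, rng=rng)
```

The trade-off the abstract gestures at is visible here: the guarantee is rigorous, but the mechanism only perturbs numeric outputs, which is why generating realistic pseudo *text* under DP with large models is the harder problem the paper targets.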
Jobs in InfoSec / Cybersecurity
Social Engineer For Reverse Engineering Exploit Study
@ Independent study | Remote
Information Security Specialist, Sr. (Container Hardening)
@ Rackner | San Antonio, TX
Principal Security Researcher (Advanced Threat Prevention)
@ Palo Alto Networks | Santa Clara, CA, United States
EWT Infosec | IAM Technical Security Consultant - Manager
@ KPMG India | Bengaluru, Karnataka, India
Security Engineering Operations Manager
@ Gusto | San Francisco, CA; Denver, CO; Remote
Network Threat Detection Engineer
@ Meta | Denver, CO | Reston, VA | Menlo Park, CA | Washington, DC