Feb. 7, 2024, 5:10 a.m. | Da Yu Sivakanth Gopi Janardhan Kulkarni Zinan Lin Saurabh Naik Tomasz Lukasz Religa Jian Yin H

cs.CR updates on arXiv.org

Suppose we want to train text prediction models for email clients or word processors. These models, which serve billions of predictions per hour, must preserve the privacy of user data and adhere to strict model size constraints in order to meet memory and inference-time requirements and to reduce inference cost. Building small, fast, and private domain-specific language models is a thriving area of research. In this work, we show that a careful pre-training on a *subset* of the public dataset that …
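To make the idea of pre-training on a chosen subset of public data concrete, here is a minimal sketch of what the subset-selection step might look like. The unigram-overlap scorer, the function names, and the `keep_fraction` parameter are illustrative assumptions for this sketch, not the paper's actual selection method (the abstract above is truncated before the method is described).

```python
# Hypothetical sketch: rank public-corpus examples by similarity to a small
# in-domain sample, then keep only the top fraction for pre-training.
# The unigram-overlap scorer is an illustrative assumption, not the paper's method.
from collections import Counter


def domain_vocabulary(in_domain_texts):
    """Build a unigram frequency profile from a small in-domain sample."""
    counts = Counter()
    for text in in_domain_texts:
        counts.update(text.lower().split())
    return counts


def relevance_score(text, vocab):
    """Average in-domain frequency of the example's tokens."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(vocab[t] for t in tokens) / len(tokens)


def select_subset(public_texts, in_domain_texts, keep_fraction=0.25):
    """Keep the top `keep_fraction` of public examples by relevance."""
    vocab = domain_vocabulary(in_domain_texts)
    ranked = sorted(public_texts,
                    key=lambda t: relevance_score(t, vocab),
                    reverse=True)
    k = max(1, int(len(ranked) * keep_fraction))
    return ranked[:k]


if __name__ == "__main__":
    # A tiny in-domain sample standing in for (private) email-style text.
    private_sample = ["meeting agenda attached",
                      "please review the attached draft"]
    public_corpus = [
        "please find the agenda attached",
        "the quick brown fox jumps",
        "review the draft before the meeting",
        "lorem ipsum dolor sit amet",
    ]
    subset = select_subset(public_corpus, private_sample, keep_fraction=0.5)
    print(subset)
```

A model pre-trained on such a filtered subset would then be fine-tuned on the private data with a differentially private optimizer, so only the public-data selection sees the domain signal directly.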
