Generalization or Memorization: Data Contamination and Trustworthy Evaluation for Large Language Models
Feb. 27, 2024, 5:11 a.m. | Yihong Dong, Xue Jiang, Huanyu Liu, Zhi Jin, Ge Li
cs.CR updates on arXiv.org arxiv.org
Abstract: Recent claims about the impressive capabilities of large language models (LLMs) are usually supported by evaluation on open-access benchmarks. Given the vast size and wide-ranging sources of LLMs' training data, that data could explicitly or implicitly include test data, making LLMs more susceptible to data contamination. However, due to the opacity of training data, the black-box access of models, and the rapid growth of synthetic training data, detecting and mitigating data contamination for LLMs …
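To make the contamination problem concrete, a common generic check (not the method proposed in this paper; the function names and threshold here are illustrative assumptions) measures verbatim n-gram overlap between a benchmark sample and a training document:

```python
# Illustrative n-gram overlap check for data contamination.
# A minimal sketch of a generic technique, not the paper's method.

def ngrams(text, n=8):
    """Return the set of word-level n-grams in `text`."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def overlap_ratio(test_sample, training_doc, n=8):
    """Fraction of the test sample's n-grams that appear verbatim
    in the training document; values near 1.0 suggest the sample
    leaked into the training corpus."""
    test_grams = ngrams(test_sample, n)
    if not test_grams:
        return 0.0
    train_grams = ngrams(training_doc, n)
    return len(test_grams & train_grams) / len(test_grams)
```

As the abstract notes, such surface-level checks break down when training data is opaque, the model is black-box, or the contamination comes from paraphrased synthetic data rather than verbatim copies.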
Jobs in InfoSec / Cybersecurity
Information Security Engineers
@ D. E. Shaw Research | New York City
Technology Security Analyst
@ Halton Region | Oakville, Ontario, Canada
Senior Cyber Security Analyst
@ Valley Water | San Jose, CA
COMM Penetration Tester (PenTest-2), Chantilly, VA OS&CI Job #368
@ Allen Integrated Solutions | Chantilly, Virginia, United States
IS Security Consultant (M/F), Governance - Risk - Compliance
@ Hifield | Sèvres, France
Infrastructure Consultant
@ Telefonica Tech | Belfast, United Kingdom