Generalization or Memorization: Data Contamination and Trustworthy Evaluation for Large Language Models
Feb. 27, 2024, 5:11 a.m. | Yihong Dong, Xue Jiang, Huanyu Liu, Zhi Jin, Ge Li
cs.CR updates on arXiv.org arxiv.org
Abstract: Recent claims about the impressive capabilities of large language models (LLMs) are usually supported by evaluation on open-access benchmarks. Given the vast size and wide-ranging sources of LLMs' training data, that data could explicitly or implicitly include test data, making LLMs more susceptible to data contamination. However, due to the opacity of training data, the black-box access of models, and the rapid growth of synthetic training data, detecting and mitigating data contamination for LLMs …
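One common heuristic for the contamination problem the abstract describes is checking n-gram overlap between a benchmark sample and the training corpus. The sketch below is illustrative only, not the method proposed in this paper; the function names, the default n-gram size of 8, and the overlap-ratio threshold are all assumptions for the example.

```python
# Illustrative n-gram overlap check for data contamination.
# NOT the paper's method; n=8 and the threshold are arbitrary choices.

def ngrams(text: str, n: int = 8) -> set:
    """Return the set of whitespace-token n-grams in a text."""
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_ratio(test_sample: str, training_doc: str, n: int = 8) -> float:
    """Fraction of the test sample's n-grams that also appear in the
    training document. A high ratio suggests possible contamination."""
    test_grams = ngrams(test_sample, n)
    if not test_grams:
        return 0.0
    return len(test_grams & ngrams(training_doc, n)) / len(test_grams)

def looks_contaminated(test_sample: str, training_doc: str,
                       n: int = 8, threshold: float = 0.5) -> bool:
    """Flag a sample whose overlap ratio exceeds an (assumed) threshold."""
    return overlap_ratio(test_sample, training_doc, n) >= threshold
```

In practice such exact-match checks miss paraphrased or synthetically rewritten test data, which is one reason the abstract calls contamination detection difficult under black-box access.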