Feb. 27, 2024, 5:11 a.m. | Yihong Dong, Xue Jiang, Huanyu Liu, Zhi Jin, Ge Li

cs.CR updates on arXiv.org

arXiv:2402.15938v1 Announce Type: cross
Abstract: Recent statements about the impressive capabilities of large language models (LLMs) are usually supported by evaluation on open-access benchmarks. Considering the vast size and wide-ranging sources of LLMs' training data, that data could explicitly or implicitly include test data, making LLMs more susceptible to data contamination. However, due to the opacity of training data, black-box access to models, and the rapid growth of synthetic training data, detecting and mitigating data contamination for LLMs …
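The snippet cuts off before the paper's own method, so as a generic illustration only: a common baseline for contamination detection is word-level n-gram overlap between benchmark test items and the training corpus (GPT-3's decontamination analysis used 13-grams). Below is a minimal sketch assuming access to the training text; the function names and the 0.5 overlap threshold are illustrative assumptions, not taken from the paper.

```python
from typing import Iterable, Set, Tuple

def ngrams(text: str, n: int = 13) -> Set[Tuple[str, ...]]:
    """Word-level n-grams of a text, lowercased for robust matching."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def is_contaminated(test_item: str, training_docs: Iterable[str],
                    n: int = 13, threshold: float = 0.5) -> bool:
    """Flag a benchmark item whose n-grams overlap heavily with training docs.

    Hypothetical helper for illustration: n=13 mirrors the n-gram size
    used in GPT-3's decontamination analysis; the 0.5 overlap threshold
    is an arbitrary illustrative choice.
    """
    test_grams = ngrams(test_item, n)
    if not test_grams:  # item shorter than n tokens: nothing to match
        return False
    train_grams: Set[Tuple[str, ...]] = set()
    for doc in training_docs:
        train_grams |= ngrams(doc, n)
    overlap = len(test_grams & train_grams) / len(test_grams)
    return overlap >= threshold
```

Note that this kind of surface matching presupposes the very thing the abstract flags as missing: visibility into the training data. With opaque corpora, black-box model access, and growing synthetic training data, such checks break down, which is presumably the setting the paper targets.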

Categories: cs.AI, cs.CL, cs.CR, cs.LG, cs.SE
