June 19, 2023, 1:10 a.m. | Lukas Fluri, Daniel Paleka, Florian Tramèr

cs.CR updates on arXiv.org arxiv.org

If machine learning models were to achieve superhuman abilities at various
reasoning or decision-making tasks, how would we go about evaluating such
models, given that humans would necessarily be poor proxies for ground truth?
In this paper, we propose a framework for evaluating superhuman models via
consistency checks. Our premise is that while the correctness of superhuman
decisions may be impossible to evaluate, we can still surface mistakes if the
model's decisions fail to satisfy certain logical, human-interpretable rules.
We …
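To make the premise concrete, here is a minimal sketch (not from the paper) of one human-interpretable consistency rule: a probabilistic forecaster's estimates for a statement and its negation should sum to roughly one. The `predict_prob` interface and the `DummyForecaster` class are assumptions for illustration only.

```python
# Hedged sketch of a negation-consistency check. Even without ground truth,
# a model that assigns P(A) and P(not A) far from summing to 1 has made a
# detectable mistake. `predict_prob` is an assumed interface, not the
# paper's actual API.

class DummyForecaster:
    """Stand-in model for illustration; always answers 0.7."""
    def predict_prob(self, statement: str) -> float:
        return 0.7


def negation_consistency_violation(model, statement: str, negation: str,
                                   tol: float = 0.05) -> bool:
    """Return True if the model's probabilities for a statement and its
    negation deviate from summing to 1 by more than `tol`."""
    p = model.predict_prob(statement)      # P(statement is true)
    p_neg = model.predict_prob(negation)   # P(negation is true)
    return abs((p + p_neg) - 1.0) > tol


if __name__ == "__main__":
    model = DummyForecaster()
    pair = ("Team A wins the final", "Team A does not win the final")
    # 0.7 + 0.7 = 1.4, so this pair is flagged as inconsistent.
    print(negation_consistency_violation(model, *pair))
```

The same pattern generalizes to other rules the paper discusses, such as invariance of a chess engine's evaluation under board symmetries or transitivity of pairwise preferences.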

