Evaluating Superhuman Models with Consistency Checks. (arXiv:2306.09983v1 [cs.LG])
cs.CR updates on arXiv.org
If machine learning models were to achieve superhuman abilities at various
reasoning or decision-making tasks, how would we go about evaluating such
models, given that humans would necessarily be poor proxies for ground truth?
In this paper, we propose a framework for evaluating superhuman models via
consistency checks. Our premise is that while the correctness of superhuman
decisions may be impossible to evaluate, we can still surface mistakes if the
model's decisions fail to satisfy certain logical, human-interpretable rules.
We …
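The premise lends itself to a small illustration. Below is a minimal, hypothetical sketch (not the paper's code) of one human-interpretable consistency rule: a game-position evaluator should assign antisymmetric scores when the sides are swapped, so a violation flags a mistake even though no human can say which score is correct. The function and parameter names (evaluate, swap_sides, positions, tolerance) are assumptions made for illustration only.

```python
def find_symmetry_violations(evaluate, swap_sides, positions, tolerance=0.05):
    """Return positions where a model's score is not antisymmetric
    under swapping which side is to move.

    Hypothetical sketch of a consistency check in the spirit of the abstract:
    no ground truth is needed, only the logical rule
    evaluate(swap_sides(p)) ≈ -evaluate(p).
    """
    violations = []
    for pos in positions:
        score = evaluate(pos)
        mirrored_score = evaluate(swap_sides(pos))
        # Consistency rule violated: the two scores should (roughly) cancel.
        if abs(score + mirrored_score) > tolerance:
            violations.append((pos, score, mirrored_score))
    return violations
```

Any position returned here is evidence of an internal inconsistency, which is the kind of mistake the framework aims to surface without relying on human judgments of correctness.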