Feb. 13, 2024, 5:11 a.m. | Rahul Sharma, Sergey Redyuk, Sumantrak Mukherjee, Andrea Sipka, Sebastian Vollmer, David Selby

cs.CR updates on arXiv.org

Explainable AI (XAI) and interpretable machine learning methods help build trust in model predictions and derived insights, yet they also present a perverse incentive for analysts to manipulate XAI metrics to support pre-specified conclusions. This paper introduces the concept of X-hacking, a form of p-hacking applied to XAI metrics such as SHAP values. We show how an automated machine learning pipeline can be used to search for 'defensible' models that produce a desired explanation while maintaining superior predictive performance to …
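
To make the idea concrete, here is a minimal sketch of the kind of search the abstract describes: fit many candidate models, keep those whose test accuracy is near the best (the 'defensible' set), then report the one whose explanation best supports a pre-specified conclusion. This is an illustration, not the paper's pipeline: it uses scikit-learn's permutation importance as a simple stand-in for SHAP values, and the model families, the `TARGET_FEATURE` index, and the 1-percentage-point accuracy tolerance are all assumptions made for the example.

```python
# Hypothetical X-hacking sketch: search over models, then cherry-pick the
# explanation. Permutation importance stands in for SHAP values here.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

TARGET_FEATURE = 3  # assumption: the feature the analyst "wants" to look important

# A small model search space (depths and seeds), mimicking an AutoML sweep.
candidates = [
    RandomForestClassifier(max_depth=d, random_state=s)
    for d in (2, 4, 8) for s in range(3)
] + [
    GradientBoostingClassifier(max_depth=d, random_state=s)
    for d in (2, 3) for s in range(3)
]

results = []
for model in candidates:
    model.fit(X_tr, y_tr)
    acc = model.score(X_te, y_te)
    imp = permutation_importance(model, X_te, y_te, n_repeats=5,
                                 random_state=0).importances_mean
    results.append((acc, imp[TARGET_FEATURE], model))

best_acc = max(acc for acc, _, _ in results)
# "Defensible": within 1 percentage point of the best test accuracy.
defensible = [r for r in results if r[0] >= best_acc - 0.01]
# X-hack: among defensible models, report the one whose explanation most
# strongly supports the pre-specified conclusion about TARGET_FEATURE.
acc, target_imp, chosen = max(defensible, key=lambda r: r[1])
print(f"chosen: {chosen.__class__.__name__}, "
      f"accuracy={acc:.3f}, target importance={target_imp:.3f}")
```

The sweep over hyperparameters and random seeds is what gives the analyst room to maneuver: many near-equivalent models yield very different explanations, and only the convenient one gets reported.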
