Calibration Attack: A Framework For Adversarial Attacks Targeting Calibration. (arXiv:2401.02718v1 [cs.LG])
cs.CR updates on arXiv.org
We introduce a new framework of adversarial attacks, named calibration attacks, in which attacks are generated and organized to trap victim models into becoming miscalibrated without altering their original accuracy, thereby seriously endangering the trustworthiness of the models and any decision-making based on their confidence scores. Specifically, we identify four novel forms of calibration attacks: underconfidence attacks, overconfidence attacks, maximum miscalibration attacks, and random confidence attacks, in both the black-box and white-box setups. We then test these new attacks …
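
For context: a classifier is well calibrated when its confidence scores match its empirical accuracy, so a calibration attack degrades that match while leaving every prediction, and hence the accuracy, intact. The abstract includes no code, but the underconfidence variant lends itself to a minimal sketch. The following PGD-style loop is a hypothetical illustration, not the paper's method; it assumes a PyTorch image classifier taking NCHW inputs in [0, 1], and the names model, eps, alpha, and steps are illustrative.

import torch
import torch.nn.functional as F

def underconfidence_attack(model, x, eps=0.03, alpha=0.005, steps=20):
    # PGD-style loop that minimizes the max softmax probability while
    # rejecting any step that would flip the predicted label, so the
    # model's accuracy on x is unchanged but its confidence drops.
    model.eval()
    with torch.no_grad():
        y_pred = model(x).argmax(dim=1)  # labels the attack must preserve
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        conf = F.softmax(model(x_adv), dim=1).max(dim=1).values
        grad = torch.autograd.grad(conf.sum(), x_adv)[0]
        with torch.no_grad():
            cand = x_adv - alpha * grad.sign()   # step down the confidence
            # project back into the eps-ball and the valid pixel range
            cand = torch.clamp(cand, x - eps, x + eps).clamp(0.0, 1.0)
            # keep only perturbations that leave the prediction intact
            keep = model(cand).argmax(dim=1) == y_pred
            x_adv = torch.where(keep.view(-1, 1, 1, 1), cand, x_adv).detach()
    return x_adv

Flipping the sign of the update would sketch the overconfidence variant; how the maximum miscalibration and random confidence attacks set their confidence targets is detailed in the paper itself.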