HM-Conformer: A Conformer-based audio deepfake detection system with hierarchical pooling and multi-level classification token aggregation methods. (arXiv:2309.08208v1 [cs.SD])
cs.CR updates on arXiv.org (arxiv.org)
Audio deepfake detection (ADD) is the task of detecting spoofing attacks
generated by text-to-speech or voice conversion systems. Spoofing evidence,
which helps to distinguish spoofed from bona fide utterances, may
exist either locally or globally in the input features. The Conformer, which
combines Transformer and CNN components, has a structure suited to capturing
both. However, since the Conformer was designed for sequence-to-sequence
tasks, applying it directly to ADD may be sub-optimal. To tackle this
limitation, we propose HM-Conformer by …
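
The local-versus-global distinction in the abstract can be illustrated with a toy sketch. This is not the HM-Conformer method, whose details are cut off above, and it is not a full Conformer block either: learned projections, feed-forward modules, and normalization are all omitted. The function names, the fixed kernel, and the tiny feature sequence are assumptions made purely for illustration. The sketch only shows that a convolution branch mixes each frame with its neighbours, while a self-attention branch mixes each frame with every other frame.

```python
import math

def depthwise_conv1d(frames, kernel):
    """Local mixing: each output frame sees only a small window of neighbours."""
    half = len(kernel) // 2
    dim = len(frames[0])
    out = []
    for t in range(len(frames)):
        acc = [0.0] * dim
        for k, w in enumerate(kernel):
            s = t + k - half
            if 0 <= s < len(frames):  # zero padding at the sequence edges
                for d in range(dim):
                    acc[d] += w * frames[s][d]
        out.append(acc)
    return out

def self_attention(frames):
    """Global mixing: each output frame is a softmax-weighted sum of all frames.

    Assumption: queries, keys, and values are the raw frames (no learned weights).
    """
    dim = len(frames[0])
    scale = math.sqrt(dim)
    out = []
    for q in frames:
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in frames]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        out.append([sum(w * v[d] for w, v in zip(weights, frames))
                    for d in range(dim)])
    return out

def toy_block(frames, kernel):
    """Hypothetical simplified block: attention branch, then convolution branch,
    each added back to its input as a residual."""
    attn = self_attention(frames)
    mixed = [[x + a for x, a in zip(f, r)] for f, r in zip(frames, attn)]
    conv = depthwise_conv1d(mixed, kernel)
    return [[m + c for m, c in zip(fm, fc)] for fm, fc in zip(mixed, conv)]

# Four frames of two-dimensional features (made-up values).
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
kernel = [0.25, 0.5, 0.25]
output = toy_block(frames, kernel)
```

Changing the last frame leaves the convolution output at the first frame untouched, because that frame lies outside the kernel window, but it does change the attention output there. In that sense the two branches are sensitive to local and global evidence respectively.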