Web: http://arxiv.org/abs/2204.12622

April 28, 2022, 1:20 a.m. | Guillaume Baril, Patrick Cardinal, Alessandro Lameiras Koerich

cs.CR updates on arXiv.org

Data anonymization is often carried out manually. Automating it would reduce
the cost and time the task requires. This paper presents a pipeline that
automates the anonymization of audio data in French: it takes audio files with
their transcriptions and removes the named entities (NEs) present in the
audio. The pipeline is made up of a forced aligner, which aligns the words of
an audio transcript with the speech, and a model that …
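
The removal step described above can be illustrated with a minimal sketch. It
is not the authors' implementation. It assumes the forced aligner has already
produced (word, start, end) timestamps in seconds, that the positions of the NE
words in the transcript are already known, and that "removing" an entity means
replacing its samples with silence. The function name mask_entities and all
parameter names are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

# One aligned word: (word, start time in seconds, end time in seconds).
# This tuple layout is an assumption, not a format taken from the paper.
Alignment = Tuple[str, float, float]


def mask_entities(
    samples: Sequence[float],
    sample_rate: int,
    alignment: Sequence[Alignment],
    entity_indices: Iterable[int],
) -> List[float]:
    """Return a copy of `samples` with the aligned NE words silenced."""
    out = list(samples)
    for i in entity_indices:
        _, start, end = alignment[i]
        # Convert the time span to sample indices, clamped to the signal.
        lo = max(0, round(start * sample_rate))
        hi = min(len(out), round(end * sample_rate))
        for k in range(lo, hi):
            out[k] = 0.0  # assumption: silence stands in for "removal"
    return out


if __name__ == "__main__":
    audio = [1.0] * 8  # toy signal: one second at 8 samples per second
    words = [("bonjour", 0.0, 0.25), ("Marie", 0.25, 0.5), ("merci", 0.5, 1.0)]
    print(mask_entities(audio, 8, words, {1}))
```

Only the span aligned to the flagged word is altered, so the rest of the
recording keeps its original timing.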


