LEARNING INTERPRETABLE FILTERS IN WAV-UNET FOR SPEECH ENHANCEMENT - Laboratoire Traitement et Communication de l'Information
Conference Paper, Year: 2023

LEARNING INTERPRETABLE FILTERS IN WAV-UNET FOR SPEECH ENHANCEMENT

Félix Mathieu
Thomas Courtat

Abstract

Due to their performance, deep neural networks have emerged as a major method in nearly all modern audio processing applications. Deep neural networks can be used to estimate some parameters or hyperparameters of a model or, in some cases, the entire model in an end-to-end fashion. Although deep learning can lead to state-of-the-art performance, such models also suffer from inherent weaknesses, as they usually remain complex and non-interpretable to a large extent. For instance, the internal filters used in each layer are chosen in an ad hoc manner, with only a loose relation to the nature of the processed signal. In this paper, we propose an approach to learn interpretable filters within a specific neural architecture, which allows us to better understand the behaviour of the neural network and to reduce its complexity. We validate the approach on a speech enhancement task and show that the gain in interpretability does not degrade the performance of the model.
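The abstract does not specify which filter parameterization the authors use. As a hedged illustration of the general idea only, a common way to make learned filters interpretable is to constrain each filter to an analytic family (here, Gabor filters) so that only a few physically meaningful scalars (center frequency, bandwidth) would be learned instead of free filter taps:

```python
import numpy as np

def gabor_filterbank(center_freqs, bandwidths, length=64, sr=16000):
    """Build time-domain Gabor filters from interpretable parameters.

    In a parameterized conv layer, only `center_freqs` and `bandwidths`
    (a handful of scalars per filter) would be trainable, rather than
    `length` unconstrained taps per filter.
    """
    # Time axis centered on zero, in seconds
    t = (np.arange(length) - length // 2) / sr
    filters = []
    for f0, bw in zip(center_freqs, bandwidths):
        # Gaussian envelope whose width is set by the bandwidth parameter
        envelope = np.exp(-0.5 * (2 * np.pi * bw * t) ** 2)
        # Sinusoidal carrier at the center frequency
        carrier = np.cos(2 * np.pi * f0 * t)
        h = envelope * carrier
        filters.append(h / np.max(np.abs(h)))  # peak-normalize
    return np.stack(filters)

# Two example filters: a narrow one at 500 Hz, a wider one at 2 kHz
bank = gabor_filterbank([500.0, 2000.0], [100.0, 400.0])
```

Each row of `bank` is a valid convolution kernel whose frequency response can be read directly off its two parameters, which is the kind of interpretability the abstract refers to; the names and the Gabor choice here are illustrative assumptions, not the paper's actual design.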
Main file
MATHIEU_ICASSP_2023-2.pdf (466.8 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04048829 , version 1 (28-03-2023)

Identifiers

  • HAL Id : hal-04048829 , version 1

Cite

Félix Mathieu, Thomas Courtat, Gael Richard, Geoffroy Peeters. LEARNING INTERPRETABLE FILTERS IN WAV-UNET FOR SPEECH ENHANCEMENT. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Jun 2023, Rhodes, Greece. ⟨hal-04048829⟩