LEARNING INTERPRETABLE FILTERS IN WAV-UNET FOR SPEECH ENHANCEMENT - Laboratoire Traitement et Communication de l'Information
Conference Paper, Year: 2023

LEARNING INTERPRETABLE FILTERS IN WAV-UNET FOR SPEECH ENHANCEMENT

Félix Mathieu
Thomas Courtat

Abstract

Due to their performance, deep neural networks have emerged as a major method in nearly all modern audio processing applications. Deep neural networks can be used to estimate some parameters or hyperparameters of a model or, in some cases, the entire model in an end-to-end fashion. Although deep learning can lead to state-of-the-art performance, such models also suffer from inherent weaknesses, as they usually remain complex and non-interpretable to a large extent. For instance, the internal filters used in each layer are chosen in an ad hoc manner, with only a loose relation to the nature of the processed signal. In this paper, we propose an approach to learn interpretable filters within a specific neural architecture, which allows us to better understand the behaviour of the neural network and to reduce its complexity. We validate the approach on a speech enhancement task and show that the gain in interpretability does not degrade the performance of the model.
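The abstract does not specify which filter parameterization the authors use. As a hedged illustration of the general idea only, a common way to make learned filters interpretable is to constrain each filter to an analytic family (here, Gabor filters) so that only a few physically meaningful scalars (center frequency, bandwidth) would be learned instead of free filter taps:

```python
import numpy as np

def gabor_filterbank(center_freqs, bandwidths, length=64, sr=16000):
    """Build time-domain Gabor filters from interpretable parameters.

    In a parameterized conv layer, only `center_freqs` and `bandwidths`
    (a handful of scalars per filter) would be trainable, rather than
    `length` unconstrained taps per filter.
    """
    # Time axis centered on zero, in seconds
    t = (np.arange(length) - length // 2) / sr
    filters = []
    for f0, bw in zip(center_freqs, bandwidths):
        # Gaussian envelope whose width is set by the bandwidth parameter
        envelope = np.exp(-0.5 * (2 * np.pi * bw * t) ** 2)
        # Sinusoidal carrier at the center frequency
        carrier = np.cos(2 * np.pi * f0 * t)
        h = envelope * carrier
        filters.append(h / np.max(np.abs(h)))  # peak-normalize
    return np.stack(filters)

# Two example filters: a narrow one at 500 Hz, a wider one at 2 kHz
bank = gabor_filterbank([500.0, 2000.0], [100.0, 400.0])
```

Each row of `bank` is a valid convolution kernel whose frequency response can be read directly off its two parameters, which is the kind of interpretability the abstract refers to; the names and the Gabor choice here are illustrative assumptions, not the paper's actual design.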
Main file
MATHIEU_ICASSP_2023-2.pdf (466.8 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04048829 , version 1 (28-03-2023)

Identifiers

  • HAL Id : hal-04048829 , version 1

Cite

Félix Mathieu, Thomas Courtat, Gael Richard, Geoffroy Peeters. LEARNING INTERPRETABLE FILTERS IN WAV-UNET FOR SPEECH ENHANCEMENT. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Jun 2023, Rhodes, Greece. ⟨hal-04048829⟩