Conference paper, Year: 2024

Structure-informed Positional Encoding for Music Generation

Abstract

Music generated by deep learning methods often suffers from a lack of coherence and long-term organization. Yet, multi-scale hierarchical structure is a distinctive feature of music signals. To leverage this information, we propose a structure-informed positional encoding framework for music generation with Transformers. We design three variants in terms of absolute, relative and non-stationary positional information. We comprehensively test them on two symbolic music generation tasks: next-timestep prediction and accompaniment generation. As a comparison, we choose multiple baselines from the literature and demonstrate the merits of our methods using several musically-motivated evaluation metrics. In particular, our methods improve the melodic and structural consistency of the generated pieces.
Main file: svbwdvrdnrztpzxgdsckkhqxkjbjpfzx.pdf (1.16 MB)
Origin: Files produced by the author(s)

Dates and versions

hal-04432659, version 1 (15-02-2024)
hal-04432659, version 2 (20-02-2024)
hal-04432659, version 3 (28-02-2024)

Cite

Manvi Agarwal, Changhong Wang, Gaël Richard. Structure-informed Positional Encoding for Music Generation. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2024, Seoul, South Korea. ⟨hal-04432659v3⟩
