This repository contains the code and data associated with the paper "Exploring the Inner Mechanisms of Large Generative Music Models". In the paper, we focus on MusicGen, a state-of-the-art generative music model, and explore how existing interpretability techniques from the text domain transfer to the music domain. Our research provides insights into how MusicGen constructs human-interpretable musicological concepts. See the paper and the supplemental material for details.
Key contributions include:
- We demonstrate the application of DecoderLens, a technique that provides insight into how MusicGen composes musical concepts over time (a lens-style sketch follows this list).
- We employ interchange interventions to observe how individual components of the model contribute to the generation of specific instruments and genres (a minimal sketch also follows this list).
- We identify several limitations of these techniques when applied to music, highlighting the need for adaptations tailored to the complexity of audio data.
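To make the DecoderLens idea concrete, the sketch below shows a generic lens-style readout for a decoder-only transformer: every intermediate hidden state is passed through the model's final normalization and output head, as if that layer were the last one. The attribute names (`embed`, `layers`, `final_norm`, `lm_head`) are hypothetical stand-ins rather than MusicGen's actual API, and the sketch illustrates the general idea, not the paper's exact procedure.

```python
# Generic lens-style readout in the spirit of DecoderLens.
# All attribute names (embed, layers, final_norm, lm_head) are
# hypothetical stand-ins, not MusicGen's actual API.
import torch

@torch.no_grad()
def lens_readout(model, tokens):
    """Return one logits tensor per layer by applying the final output
    head to every intermediate hidden state."""
    hidden = model.embed(tokens)
    per_layer_logits = []
    for layer in model.layers:
        hidden = layer(hidden)
        # Decode "early": treat this layer's output as if it were final.
        per_layer_logits.append(model.lm_head(model.final_norm(hidden)))
    return per_layer_logits
```

In the audio setting, the per-layer token predictions can then be decoded back to waveforms, which is the kind of output saved under `decoder_lens_outputs/`.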
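An interchange intervention swaps an activation recorded on one input (say, a prompt naming one instrument) into a forward pass on another input, to test whether that component carries the concept. Below is a minimal PyTorch sketch using forward hooks on a generic `nn.Module`; it shows the general recipe, not the repository's implementation.

```python
# Interchange intervention sketch using PyTorch forward hooks.
# `model` and `layer` are generic nn.Modules; this is not the
# repository's implementation.
import torch

@torch.no_grad()
def interchange_intervention(model, layer, base_input, source_input):
    """Run `base_input` through `model`, but replace `layer`'s output
    with the activation recorded while processing `source_input`."""
    cache = {}

    def record(module, inputs, output):
        cache["act"] = output  # remember the source activation

    handle = layer.register_forward_hook(record)
    model(source_input)
    handle.remove()

    def patch(module, inputs, output):
        # Returning a non-None value from a forward hook overrides
        # the module's output. Shapes must match across the two runs.
        return cache["act"]

    handle = layer.register_forward_hook(patch)
    patched_output = model(base_input)
    handle.remove()
    return patched_output
```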
Repository structure:

- `audio_samples/`: contains the audio samples of the DecoderLens experiments, the interchange intervention experiments, and the original unintervened audio, as displayed on our GitHub Pages: https://marcel-velez.github.io/musicgen-mech-interp/
- `decoder_lens_outputs/`: directory where the DecoderLens outputs are saved.
- `decoderlens/`: the codebase for the DecoderLens experiments.
- `interchange_interventions/`: the codebase for the interchange intervention experiments.
- `mean_activations/`: directory holding the mean activations of the interchange intervention experiments; the intervention pipeline saves mean activations here automatically, so they do not have to be recomputed on every run.
- `intervention_main.py`: the main file for running the interchange intervention experiments; see "Usage of code - running interchange interventions" below for details.
- `decoderlens_main.py`: the main file for running the DecoderLens experiments; see "Usage of code - running DecoderLens" below for details.
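The caching behaviour of `mean_activations/` is a compute-once, reuse-later pattern, roughly as in the sketch below. The file layout and the helper names (`get_mean_activation`, `compute_fn`) are hypothetical, not the repository's exact code.

```python
# Sketch of the mean-activation cache used by the intervention pipeline:
# compute once, save under mean_activations/, and reload on later runs.
# Function names and the file layout are hypothetical.
import os
import torch

def get_mean_activation(concept, compute_fn, cache_dir="mean_activations"):
    path = os.path.join(cache_dir, f"{concept}.pt")
    if os.path.exists(path):
        return torch.load(path)      # reuse the cached mean activation
    mean_act = compute_fn(concept)   # expensive: runs prompts through the model
    os.makedirs(cache_dir, exist_ok=True)
    torch.save(mean_act, path)       # cache for the next run
    return mean_act
```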
Installation:

```
pip install -r requirements.txt
```

Usage of code - running DecoderLens:

```
python decoderlens_main.py
```

Usage of code - running interchange interventions. Running all intervention concept and desired concept pairs:

```
python intervention_main.py
```
Configuration options:

`intervention_main.py` accepts the following command-line arguments:
- `--fixed_concept`: set the fixed concept for training or evaluation; use `-1` if no fixed concept is desired.
  - type: `int`
  - default: `-1`
- `--model_size`: choose the model size to use.
  - type: `str`
  - options: `small`, `medium`, `large`
  - default: `small`
- `--n_prompts_per_concept`: the number of prompts to generate per concept.
  - type: `int`
  - default: `100`
- `--pre_computed_means`: use pre-computed means when available, and save them when they are not found.
  - action: `store_true` (sets the value to `True` when the flag is used)
  - default: `False`
- `--gpus`: enable GPU support for training or evaluation.
  - action: `store_true` (sets the value to `True` when the flag is used)
  - default: `False`
- `--save_extremes`: save the extreme cases, top-k and bottom-k.
  - action: `store_true` (sets the value to `True` when the flag is used)
  - default: `False`
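For example, a hypothetical invocation using the medium model, 50 prompts per concept, cached means, GPU support, and extreme-case saving (the flag values are only illustrative):

```
python intervention_main.py --model_size medium --n_prompts_per_concept 50 --pre_computed_means --gpus --save_extremes
```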
If you have any questions, please feel free to contact us.
Citation:
```
@inproceedings{velezvasquez2024,
  title     = {Exploring the Inner Mechanisms of Large Generative Music Models},
  author    = {Vélez Vásquez, M.A. and Pouw, C. and Burgoyne, J.A. and Zuidema, W.},
  booktitle = {Proceedings of the 25th International Society for Music Information Retrieval Conference},
  year      = {2024},
}
```