Weihan Xu, Julian McAuley, Taylor Berg-Kirkpatrick, Shlomo Dubnov, Hao-Wen Dong
The 26th conference of the International Society for Music Information Retrieval (ISMIR 2025)
Pretrained Model: Download (Google Drive)
Paper: arXiv PDF
Code: GitHub – MetaScore_Official (codebase branch)
Demo Page: Demo Page
Dataset: Dataset
Overview
MetaScore Distribution
Leveraging the LLM-enhanced MetaScore dataset, our proposed MetaScore Transformer (MST) model generates symbolic music from natural language prompts, with difficulty, genre, instrument, and composer controls. Because the outputs are symbolic, the user can further edit and complete the composition.
We collect 963K songs paired with musical scores and metadata from the MuseScore forum.
- MetaScore-Raw (963K): The raw MuseScore files and metadata scraped from the MuseScore forum, together with the corresponding MusicXML files for future research.
- MetaScore-Genre (181K): A subset of MetaScore-Raw containing files with user-annotated genres. Additionally, we discard any songs by a composer who has fewer than 100 compositions in MetaScore-Raw. We also provide LLM-generated captions based on information extracted from the metadata in MetaScore-Genre.
- MetaScore-Plus (963K): MetaScore-Raw with missing genre tags completed by the trained genre tagger. We also provide LLM-generated captions based on information extracted from the metadata in MetaScore-Plus.
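The MetaScore-Genre filtering described above can be sketched as follows. This is a minimal illustration, not the authors' actual pipeline: the record layout and the field names `composer` and `genre` are assumptions, and only the two criteria stated in the text (a user-annotated genre, and at least 100 compositions per composer counted over the raw collection) are applied.

```python
from collections import Counter

MIN_COMPOSITIONS = 100  # threshold stated in the dataset description


def select_genre_subset(records, min_compositions=MIN_COMPOSITIONS):
    """Keep records with a genre tag whose composer is frequent enough.

    `records` is assumed to be a list of dicts with hypothetical
    'composer' and 'genre' keys; the real metadata schema may differ.
    """
    # Count compositions per composer over the full raw collection,
    # not just over the genre-annotated part.
    composer_counts = Counter(
        r["composer"] for r in records if r.get("composer")
    )
    return [
        r
        for r in records
        if r.get("genre")
        and composer_counts.get(r.get("composer"), 0) >= min_compositions
    ]


if __name__ == "__main__":
    # Toy data with a lowered threshold, purely for illustration.
    toy = [
        {"composer": "A", "genre": "classical"},
        {"composer": "A", "genre": "classical"},
        {"composer": "A", "genre": None},
        {"composer": "B", "genre": "pop"},
    ]
    subset = select_genre_subset(toy, min_compositions=3)
    print(len(subset), "of", len(toy), "records kept")
```

Counting composers over the whole raw collection follows the wording "compositions in MetaScore-Raw"; counting only genre-annotated files would be a different, stricter filter.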
Due to copyright concerns, we will publicly release only the music scores and metadata from MetaScore-Plus that are in the public domain (228K) or licensed under a Creative Commons license (46K). The rest of the dataset will be provided upon request for research purposes.
Weihan Xu: [email protected]