Reference annotation datasets containing a single harmony annotation per song are at the core of a wide range of studies in Music Information Retrieval and related fields. However, many properties of music are subjective, and the annotator subjectivity revealed by multiple reference annotations is usually not taken into account.
Currently available chord-label annotation datasets containing more than one reference annotation are limited in size, constrained by their sampling strategy, or lack a standardized encoding.
Therefore, to advance research into annotator subjectivity and computational harmony (such as Automatic Chord Estimation), we release the Chordify Annotator Subjectivity Dataset (CASD), containing multiple expert reference annotations.
This repository releases the Chordify Annotator Subjectivity Dataset (CASD), containing reference annotations for:
- Fifty songs from the Billboard dataset [1] that
  - have a stable on-line presence in widely accessible music repositories
  - can be compared against the Billboard annotations
- Each song is annotated by four expert annotators
- The annotations are encoded in JAMS format [2]
- Chord labels are encoded in standard Harte et al. syntax [3] (see the parsing sketch below this list)
- Annotations include reported difficulty (on a 5-point Likert scale, where 1 is easy and 5 is hard) and annotation time (in minutes) for each annotator
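Because the chord labels follow the Harte et al. syntax, they can be parsed with standard MIR tooling. The snippet below is a minimal sketch, assuming the mir_eval Python package is installed; the label used is illustrative and not taken from the dataset.

```python
# Minimal sketch: parsing a Harte-style chord label with mir_eval (assumed installed).
import mir_eval

label = 'A:min7/b3'  # illustrative label in Harte et al. syntax

# Split into root, quality, added/omitted scale degrees, and bass degree,
# roughly: 'A', 'min7', set(), 'b3'.
root, quality, degrees, bass = mir_eval.chord.split(label)
print(root, quality, degrees, bass)

# Encode as pitch classes: a root number (0-11), a 12-dimensional bitmap of
# chord tones relative to the root, and the bass interval in semitones.
root_number, bitmap, bass_number = mir_eval.chord.encode(label)
print(root_number, bitmap, bass_number)
```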
Install the JAMS Python module (e.g., via pip) to read the annotations. To work with the annotations, load an annotation file:
import jams
jam = jams.load('12.jams')
To access the annotations from the first annotator:
jam['annotations'][0]['data']
For further details on how to manipulate and work with JAMS files, we refer to the JAMS documentation.
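As a slightly fuller sketch, the snippet below loads one file and prints the labelled segments of every annotator. It assumes the layout described above: one chord annotation per annotator, with each observation carrying a start time, a duration, and a Harte chord label as its value.

```python
# Sketch: print every annotator's chord segments from one CASD file.
# Assumes one chord annotation per annotator, as described above.
import jams

jam = jams.load('12.jams')

for i, ann in enumerate(jam.annotations):
    print('Annotator %d (namespace: %s)' % (i, ann.namespace))
    for obs in ann.data:
        # obs.time and obs.duration are in seconds; obs.value is the chord label.
        print('%8.3f %8.3f  %s' % (obs.time, obs.duration, obs.value))
```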
We find that within the CASD, annotators disagree about chord labels. The figure below gives an intuitive impression of this disagreement.
This figure shows the chord chromagrams of the four annotators for song 92 in the dataset. The horizontal axis represents time; the vertical axis represents the 12 pitch classes of a single octave. The figure shows that the annotators differ both in their temporal level of detail and in the pitch classes they assign per chord. This figure was generated with this script.
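The figure itself comes from the linked script. Purely as an illustration of the idea, the sketch below (assuming the jams, mir_eval, numpy, and matplotlib packages, and assuming the file for song 92 is named 92.jams) maps each annotator's labels to absolute pitch classes and draws them on a shared time axis.

```python
# Rough sketch (not the linked script): draw a chord 'chromagram' per annotator.
import jams
import numpy as np
import mir_eval
import matplotlib.pyplot as plt

jam = jams.load('92.jams')  # assumed filename for song 92
fig, axes = plt.subplots(len(jam.annotations), 1, sharex=True)

for ax, ann in zip(np.atleast_1d(axes), jam.annotations):
    # (start, end) times in seconds plus the corresponding Harte labels.
    intervals, labels = ann.to_interval_values()
    # encode_many gives roots and root-relative chord-tone bitmaps;
    # rotating them to the root yields absolute pitch classes (0-11).
    roots, bitmaps, _ = mir_eval.chord.encode_many(labels)
    chroma = mir_eval.chord.rotate_bitmaps_to_roots(bitmaps, roots)
    for (start, end), tones in zip(intervals, chroma):
        for pc in np.flatnonzero(tones):
            ax.plot([start, end], [pc, pc], linewidth=3)
    ax.set_ylabel('pitch class')

ax.set_xlabel('time (s)')
plt.show()
```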
If you are interested in a detailed analysis of the annotator subjectivity found in the CASD, please refer to our publication in the Journal of New Music Research:
Hendrik Vincent Koops, W. Bas de Haas, John Ashley Burgoyne, Jeroen Bransen, Anna Kent-Muller & Anja Volk (2019) Annotator subjectivity in harmony annotations of popular music, Journal of New Music Research, 48:3, 232-252, DOI: 10.1080/09298215.2019.1613436
@article{doi:10.1080/09298215.2019.1613436,
author = {Hendrik Vincent Koops and W. Bas de Haas and John Ashley Burgoyne and Jeroen Bransen and Anna Kent-Muller and Anja Volk},
title = {Annotator subjectivity in harmony annotations of popular music},
journal = {Journal of New Music Research},
volume = {48},
number = {3},
pages = {232-252},
year = {2019},
publisher = {Routledge},
doi = {10.1080/09298215.2019.1613436},
URL = {https://doi.org/10.1080/09298215.2019.1613436},
eprint = {https://doi.org/10.1080/09298215.2019.1613436}
}
Please cite this publication if you use the CASD in your research.
By way of this repository and JAMS, we encourage the Music Information Retrieval community to exchange, update, and expand the dataset.
We are more than happy to add your annotations to this dataset. If you are interested in contributing, please keep in mind how these annotations were obtained (see Data collection method below). Using the same data collection method keeps all annotations in the dataset uniform and comparable.
To contribute, submit a pull request. Please send us an email if you have questions about our code of conduct, or if the process for submitting pull requests is unclear.
To ensure the annotators were all focused on the same task, we provided them with a guideline for the annotation process. We asked them to listen to the songs as if they wanted to play the song on their instrument in a band, and to transcribe the chords with this purpose in mind. They were instructed to assume that the band would have a rhythm section (drums and bass) and a melody instrument (e.g., a singer). Therefore, their goal was to transcribe the complete harmony of the song in a way that, in their view, best matched their instrument.
We used a web interface to provide the annotators with a central, unified transcription method. This interface provided the annotators with a grid of beat-aligned elements, which we manually verified for correctness. Chord labels could be chosen for each beat. The standard YouTube web player was used to provide the reference recording of the song. Through the interface, the annotators were free to select any chord of their choice for each beat. While transcribing, the annotators could not only watch and listen to the YouTube video of the song, but also listen to a synthesized version of their chord transcription.
In addition to providing chords and information about their musical background, we asked the annotators to provide for each song a difficulty rating on a scale of 1 (easy) to 5 (hard), the amount of time it took them to annotate the song in minutes, and any remarks they might have on the transcription process.
The Chordify Annotator Subjectivity Dataset was introduced at the late-breaking session of the 18th International Society for Music Information Retrieval Conference (ISMIR). For more information about the CASD and annotator subjectivity in this dataset, please find the poster and extended abstract below.
In a paper published in the Journal of New Music Research (cited above), we provide background information and a statistical analysis of annotator subjectivity in the CASD. The CASD was created by:
- Hendrik Vincent Koops - Utrecht University
- W. Bas de Haas - Chordify
- Jeroen Bransen - Chordify
- John Ashley Burgoyne - University of Amsterdam
- Anja Volk - Utrecht University
Questions can be addressed to [email protected].
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
We thank all annotators for contributing to the project.
[1] John Ashley Burgoyne, Jonathan Wild, and Ichiro Fujinaga. An Expert Ground Truth Set for Audio Chord Recognition and Music Analysis. In Proceedings of the 12th International Society for Music Information Retrieval Conference, pp. 633-638, 2011.
[2] Eric J. Humphrey, Justin Salamon, Oriol Nieto, Jon Forsyth, Rachel M. Bittner, and Juan Pablo Bello. JAMS: A JSON Annotated Music Specification for Reproducible MIR Research. In Proceedings of the 15th International Society for Music Information Retrieval Conference, pp. 591-596, 2014.
[3] Christopher Harte, Mark Sandler, Samer Abdallah, and Emilia Gómez. Symbolic Representation of Musical Chords: A Proposed Syntax for Text Annotations. In Proceedings of the 6th International Society for Music Information Retrieval Conference, pp. 66-71, 2005.