
CS224N Final Project


Music artists have composed pieces that are both creative and precise. For example, classical music is well known for its meticulous structure and emotional effect. Recurrent Neural Networks (RNNs) are powerful models that have achieved excellent performance on difficult learning tasks with temporal dependencies. We propose generative RNN models that create sheet music with well-formed structure that follows stylistic conventions, without hard-coding music composition rules into the models.

Related Deliverables


Youtube Survey Videos

Song Samples


  • - crawls the Internet for .mid files.
    • Flags: -u: url, -f: output folder name, -d: crawl depth, -r: crawl regEx rules
  • - utility script for midi preprocessing.
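As a rough illustration of how the crawler flags listed above could be parsed, here is a minimal `argparse` sketch. It is not the project's actual script: the function name `build_parser`, the `dest` names, the defaults, and the example URL are all assumptions made for the example. Only the four flags (`-u`, `-f`, `-d`, `-r`) come from the list above.

```python
import argparse


def build_parser():
    """Hypothetical parser exposing the four crawler flags from the README."""
    parser = argparse.ArgumentParser(
        description="Crawl the web for .mid files (illustrative sketch)."
    )
    parser.add_argument("-u", dest="url", required=True, help="start URL")
    # Defaults below are assumptions, not documented behaviour.
    parser.add_argument("-f", dest="folder", default="output",
                        help="output folder name")
    parser.add_argument("-d", dest="depth", type=int, default=1,
                        help="crawl depth")
    parser.add_argument("-r", dest="regex", default=None,
                        help="regex rule for links to follow")
    return parser


# Parse an example command line instead of sys.argv so the sketch runs as-is.
args = build_parser().parse_args(
    ["-u", "http://example.com", "-f", "midi", "-d", "2", "-r", r".*\.mid$"]
)
print(args.url, args.folder, args.depth, args.regex)
```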

Useful Websites

Example .abc format



Data Encoding Structure

The numpy array representing each sample is composed of two parts: the metadata and the song.

The first 7 integers in the numpy array are the metadata. They are, in order: song type (R), time signature (M), note unit size (L), number of flats (K), song mode (K), length, complexity.

Length is calculated by counting the number of times the character '|' appears in a file. Complexity is calculated as (number of notes in the song) x 100 / (length x number of beats in a measure). In other words, the complexity measure estimates how busy a song is.
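The two derived fields can be sketched as follows. This is a hedged example, not the preprocessing script itself: the helper names are hypothetical, and it assumes that a "note" is any pitch letter `A`-`G` or `a`-`g` in the tune body, that the number of beats in a measure is the numerator of the `M` field, and that the result is truncated to an integer.

```python
import re


def song_length(abc_body):
    """Count the bar-line characters '|' in the tune body."""
    return abc_body.count("|")


def song_complexity(abc_body, meter):
    """notes * 100 / (length * beats per measure), truncated to an int."""
    # Assumption: every pitch letter counts as one note.
    notes = len(re.findall(r"[A-Ga-g]", abc_body))
    # Assumption: beats per measure is the numerator of the meter, e.g. 4 in "4/4".
    beats = int(meter.split("/")[0])
    length = song_length(abc_body)
    if length == 0 or beats == 0:
        return 0  # avoid dividing by zero on empty or malformed input
    return notes * 100 // (length * beats)


body = "GABc dedB|dedB dedB|c2ec B2dB|A2A2 G4|"
print(song_length(body))             # 4 bar lines
print(song_complexity(body, "4/4"))  # 25 notes * 100 // (4 * 4) = 156
```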

| Field | Description | .abc Tag | Dimensions | Examples (Top 3) |
| --- | --- | --- | --- | --- |
| Song Type | Song genre | R | 16 | Reel, Jig, Hornpipe |
| Time Signature | Specifies how many beats are in each bar and which note value gets one beat | M | 15 | 4/4, 6/8, 3/4 |
| Note Unit Size | Specifies which note value gets one beat in the text file | L | 3 | 1/8, 1/4, 1/16 |
| Number of Flats | Positive for songs with flats, 0 for neutral, negative for songs with sharps | K | 12 | -1, -2, -3 |
| Song Mode | 0=Major, 1=Minor, 2=Mixolydian, 3=Dorian, 4=Phrygian, 5=Lydian, 6=Locrian | K | 6 | 0, 1, 3 |
| Song Length | Number of measures in a song | | | |
| Song Complexity | Busy-ness of a song | | | |

The song portion of the numpy array is 82 dimensions (i.e. 80 music characters and 2 BEGIN/END special characters).
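To make the two-part layout concrete, the sketch below builds one sample as a flat list of integers: 7 metadata values followed by the encoded song. It is an illustration under stated assumptions rather than the project's code. The function `encode_sample` is hypothetical, the maps are small subsets of the ones listed in this README, the BEGIN/END ids (80 and 81) are assumed to follow the 80 music characters, and length and complexity are assumed to be stored as raw integers. The resulting list can be wrapped with `numpy.array` to obtain the array form.

```python
# Subsets of the maps stored in vocab_map_meta.p and vocab_map_music.p.
VOCAB_META = {
    "R": {"jig": 0, "hornpipe": 11, "reel": 13},
    "M": {"6/8": 3, "4/4": 5, "3/4": 9},
    "L": {"1/4": 0, "1/16": 1, "1/8": 2},
    "K_key": {"0": 3, "-1": 6, "-2": 9},
    "K_mode": {"1": 0, "0": 1, "3": 2},
}
VOCAB_MUSIC = {" ": 1, "2": 16, "4": 18, "A": 27, "B": 29, "G": 32,
               "c": 53, "e": 55, "d": 56, "|": 78}
BEGIN_ID, END_ID = 80, 81  # assumption: ids of the two special characters


def encode_sample(song_type, meter, unit, flats, mode, length, complexity, body):
    """Return metadata (7 ints) followed by BEGIN + song characters + END."""
    metadata = [
        VOCAB_META["R"][song_type],
        VOCAB_META["M"][meter],
        VOCAB_META["L"][unit],
        VOCAB_META["K_key"][str(flats)],  # number of flats
        VOCAB_META["K_mode"][str(mode)],  # song mode
        length,
        complexity,
    ]
    song = [BEGIN_ID] + [VOCAB_MUSIC[ch] for ch in body] + [END_ID]
    return metadata + song


sample = encode_sample("reel", "4/4", "1/8", -1, 0, 1, 200, "GABc dedB|")
print(sample[:7])  # metadata part
print(sample[7:])  # song part
```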

Metadata and Music Encoding Map

>>> import pickle
>>> pickle.load(open('vocab_map_meta.p', 'rb'))
{'R': {'jig': 0, 'waltz': 1, 'three-two': 2, 'songair': 3, 'slowair': 4, 'strathspey': 5, 
	'polka': 6, 'air': 7, 'barndance': 8, 'slide': 9, 'slipjig': 10, 'hornpipe': 11, 
	'mazurka': 12, 'reel': 13, 'highlandfling': 14, 'quickstep': 15}, 
'M': {'7/8': 1, '11/8': 2, '5/4': 0, '6/8': 3, '5/8': 4, '4/4': 5, '6/4': 6, '13/8': 7, 
	'3/2': 8, '3/4': 9, '9/8': 10, '12/8': 11, '2/2': 12, '9/4': 13, '2/4': 14}, 
'L': {'1/4': 0, '1/16': 1, '1/8': 2}, 
'K_key': {'-5': 0, '-4': 1, '1': 2, '0': 3, '3': 4, '-6': 5, '-1': 6, '4': 7, '-3': 8, 
	'-2': 9, '2': 10, '5': 11}, 
'K_mode': {'1': 0, '0': 1, '3': 2, '2': 3, '5': 4, '4': 5}}
>>> pickle.load(open('vocab_map_music.p', 'rb'))
{'!': 0, ' ': 1, '#': 2, "'": 3, '&': 4, ')': 5, '(': 6, '+': 7, '*': 8, '-': 9, ',': 10, 
'/': 11, '.': 12, '1': 13, '0': 14, '3': 15, '2': 16, '5': 17, '4': 18, '7': 19, '6': 20, 
'9': 21, '8': 22, ':': 23, '=': 24, '<': 25, '>': 26, 'A': 27, 'C': 28, 'B': 29, 'E': 30, 
'D': 31, 'G': 32, 'F': 33, 'H': 34, 'K': 35, 'J': 36, 'M': 37, 'L': 38, 'O': 39, 'Q': 40, 
'P': 41, 'S': 42, 'R': 43, 'U': 44, 'T': 45, 'V': 46, '[': 47, ']': 48, '\\': 49, '_': 50, 
'^': 51, 'a': 52, 'c': 53, 'b': 54, 'e': 55, 'd': 56, 'g': 57, 'f': 58, 'i': 59, 'h': 60, 
'j': 61, 'm': 62, 'l': 63, 'o': 64, 'n': 65, 'p': 66, 's': 67, 'r': 68, 'u': 69, 't': 70, 
'w': 71, 'v': 72, 'y': 73, 'x': 74, '{': 75, 'z': 76, '}': 77, '|': 78, '~': 79}
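Going the other way, generated ids can be turned back into .abc text by inverting the music map. The snippet below is a small sketch of that idea, using only a subset of the map above. The name `decode_song` is hypothetical, and treating ids 80 and 81 as the BEGIN/END characters is an assumption.

```python
# Subset of vocab_map_music.p, enough for the example below.
VOCAB_MUSIC = {"A": 27, "B": 29, "G": 32, "c": 53, "|": 78}

# Invert character -> id into id -> character.
ID_TO_CHAR = {idx: ch for ch, idx in VOCAB_MUSIC.items()}

SPECIAL_IDS = {80, 81}  # assumption: BEGIN and END ids, dropped when decoding


def decode_song(ids):
    """Map a sequence of ids back to an .abc string, skipping special ids."""
    return "".join(ID_TO_CHAR[i] for i in ids if i not in SPECIAL_IDS)


decoded = decode_song([80, 32, 27, 29, 53, 78, 81])
print(decoded)  # GABc|
```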

