Symbols Contexts
Starting with the annotated score images, we extracted the so-called "full-context" sub-image for each symbol.
As input, we had a score image properly scaled and a list of symbol information (shape name and bounding box within the score image).
For an interline value normalized at 10 pixels, we used a fixed rectangular window (width: 48 pixels, height: 96 pixels) centered on each symbol center.
Note: These numbers were chosen rather arbitrarily and could be modified if so desired. However, the context dimensions must be multiples of 4, to accommodate the two sub-sampling layers used by the current convolutional neural network.
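The window extraction described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the function name `extract_context`, the `(x, y, w, h)` box layout, the list-of-rows image representation, and the choice to fill out-of-image pixels with a white background value are all assumptions.

```python
CONTEXT_WIDTH = 48   # pixels, for an interline normalized at 10 pixels
CONTEXT_HEIGHT = 96  # pixels


def extract_context(image, box, width=CONTEXT_WIDTH, height=CONTEXT_HEIGHT,
                    background=255):
    """Return the fixed-size window centered on the center of `box`.

    `image` is a list of pixel rows (hypothetical representation);
    `box` is (x, y, w, h) in image coordinates.
    Pixels falling outside the image are filled with `background`
    (an assumption, the source does not say how borders are handled).
    """
    if width % 4 or height % 4:
        raise ValueError("context dimensions must be multiples of 4")
    x, y, w, h = box
    cx, cy = x + w // 2, y + h // 2                  # symbol center
    left, top = cx - width // 2, cy - height // 2    # window origin
    rows, cols = len(image), len(image[0])
    window = []
    for r in range(top, top + height):
        line = []
        for c in range(left, left + width):
            inside = 0 <= r < rows and 0 <= c < cols
            line.append(image[r][c] if inside else background)
        window.append(line)
    return window
```

Every window has the same dimensions regardless of the symbol's own bounding box, which is what lets the symbols be fed to a network with a fixed input size.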
Via the program Features, we produced a .csv file to be used later for training the CNN.
In this file and for each symbol, we simply added a row that contained:
- The pixel values of the context window, row by row
- The index of the symbol name in the ordered list of symbol names
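As an illustration of this row layout, here is a small sketch. It is not the Features program itself: the helper names `context_row` and `write_features`, the list-of-rows window representation, and the absence of a header line are assumptions.

```python
import csv


def context_row(window, shape_name, shape_names):
    """Build one row: pixel values row by row, then the shape index.

    `window` is a list of pixel rows; `shape_names` is the ordered
    list of symbol names (both hypothetical representations).
    """
    pixels = [value for line in window for value in line]
    return pixels + [shape_names.index(shape_name)]


def write_features(out, samples, shape_names):
    """Write one CSV row per (window, shape_name) sample to `out`."""
    writer = csv.writer(out)
    for window, shape_name in samples:
        writer.writerow(context_row(window, shape_name, shape_names))
```

For a 48 x 96 window, each row would hold 4608 pixel values followed by one index, so the label is always the last column.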
[This task was not done during the hack]
To fully train a classifier, the representative training set of symbols should contain both valid symbols and non-valid symbols. The latter are named None-shape symbols.
By construction, MuseScore provides only valid symbols, so we need to generate "artificial" None symbols.
To do so, we use the valid symbols in a page to compute a population of "occupied" rectangles.
In the remaining areas we try at random to insert artificial rectangles of a predefined size.
Each rectangle successfully inserted gives rise to an artificial None-shape symbol, whose bounding box is reduced to a point.
Typically, we try to insert as many None symbols as there are valid symbols in the page at hand.
Note: This insertion algorithm requires that all valid symbols be described in the Annotations XML file. This is one more reason to have a single Annotations file, even when multiple staff sizes are present in the same page.
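The insertion procedure could look roughly like the sketch below. Several details are assumptions that the text does not specify: the function names, the `(x, y, w, h)` box layout, the retry budget `max_tries_per_symbol`, the choice to treat each inserted rectangle as occupied for later insertions, and the use of the rectangle center as the resulting point.

```python
import random


def intersects(a, b):
    """True if rectangles a and b, given as (x, y, w, h), overlap."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah


def generate_none_symbols(page_size, valid_boxes, rect_size,
                          max_tries_per_symbol=10, rng=None):
    """Try to insert as many None symbols as there are valid symbols.

    Returns a list of point-like boxes (x, y, 0, 0). The list may be
    shorter than `valid_boxes` if the page is too crowded.
    """
    rng = rng or random.Random()
    page_w, page_h = page_size
    rect_w, rect_h = rect_size
    occupied = list(valid_boxes)      # population of occupied rectangles
    target = len(valid_boxes)
    none_symbols = []
    for _ in range(target * max_tries_per_symbol):
        if len(none_symbols) >= target:
            break
        x = rng.randint(0, page_w - rect_w)
        y = rng.randint(0, page_h - rect_h)
        candidate = (x, y, rect_w, rect_h)
        if any(intersects(candidate, box) for box in occupied):
            continue                  # overlaps something, try again
        occupied.append(candidate)    # assumption: None areas do not overlap
        # Bounding box reduced to a point (assumed: the rectangle center).
        none_symbols.append((x + rect_w // 2, y + rect_h // 2, 0, 0))
    return none_symbols
```

Because candidates are rejected only against the known occupied rectangles, any valid symbol missing from the annotations could end up labeled as None, which is why the complete annotation list matters.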
To check the features material, we wrote a simple program (Subimages) that reads this .csv file to generate the corresponding symbol sub-images.
Note: These images are not needed per se for training the classifier, but are very useful for a visual inspection of the full-context sub-images.
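A reader for such a file can be sketched as below. This is not the Subimages program: the function names, the assumed file layout (no header, pixel values followed by the shape index in the last column), and the choice of plain-text PGM as an output format that needs no imaging library are all assumptions.

```python
import csv


def read_subimages(lines, width=48, height=96):
    """Yield (shape_index, rows) for each CSV record.

    `lines` is any iterable of text lines, such as an open file.
    """
    for record in csv.reader(lines):
        if not record:
            continue                              # skip blank lines
        *pixels, shape_index = record
        if len(pixels) != width * height:
            raise ValueError("unexpected number of pixel values")
        values = [int(float(p)) for p in pixels]
        rows = [values[r * width:(r + 1) * width] for r in range(height)]
        yield int(float(shape_index)), rows


def to_pgm(rows, max_value=255):
    """Encode pixel rows as a plain-text PGM (P2) image string."""
    header = f"P2\n{len(rows[0])} {len(rows)}\n{max_value}\n"
    body = "\n".join(" ".join(str(v) for v in row) for row in rows)
    return header + body + "\n"
```

Writing each `to_pgm` result to its own file, for example one named after the row number and shape index, would give images that common viewers can open for a quick visual check.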