Model Creation

Basic concepts

Training vs Cross-validation data

  • A model needs both training and cross-validation datasets, and both need to contain a good number of positive and negative examples.
  • Training data is what the model actually learns from.
  • Cross-validation data is used to evaluate different model iterations for performance.
  • A good analogy to use is that of a number of students (model iterations) all studying from the same study guide (training data) and then taking the same exam (cross-validation data).

Overview for making a brand-new model

  • Create and audit a mothership
  • Populate additional audited figs if available
  • Split into training and cross-validation data
  • Train model
  • Evaluate model performance

Mothership Creation and Auditing

  • A "mothership" is a sample of the soundscape(s) you are building a model from, ideally containing both positive examples of the sound you are attempting to detect and negative examples of the other sounds present.
  • Our convention is to make motherships roughly 10,000 events long. This gives a good representation of negatives but often does not provide enough positives to build even a first-pass model; when that is the case, we add more positives and negatives from other sources.
  • To make a mothership, put together a list of all the file streams you want to draw the mothership from. This is usually done by writing the output of findExts() to a text file.
  • Once you have that list of file streams, sample from them: info = subsampleWavs('[path_to_text_file]', [destination_directory_for_sounds], [what_fraction_of_sounds_to_use], 'uniform'); (see the sketch at the end of this list).
  • To calculate the fraction of sounds to use, there is a spreadsheet: Analysis/Work_Flow_and_File_Protocol/Subsampling_calculator.xlsx. You will need to feed it the total file size of the audio you plan on subsampling.
  • Once you have subsampled your audio, make it into a fig with auditor().
  • Then audit your fig. It can be helpful to use some of the feature filtering available in CMIAuditor to group your sounds and speed up auditing somewhat. It is also a good idea to be very clear in your model creation script about what ratings you are using and what they mean, especially if you are using ratings other than '5' and '0'.
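
The following is a minimal sketch of the subsampling step above, using only the subsampleWavs() call pattern shown in this list. The paths, species code, and the 0.05 fraction are placeholders (the fraction should come from the Subsampling_calculator.xlsx spreadsheet), and the findExts() and auditor() argument lists are not shown on this page, so check their help text before running.

```matlab
% Placeholders: replace with your own paths and the fraction from the
% subsampling calculator spreadsheet.
fileList = 'D:\CMI\LAAL\LAAL_all_wavs.txt';        % text file written from findExts() output
soundDir = 'D:\CMI\LAAL\LAAL_mothership_sounds';   % destination directory for subsampled sounds
if ~exist(soundDir, 'dir'), mkdir(soundDir); end

% Subsample uniformly across the listed file streams (call pattern as above)
info = subsampleWavs(fileList, soundDir, 0.05, 'uniform');

% Make the subsampled sounds into a fig and audit it in the auditor
% (argument list depends on your setup; see help auditor)
auditor();
```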

Infusing additional examples

  • If you have any additional examples (for example, from exploratory audits of your data), this is the time to use them.
  • First, establish some useful variables: the model version (which should be "V1" for your first iteration of the model), Species (the four-letter code for your species), and modelDate (the date on which you start making the model, since creating a model can be a multi-day affair).
  • Then create a set of directories for this model iteration's data: a RootDir, an XvalDir for your cross-validation data, and a TrainingDir for your training data.
  • You can find existing audited figs by using the findMatch() function to search for an audit tag, e.g. findMatch([directory], 'explore_SPECIES.fig').
  • If you are making the first version of a model, do not run the code marked with %% If updating an existing model.
  • You will need to create a table of the number of events with each label in each fig you plan on adding to the training and xval datasets.
  • Then split the rated events from each fig into training and xval. This uses the function auditorSplit([fig to split from], [first fig to split to (usually training)], [second fig to split to (usually xval)], [label 1], [how to split label 1], [label 2], [how to split label 2], ...); see the sketch at the end of this list.
  • When splitting, you can give either absolute counts or fractions: an auditorSplit line run on a fig with 100 '5' ratings would produce the same output with either '5', [.7 .3] or '5', [70 30].
  • If you ask auditorSplit to do something that is not possible (e.g. splitting one '5' into two different output figs, or splitting a rating that is not actually present), it will throw an error. This is why a table of label counts for all the figs you split is useful.
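
Below is a minimal sketch of the steps in this list, assuming only the findMatch() and auditorSplit() call patterns shown above. The species code, date, directory layout, and 70/30 split are placeholders, not a prescribed convention.

```matlab
% Placeholder variables for this model iteration
Species      = 'LAAL';        % four-letter species code
modelVersion = 'V1';          % first iteration of the model
modelDate    = '20210203';    % date you started building the model

% Directories for this iteration's data (layout is an example only)
RootDir     = fullfile('D:\CMI\Models', Species, [modelVersion '_' modelDate]);
TrainingDir = fullfile(RootDir, 'Training');
XvalDir     = fullfile(RootDir, 'Xval');
if ~exist(TrainingDir, 'dir'), mkdir(TrainingDir); end
if ~exist(XvalDir, 'dir'),     mkdir(XvalDir);     end

% Locate previously audited figs by their audit tag (call pattern as above;
% the exact return format of findMatch() is not shown on this page)
exploreFigs = findMatch('D:\CMI\LAAL\ExploratoryAudits', ['explore_' Species '.fig']);

% Split the rated events in one audited fig 70/30 into training and xval.
% '5', [.7 .3] and '5', [70 30] are equivalent on a fig with 100 '5' ratings.
auditorSplit('D:\CMI\LAAL\ExploratoryAudits\explore_LAAL.fig', ...
    fullfile(TrainingDir, 'explore_LAAL_training.fig'), ...
    fullfile(XvalDir,     'explore_LAAL_xval.fig'), ...
    '5', [.7 .3], ...
    '0', [.7 .3]);
```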

Creating Training and Xval data

  • First, establish the fig name and file path where your data will be saved: the outFigPath... and outFigName... variables (e.g. outFigPathXval and outFigNameXval for the cross-validation set).
  • Then create a directory to hold all the sound clips that will be used in your training or xval dataset: the outMothershipSounds... variable.
  • Next, take all the 2-second windows identified earlier with auditorSplit() and spit those sounds out into your new sounds directory with fileStreamFromFigs(); see the sketch at the end of this list.
  • If you want to change ratings, or refine your audited data, you can then open up your training and cross-validation data with auditor(). auditorMapRatings([old rating], [new rating]) can sometimes be helpful here if you want to change ratings in large batches.
  • Once you have created and modified your training data, do the same for cross-validation data.
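
Here is a minimal sketch of the naming conventions above for the training set (the xval set is handled the same way). The variable names follow the outFig* pattern described in this list; the fileStreamFromFigs() argument list is not shown on this page, so it is left as a commented placeholder, and the auditorMapRatings() call simply follows the [old rating], [new rating] pattern given above.

```matlab
% Placeholders continuing the Species / modelVersion / TrainingDir examples above
Species      = 'LAAL';
modelVersion = 'V1';
TrainingDir  = 'D:\CMI\Models\LAAL\V1_20210203\Training';

% Fig name, fig path, and sounds directory for the training dataset
outFigPathTraining          = TrainingDir;
outFigNameTraining          = ['Training_' Species '_' modelVersion '.fig'];
outMothershipSoundsTraining = fullfile(TrainingDir, 'Sounds');
if ~exist(outMothershipSoundsTraining, 'dir'), mkdir(outMothershipSoundsTraining); end

% Spit out the 2-second windows selected with auditorSplit() into the new
% sounds directory. The exact argument list of fileStreamFromFigs() is not
% documented here; check its help text before running.
% fileStreamFromFigs( ... );

% Optional: remap ratings in large batches, e.g. fold '4' ratings into '5'
% (call pattern as described above; open the fig with auditor() first if needed)
auditorMapRatings('4', '5');
```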

Training a model

  • First, load the cross-validation data: data.test(1) = collectFigFeatures(outFigPathXval, {outFigNameXval}, {'auto'}, {labelPos, labelNeg}). You can modify this line in a couple of ways, for example replacing {'auto'} with {'auto' [500 1000]} to have your model search only in the 500-1000 Hz range. (See the sketch at the end of this list.)
  • Then, especially if you are training V2 or later of a model, clear out any preexisting training data and tree; this is done in the two if isfield(... statements.
  • Next, create the directory in which the actual model file will be stored: modelDir.
  • Define param.nPosThPct, the percentage of positive events you want your model to detect; this effectively sets your desired model sensitivity. It can sometimes be useful to make multiple versions of a model from the same training and xval data, varying only this parameter.
  • Then you train your model with treeeBasedDNNSearch. If you placed a frequency restriction in your cross-validation data, you need to use the same frequency restriction here.
  • Once you have your model trained, save it to the modelDir you established earlier.
  • If this is the first version of your model, you should now add information about it to the spreadsheet located in D:\CM,Inc\Dropbox (CMI)\CMI_Team\Analysis\Work_Flow_and_File_Protocol\Audit settings by species.xlsx, in the Models in Progress tab.
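
As a minimal sketch of the training steps above: the collectFigFeatures() call follows the pattern shown in this list, but the labels, paths, frequency band, and nPosThPct value are placeholders, and the treeeBasedDNNSearch argument list and the exact contents of the two if isfield(...) blocks are not documented here, so they appear only as commented placeholders.

```matlab
% Placeholder labels and paths; use the ratings and directories from your own script
labelPos       = '5';
labelNeg       = '0';
outFigPathXval = 'D:\CMI\Models\LAAL\V1_20210203\Xval';
outFigNameXval = 'Xval_LAAL_V1.fig';

% Load the cross-validation data; {'auto' [500 1000]} restricts the search to
% 500-1000 Hz (use plain {'auto'} for no frequency restriction)
data.test(1) = collectFigFeatures(outFigPathXval, {outFigNameXval}, ...
    {'auto' [500 1000]}, {labelPos, labelNeg});

% If updating an existing model, clear preexisting training data and tree here
% (the two "if isfield(..." blocks from the model creation script)

% Directory where the trained model will be stored
modelDir = 'D:\CMI\Models\LAAL\V1_20210203\Model';
if ~exist(modelDir, 'dir'), mkdir(modelDir); end

% Desired sensitivity: percentage of positive events the model should detect
param.nPosThPct = 90;   % placeholder value

% Train the model with treeeBasedDNNSearch, using the same frequency
% restriction as above; its argument list is not shown on this page.
% model = treeeBasedDNNSearch( ... );

% Save the trained model into modelDir ('model' is a placeholder variable name)
% save(fullfile(modelDir, 'LAAL_V1.mat'), 'model');
```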

Evaluate model performance

  • There are two ways to do this, best used in concert: a test dataset and ROC curves.
  • A "test" dataset is a separate control dataset that samples from every file stream, much like a mothership. Ideally it contains a good number of positive and negative examples, at a range of model probabilities, that were not used to train or cross-validate the model; about 25,000 events is a good ballpark number. Every event has to be audited, which can be time-consuming. Once every event has been audited, the dataset can be exported to R and filestream-specific model sensitivity and accuracy can be determined. It is important that you not use positive and negative examples from the test dataset to improve your model, because doing so would prevent the test dataset from serving as a fair model comparison tool.
  • Receiver operating characteristic ("ROC") curves are a useful tool for comparing the performance of multiple models on a single file stream. They show a number of plots comparing model probability, sensitivity, accuracy, and total number of events. If we number the four plots 1, 2, 3, and 4 going clockwise, a perfect model (all positives at probability = 1, all negatives at probability = 0) would have its curve at the top right corner of plots 1 and 2, at the top left corner of plot 3, and at the bottom right corner of plot 4. No model is perfect, but this should provide some guidance for evaluating a model using these plots.
  • To make an ROC curve, use the dnnSenstivityPlot() function. Its arguments are: 1) a list of model output .mat files, contained in {curly braces}; 2) a .fig file corresponding to those model probability files; and 3) a positive rating, usually '5'. See the sketch below.
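
A minimal sketch of that call, with the arguments in the order described above; the file names are placeholders for your own model output files and audited test fig.

```matlab
% Compare two model versions on the same audited test fig (paths are placeholders)
modelOutputs = {'D:\CMI\LAAL\LAAL_V1_output.mat', 'D:\CMI\LAAL\LAAL_V2_output.mat'};
testFig      = 'D:\CMI\LAAL\LAAL_test_dataset.fig';

% Positive rating is '5', matching the audit convention used above
dnnSenstivityPlot(modelOutputs, testFig, '5');
```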