[CLEANING] update readme, organise folders better
BenCretois committed May 18, 2023
1 parent afacedf commit a4b9923
Showing 24 changed files with 19,873 additions and 2,251 deletions.
54 changes: 54 additions & 0 deletions BEATs_on_ECS50/README.md
@@ -0,0 +1,54 @@
# BEATs fine-tuning pipeline :musical_note:

This GitHub repository is made for using [BEATs](https://arxiv.org/abs/2212.09058) on your own dataset and is still a work in progress. In its current form, the repository allows a user to:

- Fine-tune BEATs on the [ESC50 dataset](https://github.com/karolpiczak/ESC-50).
- Fine-tune a prototypical network with BEATs as feature extractor on the [ESC50 dataset](https://github.com/karolpiczak/ESC-50).

## Necessary downloads

- Download [BEATs_iter3+ (AS2M) model](https://msranlcmtteamdrive.blob.core.windows.net/share/BEATs/BEATs_iter3_plus_AS2M.pt?sv=2020-08-04&st=2022-12-18T10%3A40%3A53Z&se=3022-12-19T10%3A40%3A00Z&sr=b&sp=r&sig=SKBQMA7MRAMFv7Avyu8a4EkFOlkEhf8nF0Jc2wlYd%2B0%3D)
- Download [ESC50 dataset](https://github.com/karoldvl/ESC-50/archive/master.zip)
- Clone this repo: `git clone https://github.com/NINAnor/rare_species_detections.git`
- Build the docker image:

```bash
cd rare_species_detections
docker build -t beats -f Dockerfile .
```

**Make sure `ESC-50-master` and `BEATs/BEATs_iter3_plus_AS2M.pt` are stored in your `$DATAPATH` (data folder that is exposed to the Docker container)**

## Using the software: fine tuning

Provided that `ESC-50-master` and `BEATs/BEATs_iter3_plus_AS2M.pt` are stored in your `$DATAPATH`:

```bash
docker run -v $PWD:/app \
-v $DATAPATH:/data \
--gpus all `# if you have GPUs available` \
beats \
poetry run fine_tune/trainer.py fit --config /app/config.yaml
```

## Using the software: training a prototypical network

- Create a miniESC50 dataset in your `$DATAPATH`:

```bash
docker run -v $PWD:/app \
-v $DATAPATH:/data \
--gpus all `# if you have GPUs available` \
beats \
poetry run data_utils/miniESC50.py
```

- Train the prototypical network:

```bash
docker run -v $PWD:/app \
-v $DATAPATH:/data \
--gpus all \
beats \
poetry run prototypicalbeats/trainer.py fit --trainer.accelerator gpu --trainer.gpus 1 --data miniESC50DataModule
```
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
225 changes: 195 additions & 30 deletions README.md
@@ -1,54 +1,219 @@
# BEATs fine-tuning pipeline :musical_note:
# DCASE2023: FEW-SHOT BIOACOUSTIC EVENT DETECTION USING BEATS

This GitHub repository is made for using [BEATs](https://arxiv.org/abs/2212.09058) on your own dataset and is still a work in progress. In its current form, the repository allows a user to:
**Few-shot learning is a highly promising paradigm for sound event detection. It is also an extremely good fit to the needs of users in bioacoustics, in which increasingly large acoustic datasets commonly need to be labelled for events of an identified category** (e.g. species or call-type), even though this category might not be known in other datasets or have any yet-known label. While satisfying user needs, this will also benchmark few-shot learning for the wider domain of sound event detection (SED).

- Fine-tune BEATs on the [ESC50 dataset](https://github.com/karolpiczak/ESC-50).
- Fine-tune a prototypical network with BEATs as feature extractor on the [ESC50 dataset](https://github.com/karolpiczak/ESC-50).
<p align="center"><img src="images/VM.png" alt="figure" width="400"/></p>

## Necessary downloads
**Few-shot learning describes tasks in which an algorithm must make predictions given only a few instances of each class, contrary to the standard supervised learning paradigm.** The main objective is to find reliable algorithms that are capable of dealing with data sparsity, class imbalance and noisy/busy environments. Few-shot learning is usually studied using N-way-K-shot classification, where N denotes the number of classes and K the number of examples for each class.

- Download [BEATs_iter3+ (AS2M) model](https://msranlcmtteamdrive.blob.core.windows.net/share/BEATs/BEATs_iter3_plus_AS2M.pt?sv=2020-08-04&st=2022-12-18T10%3A40%3A53Z&se=3022-12-19T10%3A40%3A00Z&sr=b&sp=r&sig=SKBQMA7MRAMFv7Avyu8a4EkFOlkEhf8nF0Jc2wlYd%2B0%3D)
- Download [ESC50 dataset](https://github.com/karoldvl/ESC-50/archive/master.zip)
- Clone this repo: `git clone https://github.com/NINAnor/rare_species_detections.git`
- Build the docker image:
> Text in this section is borrowed from [c4dm/dcase-few-shot-bioacoustic](https://github.com/c4dm/dcase-few-shot-bioacoustic)
## Our contribution:

This repository is the result of our submission to the DCASE2023 challenge Task 5: *Few-shot Bioacoustic Event Detection*. It contains the necessary code to train a prototypical network with BEATs as feature extractor on the data given by the DCASE challenge.

We intend to keep this repository active to tackle future DCASE challenges. If you wish to help us improve it or collaborate with us, please do not hesitate to send us a message!

## Requirements

This section lists the requirements. Note that we make extensive use of [Docker](https://docs.docker.com/get-docker/) for easier reproducibility.

### Download

- The model: [BEATs_iter3+ (AS2M) model](https://msranlcmtteamdrive.blob.core.windows.net/share/BEATs/BEATs_iter3_plus_AS2M.pt?sv=2020-08-04&st=2022-12-18T10%3A40%3A53Z&se=3022-12-19T10%3A40%3A00Z&sr=b&sp=r&sig=SKBQMA7MRAMFv7Avyu8a4EkFOlkEhf8nF0Jc2wlYd%2B0%3D)
- The development dataset: [DCASE Development dataset](https://dcase.community/challenge2023/task-few-shot-bioacoustic-event-detection#development-set)
- The evaluation dataset: [DCASE Evaluation dataset](https://dcase.community/challenge2023/task-few-shot-bioacoustic-event-detection#evaluation-set)


Once the necessary files are downloaded, place them in a folder `data` so that the structure matches the one displayed below. Note that the folder structure is important, as some paths are hardcoded to facilitate reproducibility and speed up the workflow.

Note that the `Evaluation_Set` does not have exactly the same structure as the `Training_Set` / `Validation_Set`, so some manual rearrangement is needed.

```bash
.
|-BEATs
|-DCASE
|---Development_Set
|-----Evaluation_Set
|-------CHE
|-------CHE23
|-------CT
|-------CW
|-------DC
|-------MGE
|-------MS
|-------QU
|-----Training_Set
|-------BV
|-------HT
|-------JD
|-------MT
|-------WMW
|-----Validation_Set
|-------HB
|-------ME
|-------PB
|---Development_Set_annotations
|-----Evaluation_Set
|-------Annotations_only
|---------CHE
|---------CHE23
|---------CT
|---------CW
|---------DC
|---------MGE
|---------MS
|---------QU
|-------__MACOSX
|---------Annotations_only
|-----------CHE
|-----------CHE23
|-----------CT
|-----------CW
|-----------DC
|-----------MGE
|-----------MS
|-----------QU
|-----Training_Set
|-------BV
|-------HT
|-------JD
|-------MT
|-------WMW
|-----Validation_Set
|-------HB
|-------ME
|-------PB
```
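Since several paths are hardcoded, it can help to check the layout before building anything. The following is a minimal sketch, not part of the repository: the function name `missing_dirs` and the list `EXPECTED_DIRS` are hypothetical, and only the top-level folders from the tree above are checked.

```python
from pathlib import Path

# Top-level folders copied from the tree above (site sub-folders are not checked).
EXPECTED_DIRS = [
    "BEATs",
    "DCASE/Development_Set/Training_Set",
    "DCASE/Development_Set/Validation_Set",
    "DCASE/Development_Set/Evaluation_Set",
    "DCASE/Development_Set_annotations/Training_Set",
    "DCASE/Development_Set_annotations/Validation_Set",
    "DCASE/Development_Set_annotations/Evaluation_Set",
]


def missing_dirs(data_dir):
    """Return the expected sub-folders that do not exist under data_dir."""
    root = Path(data_dir)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling `missing_dirs("/home/data")` returns an empty list when every expected folder is present, and otherwise lists the folders that still need to be created or moved.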

### Setup

Once the necessary files have been downloaded, clone the repository:

```bash
cd ~
git clone https://github.com/NINAnor/rare_species_detections.git
```

It is now possible to build the Docker image:

```bash
cd rare_species_detections
docker build -t beats -f Dockerfile .
docker build -t dcase -f Dockerfile .
```

**Make sure `ESC-50-master` and `BEATs/BEATs_iter3_plus_AS2M.pt` are stored in your `$DATAPATH` (data folder that is exposed to the Docker container)**
You should now be ready to run the scripts!

## Using the software: fine tuning
## Training the prototypical model

Provided that `ESC-50-master` and `BEATs/BEATs_iter3_plus_AS2M.pt` are stored in your `$DATAPATH`:
First we need to parse the `Development_Set` using `data_utils/DCASEfewshot.py`. Please change the paths as needed.

```bash
docker run -v $PWD:/app \
-v $DATAPATH:/data \
--gpus all `# if you have GPUs available` \
beats \
poetry run fine_tune/trainer.py fit --config /app/config.yaml
DATA_DIR=/home/data/
CODE_DIR=/home/rare_species_detections

docker run -v $CODE_DIR:/app \
-v $DATA_DIR:/data \
--gpus all \
dcase \
poetry run python /app/data_utils/DCASEfewshot.py \
--set_type Training_Set \
--status train \
--overwrite \
--resample \
--denoise \
--normalize \
--tensor_length 128 \
--overlap 0.5
```

## Using the software: training a prototypical network
The script creates a folder `DCASEfewshot/train` in your `$DATA_DIR`. It contains the parsed data that will be used to train the prototypical network.
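To give an intuition for `--tensor_length 128` and `--overlap 0.5`, here is a minimal sketch of how a recording could be cut into fixed-length, overlapping segments. This is an assumption about the general technique, not a description of what `DCASEfewshot.py` actually does; the function `segment_starts` is hypothetical.

```python
def segment_starts(n_frames, tensor_length=128, overlap=0.5):
    """Start indices of fixed-length windows that fit inside n_frames frames.

    With overlap=0.5 each window shares half of its frames with the next one.
    """
    if not 0 <= overlap < 1:
        raise ValueError("overlap must be in [0, 1)")
    # Hop between consecutive windows, at least one frame.
    hop = max(1, int(tensor_length * (1 - overlap)))
    return list(range(0, n_frames - tensor_length + 1, hop))


print(segment_starts(320))  # windows of 128 frames, hop of 64 frames
```

Under these assumptions, a recording shorter than `tensor_length` yields no segment at all, so a real pipeline would need to pad or skip such recordings.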

- Create a miniESC50 dataset in your `$DATAPATH`:
It is now possible to train the network using `prototypicalbeats/trainer.py`:

```bash
docker run -v $CODE_DIR:/app \
-v $DATA_DIR:/data \
--gpus all \
dcase \
poetry run prototypicalbeats/trainer.py fit \
--trainer.accelerator gpu \
--trainer.gpus 1 \
--model.distance euclidean
```

The training script should create a folder (`lightning_logs/`) in which the model weights (`version_X/checkpoints/*.ckpt`) and the training configuration (`version_X/checkpoints/config.yaml`) are stored.
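For readers unfamiliar with prototypical networks, the sketch below illustrates the idea behind `--model.distance euclidean`: each class prototype is the mean of its support embeddings, and a query is assigned to the nearest prototype. It is a toy illustration under stated assumptions, not the repository's implementation: the 2-D vectors stand in for BEATs features, and the `NEG` label and both function names are hypothetical.

```python
import math


def prototypes(support):
    """Average the support embeddings of each class into one prototype."""
    protos = {}
    for label, vectors in support.items():
        dim = len(vectors[0])
        protos[label] = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    return protos


def classify(query, protos):
    """Return the label whose prototype is closest in Euclidean distance."""
    return min(protos, key=lambda label: math.dist(query, protos[label]))


# Toy 2-D embeddings standing in for BEATs features (hypothetical values).
support = {
    "POS": [[1.0, 1.0], [1.2, 0.8]],
    "NEG": [[-1.0, -1.0], [-0.8, -1.2]],
}
protos = prototypes(support)
print(classify([0.9, 1.1], protos))  # prints "POS"
```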

## Using the model on the Validation / Evaluation dataset

To run the prediction, use the script `evaluate/evaluateDCASE.py`. Note that the file `evaluate/config_evaluation.yaml` contains parameters that need to be changed as required. In particular, you will need to change the `model_path` and `status` (either `test` or `validate`).

```bash
docker run -v $PWD:/app \
-v $DATAPATH:/data \
--gpus all `# if you have GPUs available` \
beats \
poetry run data_utils/miniESC50.py
docker run -v $CODE_DIR:/app \
-v $DATA_DIR:/data \
--gpus all \
dcase \
poetry run python /app/evaluate/evaluateDCASE.py \
--wav_save \
--overwrite
```

- Train the prototypical network:
`evaluateDCASE.py` creates a result file `eval_out.csv` in `$DATA_DIR` containing all the detections made by the model. If `--wav_save` is specified, the script also writes a `.wav` file for each audio file, containing additional channels: the ground truth labels, the predicted labels, the distance to the POS prototype and finally the p-values. The `.wav` files can be opened in [Audacity](https://www.audacityteam.org/) for closer inspection.

## Computing the resulting metrics

Once `eval_out.csv` has been created, it is possible to compute the metrics for our approach. Note that the metrics can only be computed for the `Validation_Set`, as it contains all ground truth labels, as opposed to the `Evaluation_Set`, for which only the first 5 samples of the POS class are labelled.

```bash
docker run -v $PWD:/app \
-v $DATAPATH:/data \
docker run -v $CODE_DIR:/app \
-v $DATA_DIR:/data \
--gpus all \
beats \
poetry run prototypicalbeats/trainer.py fit --trainer.accelerator gpu --trainer.gpus 1 --data miniESC50DataModule
dcase \
poetry run python evaluation/evaluation_metrics/evaluation.py \
-pred_file /data/eval_out.csv \
-ref_files_path /data/DCASE/Development_Set_annotations/Validation_Set \
-team_name BEATs \
-dataset VAL \
-savepath /data/.
```

The results we obtained:

```
Evaluation for: BEATs VAL
BUK1_20181011_001004.wav {'TP': 15, 'FP': 35, 'FN': 16, 'total_n_pos_events': 31}
BUK1_20181013_023504.wav {'TP': 2, 'FP': 258, 'FN': 22, 'total_n_pos_events': 24}
BUK4_20161011_000804.wav {'TP': 1, 'FP': 30, 'FN': 46, 'total_n_pos_events': 47}
BUK4_20171022_004304a.wav {'TP': 7, 'FP': 17, 'FN': 10, 'total_n_pos_events': 17}
BUK5_20161101_002104a.wav {'TP': 31, 'FP': 7, 'FN': 57, 'total_n_pos_events': 88}
BUK5_20180921_015906a.wav {'TP': 4, 'FP': 24, 'FN': 19, 'total_n_pos_events': 23}
ME1.wav {'TP': 9, 'FP': 18, 'FN': 2, 'total_n_pos_events': 11}
ME2.wav {'TP': 41, 'FP': 27, 'FN': 0, 'total_n_pos_events': 41}
R4_cleaned recording_13-10-17.wav {'TP': 19, 'FP': 14, 'FN': 0, 'total_n_pos_events': 19}
R4_cleaned recording_16-10-17.wav {'TP': 30, 'FP': 8, 'FN': 0, 'total_n_pos_events': 30}
R4_cleaned recording_17-10-17.wav {'TP': 36, 'FP': 9, 'FN': 0, 'total_n_pos_events': 36}
R4_cleaned recording_TEL_19-10-17.wav {'TP': 52, 'FP': 12, 'FN': 2, 'total_n_pos_events': 54}
R4_cleaned recording_TEL_20-10-17.wav {'TP': 64, 'FP': 8, 'FN': 0, 'total_n_pos_events': 64}
R4_cleaned recording_TEL_23-10-17.wav {'TP': 84, 'FP': 8, 'FN': 0, 'total_n_pos_events': 84}
R4_cleaned recording_TEL_24-10-17.wav {'TP': 99, 'FP': 14, 'FN': 0, 'total_n_pos_events': 99}
R4_cleaned recording_TEL_25-10-17.wav {'TP': 99, 'FP': 9, 'FN': 0, 'total_n_pos_events': 99}
file_423_487.wav {'TP': 57, 'FP': 13, 'FN': 0, 'total_n_pos_events': 57}
file_97_113.wav {'TP': 11, 'FP': 27, 'FN': 109, 'total_n_pos_events': 120}
Overall_scores: {'precision': 0.2911279078300433, 'recall': 0.4938446186969832, 'fmeasure (percentage)': 36.631}
```
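The overall scores are not a plain average over files. As a rough sketch, they can be recomputed from the per-file counts by summing TP/FP/FN within each subset, computing precision and recall per subset, and taking the harmonic mean across subsets. The grouping by filename prefix (`BUK*`, `ME*`, and the remaining `R4*` / `file_*` recordings) is an assumption made here, presumed to correspond to the three `Validation_Set` folders.

```python
from collections import defaultdict
from statistics import harmonic_mean

# (subset, TP, FP, FN) per file, copied from the output above.
# The subset labels are an assumption based on filename prefixes.
COUNTS = [
    ("BUK", 15, 35, 16), ("BUK", 2, 258, 22), ("BUK", 1, 30, 46),
    ("BUK", 7, 17, 10), ("BUK", 31, 7, 57), ("BUK", 4, 24, 19),
    ("ME", 9, 18, 2), ("ME", 41, 27, 0),
    ("R4/file", 19, 14, 0), ("R4/file", 30, 8, 0), ("R4/file", 36, 9, 0),
    ("R4/file", 52, 12, 2), ("R4/file", 64, 8, 0), ("R4/file", 84, 8, 0),
    ("R4/file", 99, 14, 0), ("R4/file", 99, 9, 0), ("R4/file", 57, 13, 0),
    ("R4/file", 11, 27, 109),
]

# Sum the counts within each subset.
totals = defaultdict(lambda: [0, 0, 0])
for subset, tp, fp, fn in COUNTS:
    totals[subset][0] += tp
    totals[subset][1] += fp
    totals[subset][2] += fn

# Per-subset precision and recall, then harmonic mean across subsets.
precisions = [tp / (tp + fp) for tp, fp, fn in totals.values()]
recalls = [tp / (tp + fn) for tp, fp, fn in totals.values()]

precision = harmonic_mean(precisions)
recall = harmonic_mean(recalls)
fmeasure = 2 * precision * recall / (precision + recall)
print(precision, recall, round(100 * fmeasure, 3))
```

Under this grouping the sketch reproduces the reported overall precision, recall and F-measure. It also shows why the `BUK*` recordings, with their many false positives, pull the overall score down so strongly: the harmonic mean is dominated by the weakest subset.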

## Taking the idea further:

- Computing the Mahalanobis distance instead of the Euclidean distance
- Implementing a p-value filtering to detect outlier distances from the prototypes

## Acknowledgment and contact

For bug reports please use the [issues section](https://github.com/NINAnor/rare_species_detections/issues).

For other inquiries please contact [Benjamin Cretois](mailto:[email protected]) or [Femke Gelderblom](mailto:[email protected])

## Cite this work

Technical report soon to be available.
2 changes: 1 addition & 1 deletion data_utils/DCASEfewshot.py
@@ -547,7 +547,7 @@ def preprocess_df(df):

parser.add_argument(
"--status",
help=" 'train' or 'validate' ",
help=" 'train' or 'validate' or 'test'",
default="train",
required=False,
type=str,
File renamed without changes.
82 changes: 0 additions & 82 deletions data_utils/parse_data_dcase.py

This file was deleted.

Binary file not shown.