TRAINING-README appears outdated and inconsistent with current main repository #254

Open
frankiexo opened this issue Feb 9, 2025 · 3 comments

frankiexo commented Feb 9, 2025

I was misled by the content of TRAINING-README when attempting to train rnnoise, as it references files (rnn_data.c and rnn_data.h) that no longer exist in the main repository. According to a comment from March 27, 2024, these files have been removed:

#219 (comment)

Additionally:

src/compile.sh also appears outdated for the same reason.

Note: I was able to generate features.f32 by following the instructions in the current README, which uses both foreground noise and background noise (TRAINING-README works with one noise file only).

@frankiexo
Author

I also see a reference to rnn_data.h in training/dump_rnn.py:
f.write('#ifdef HAVE_CONFIG_H\n#include "config.h"\n#endif\n\n#include "rnn.h"\n#include "rnn_data.h"\n\n')

@richardzhang0301

Yeah, I got the same issue. Wondering how we would train a new model using RNNoise 2.x? Thanks.

frankiexo commented Feb 20, 2025

> Yeah, I got the same issue. Wondering how we would train a new model using RNNoise 2.x? Thanks.

Just follow the instructions in the main README file and pay attention to the details. There are some pitfalls, and I guess everybody will hit different problems. This doesn't belong in this bug report, but there is no forum, so I will answer you here. I have managed to train new models; here are the challenges I faced:

  • SSSE3/AVX/AVX2
    Use a system where these instruction sets are available. At first I started with a virtual Ubuntu machine and was not able to compile and run anything successfully. There are some pull requests intended to fix that, but I would focus on enabling SSSE3/AVX/AVX2 instead. How-to instructions for enabling HW/SW virtualization support on Windows cover the registry, policies, bcdedit, etc., but the critical step in my case was disabling Credential Guard at the UEFI level (not a trivial step for a standard user).

  • Pay attention to the required audio/noise file format (raw PCM)
    ffmpeg -i test.wav -f s16le -acodec pcm_s16le -ar 48000 -ac 1 test.pcm
    did the trick in my case, and
    ffmpeg -f s16le -ar 48000 -ac 1 -i test-denoised.pcm test-denoised.wav
    converts the result back to WAV.

  • For features.f32, start with something small to test it. The file takes a lot of disk space.
    ./dump_features ./audio/speech.pcm ./audio/background_noise.pcm ./audio/foreground_noise.pcm features.f32 10000

  • Training
    The location of train_rnnoise.py is for some reason not mentioned in the README.
    It is here: /torch/rnnoise/train_rnnoise.py
    Because it is just Torch and Python, I did the training on Windows.
    Unfortunately, I don't have a compatible GPU, so it ran slowly on the CPU.
    On Windows the parallel data loading fails, but if you change num_workers=4 to num_workers=0 in dataloader = torch.utils.data.DataLoader(dataset, batch_size=batch_size, shuffle=True, drop_last=True, num_workers=4), it runs without any trouble.
    I also tried a remote deployment on my wife's Mac mini M2 Pro: it runs 2x faster than my CPU, and num_workers=4 works there, but it runs faster with num_workers=0.

  • Use the new model
    Copy the new files rnnoise_data.c and rnnoise_data.h to rnnoise/src
    FLAGS="-Wall -Wextra -O3 -march=native -DUSE_WEIGHTS_FILE" ./configure --enable-x86-rtcd
    make clean
    make
    ./dump_weights_blob > weights_blob.bin
    Copy the .bin file to the folder with rnnoise_demo and test it.
    This step was for some reason tricky in my case: the default model was really persistent,
    but at some point the new model started to work.
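
For the PCM/RAW pitfall in the list above, here is a minimal sketch of a pre-check using only Python's standard `wave` module. It assumes the target format is the one used in the ffmpeg commands (16-bit little-endian, 48000 Hz, mono). The function names are made up for illustration. It does not resample or downmix; files that fail the check still need the ffmpeg conversion.

```python
import wave

# Format taken from the ffmpeg commands in the post: s16le, 48000 Hz, 1 channel.
EXPECTED_RATE = 48000
EXPECTED_CHANNELS = 1
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit


def wav_format_problems(wav_path):
    """Return a list of mismatches between a WAV file and the expected format."""
    problems = []
    with wave.open(wav_path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE:
            problems.append(
                f"sample rate is {wav.getframerate()}, expected {EXPECTED_RATE}")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(
                f"channel count is {wav.getnchannels()}, expected {EXPECTED_CHANNELS}")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(
                f"sample width is {wav.getsampwidth()} bytes, expected {EXPECTED_SAMPLE_WIDTH}")
    return problems


def wav_to_raw_pcm(wav_path, pcm_path):
    """Write the samples of an already-conforming WAV file to a headerless file.

    Raises ValueError if the WAV file does not match the expected format.
    Returns the number of bytes written.
    """
    problems = wav_format_problems(wav_path)
    if problems:
        raise ValueError("; ".join(problems))
    with wave.open(wav_path, "rb") as wav:
        frames = wav.readframes(wav.getnframes())
    with open(pcm_path, "wb") as out:
        out.write(frames)
    return len(frames)
```

Calling `wav_format_problems("test.wav")` before running dump_features gives a quick hint as to whether a file needs converting at all.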

Now I am a bit stuck training a more precise model with 100000 segments and 100+ epochs.
features.f32 is 72 GB.
One epoch takes 2.5 hours.
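
To plan disk space and run time before starting, the figures above can be extrapolated. This is only a rough sketch. It assumes that file size and epoch time scale linearly with the segment count, which the post does not confirm. The constants are the reported values, and the function names are made up for illustration.

```python
# Reported in the post: 100000 segments -> 72 GB, one epoch -> 2.5 hours.
REPORTED_SEGMENTS = 100_000
REPORTED_SIZE_GB = 72.0
REPORTED_HOURS_PER_EPOCH = 2.5


def estimate_features_size_gb(segments):
    """Estimate the features.f32 size, assuming it grows linearly with segments."""
    return REPORTED_SIZE_GB * segments / REPORTED_SEGMENTS


def estimate_training_hours(epochs, segments=REPORTED_SEGMENTS):
    """Estimate total training time, assuming epoch time is linear in segments."""
    return REPORTED_HOURS_PER_EPOCH * epochs * segments / REPORTED_SEGMENTS


if __name__ == "__main__":
    print(f"10000 segments: ~{estimate_features_size_gb(10_000):.1f} GB")
    print(f"100 epochs: ~{estimate_training_hours(100):.0f} hours")
```

Under these assumptions, a 10000-segment test run comes to roughly 7.2 GB, and 100 epochs on the full set to roughly 250 hours on the same hardware.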
But for some reason I end up with corrupted checkpoints (which are actually zip files).
Maybe it is a problem with the remote deployment sync.
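
Since the checkpoints are zip files, a damaged or half-synced one can often be spotted before training resumes. This is a minimal sketch using Python's standard `zipfile` module, with a made-up function name. It only checks the zip container (archive structure and per-member CRCs); passing it does not guarantee that the checkpoint loads in Torch.

```python
import zipfile


def checkpoint_zip_problem(path):
    """Return None if the file passes basic zip checks, else a short description."""
    if not zipfile.is_zipfile(path):
        return "not a readable zip archive (possibly truncated during sync)"
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() reads every member and returns the first one whose
            # CRC or header is bad, or None if all members are readable.
            bad_member = archive.testzip()
    except zipfile.BadZipFile as exc:
        return f"damaged archive: {exc}"
    if bad_member is not None:
        return f"corrupt member: {bad_member}"
    return None
```

Running this on a checkpoint right after it has been copied from the remote machine, and again before passing it to the next training run, helps narrow down whether the sync step is where the file gets damaged.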
What is important, though: the README doesn't mention it, but you can use the --initial-checkpoint parameter,
so you can compute just one epoch and continue later.
E.g. train_rnnoise.py --epochs 1 features.f32 output --initial-checkpoint last.pth
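
The one-epoch-at-a-time approach can be scripted. This is a minimal sketch that only assembles the command line from the example above, using the `--epochs` and `--initial-checkpoint` options it shows. The helper name is made up. The post does not say where each run writes its checkpoint, so the caller has to supply that path (`last.pth` is just the placeholder from the example).

```python
import sys


def build_train_command(features, output_dir, initial_checkpoint=None,
                        epochs=1, script="train_rnnoise.py"):
    """Build the argument list for one training run of `epochs` epochs.

    `script` is assumed to be reachable from the working directory; in the
    repository it lives under torch/rnnoise/.
    """
    cmd = [sys.executable, script, "--epochs", str(epochs), features, output_dir]
    if initial_checkpoint is not None:
        # Resume from a previously saved checkpoint instead of starting fresh.
        cmd += ["--initial-checkpoint", initial_checkpoint]
    return cmd


if __name__ == "__main__":
    # First run starts from scratch, later runs continue from a checkpoint.
    print(" ".join(build_train_command("features.f32", "output")))
    print(" ".join(build_train_command("features.f32", "output", "last.pth")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`, so a failed epoch stops the loop instead of continuing from a bad checkpoint.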

Once I am successful, I will fail like everybody else on the fact that the model cannot be used directly as a .rnnn file for the ffmpeg rnnoise filter. But I need to fail step by step :)
