Add Funcodec #3

Open: wants to merge 4 commits into base: main

Conversation

indiejoseph

This pull request includes several changes to the codec_bpe/audio_to_codes.py file to enhance the functionality and readability of the code. The main changes involve reformatting argument parsing, adding support for a new model, and updating encoding logic.

Argument Parsing Enhancements:

  • Reformatting of the argument parsing section for better readability.

New Model Support:

  • Added support for the Funcodec model, including new arguments and handling logic.

Encoding Logic Updates:

  • Updated the sample rate logic to include the Funcodec model.
  • Modified the encoding logic to handle the Funcodec model.

@AbrahamSanders
Owner

Hey @indiejoseph thanks for this contribution! I'll review it soon.

@AbrahamSanders
Owner

@indiejoseph in testing it out I get this error:

  File "/home/codec-bpe/codec_bpe/audio_to_codes.py", line 115, in <module>
    model = Speech2Token(config_file, model_pth, device=device)
  File "/home/anaconda3/envs/dev/lib/python3.9/site-packages/funcodec/bin/codec_inference.py", line 68, in __init__
    with open(config_file, "rt", encoding="utf-8") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'facebook/funcodec-base-8bit/config.yaml'

I'm not familiar with how downloading works with ModelScope. Ideally it would work like in transformers, where the model is downloaded automatically from the Hugging Face Hub if it does not exist locally. As an alternative to implementing this directly in audio_to_codes, instructions for downloading the model in the README would be sufficient.

@indiejoseph
Author

Oh sorry, the model has to be downloaded into the working folder, and I haven't added instructions or any other information to README.md yet. This is my fault; I will update the PR. It doesn't work the way transformers does: the model has to be downloaded manually.
https://huggingface.co/alibaba-damo/audio_codec-encodec-zh_en-general-16k-nq32ds640-pytorch
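
For reference, the manual download step could be scripted so that the files are fetched only when they are missing. This is a minimal sketch, not the PR's implementation. It assumes the model directory holds `config.yaml` (the name seen in the traceback above) and `model.pth` (a guess at the weights file name), and it uses `snapshot_download` from the `huggingface_hub` package. `ensure_model_files` and its parameters are hypothetical names introduced for illustration.

```python
from pathlib import Path
from typing import Callable, Optional, Tuple

# config.yaml is the name seen in the traceback; model.pth is an assumption.
CONFIG_NAME = "config.yaml"
WEIGHTS_NAME = "model.pth"


def _hub_download(repo_id: str, local_dir: str) -> str:
    # Imported lazily so the helper still works without huggingface_hub
    # installed when the files are already present locally.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, local_dir=local_dir)


def ensure_model_files(
    model_dir: str,
    repo_id: str,
    download_fn: Optional[Callable[[str, str], str]] = None,
) -> Tuple[Path, Path]:
    """Return (config_file, model_pth), downloading the repo if either is missing."""
    root = Path(model_dir)
    config_file = root / CONFIG_NAME
    model_pth = root / WEIGHTS_NAME

    # Only hit the network when something is actually missing.
    if not (config_file.is_file() and model_pth.is_file()):
        (download_fn or _hub_download)(repo_id, str(root))

    # Fail with a message that names the missing files and the repo.
    missing = [str(p) for p in (config_file, model_pth) if not p.is_file()]
    if missing:
        raise FileNotFoundError(
            f"Expected files not found after download from {repo_id!r}: {missing}"
        )
    return config_file, model_pth


# Hypothetical usage (requires network access and huggingface_hub):
# config_file, model_pth = ensure_model_files(
#     "audio_codec-encodec-zh_en-general-16k-nq32ds640-pytorch",
#     "alibaba-damo/audio_codec-encodec-zh_en-general-16k-nq32ds640-pytorch",
# )
# model = Speech2Token(str(config_file), str(model_pth), device=device)
```

The `download_fn` parameter exists only so the helper can be exercised offline; by default it falls back to the Hub download.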

@AbrahamSanders
Owner

@indiejoseph thanks - please add a short instruction to the Readme on how to download the model. I'll review once that's in.

Add FunCodec usage
@indiejoseph
Author

I've added a section to README.md; please check.
