Whisper fine tuning - which layers are trained? #2142

Open
chungvle opened this issue Jun 12, 2024 · 0 comments

Comments

@chungvle

Thanks for the detailed blog post on "Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers". After going through the article and creating a fine-tuned model for my own application, I have the following questions; I hope someone can help:

  1. When using 🤗 Trainer with the Seq2SeqTrainingArguments, which layer(s) are trained?
  • only the linear output layer
  • last two layers + last transformer block
  • all layers
  2. Is it possible to specify which layers to train and which to freeze? Some code samples would be appreciated; a sketch of what I have in mind follows below.
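
To make question 2 concrete, here is a minimal sketch of the kind of selective freezing I have in mind, assuming the parameter names used by `WhisperForConditionalGeneration` (`model.decoder.layers.*` and `proj_out`) and the `openai/whisper-small` checkpoint; the `freeze_encoder()` call and the name prefixes are based on my reading of the model code, so please correct anything that is off:

```python
from transformers import WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# Option A: freeze the whole encoder (convenience method on the model class).
model.freeze_encoder()

# Option B: select trainable parameters by name. This example keeps only the
# last decoder block and the output projection trainable; whisper-small has
# 12 decoder layers, so the last one is index 11 (adjust for other sizes).
for name, param in model.named_parameters():
    param.requires_grad = (
        name.startswith("model.decoder.layers.11.") or name.startswith("proj_out")
    )

# Sanity check: count the parameters the Trainer would actually update.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable:,} / {total:,}")
```

Would passing a model prepared like this to `Seq2SeqTrainer` behave as expected, or is there a recommended way to control which layers are updated?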