Whisper fine tuning - which layers are trained? #2142

Open
chungvle opened this issue Jun 12, 2024 · 0 comments

Comments

@chungvle

Thanks for the detailed blog post on "Fine-Tune Whisper For Multilingual ASR with 🤗 Transformers". After going through the article and creating a fine-tuned model for my own application, I have the following questions; I hope someone can help:

  1. When using 🤗 Trainer with the Seq2SeqTrainingArguments, which layer(s) are trained?
  • only the linear output layer
  • last two layers + last transformer block
  • all layers
  2. Is it possible to specify which layers to train and which to freeze? Some code samples would be appreciated; a sketch of what I have in mind follows below.
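
To make question 2 concrete, here is a minimal sketch of the kind of selective freezing I have in mind, assuming the parameter names used by `WhisperForConditionalGeneration` (`model.decoder.layers.*` and `proj_out`) and the `openai/whisper-small` checkpoint; the `freeze_encoder()` call and the name prefixes are based on my reading of the model code, so please correct anything that is off:

```python
from transformers import WhisperForConditionalGeneration

model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-small")

# Option A: freeze the whole encoder (convenience method on the model class).
model.freeze_encoder()

# Option B: select trainable parameters by name. This example keeps only the
# last decoder block and the output projection trainable; whisper-small has
# 12 decoder layers, so the last one is index 11 (adjust for other sizes).
for name, param in model.named_parameters():
    param.requires_grad = (
        name.startswith("model.decoder.layers.11.") or name.startswith("proj_out")
    )

# Sanity check: count the parameters the Trainer would actually update.
trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable params: {trainable:,} / {total:,}")
```

Would passing a model prepared like this to `Seq2SeqTrainer` behave as expected, or is there a recommended way to control which layers are updated?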