
Layer extracted by grafzahl #27

Open

LuigiC72 opened this issue Jan 25, 2024 · 1 comment

Comments

@LuigiC72
Given the discussion about which layer to keep as a token's representation in downstream analysis (Jawahar et al., 2019; Ethayarajh, 2019), for example when using a pre-trained BERT model, I was wondering whether you are considering allowing the user to select one specific layer when fine-tuning a Transformer via grafzahl. At the moment, which layer is used when, for example, I specify `model_name = "bert-base-uncased"`? Thanks for your great package!

@chainsawriot
Collaborator

chainsawriot commented Jan 25, 2024

By default, grafzahl uses almost the same defaults as the underlying simpletransformers, i.e. no freezing: all layers may get fine-tuned.

If you really want to freeze some layers, it is possible to do that with simpletransformers, but unfortunately not with grafzahl. If you want to customize the fine-tuning at the layer level, I think you would be better off using simpletransformers or even transformers directly.
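For illustration, a minimal sketch of layer freezing with Hugging Face transformers (not grafzahl; the number of frozen layers and the two-label setup are just placeholders):

```python
from transformers import AutoModelForSequenceClassification

# Hypothetical setup: a BERT classifier with two labels.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Freeze the embeddings and the first 8 of the 12 encoder layers;
# only the top encoder layers and the classification head remain trainable.
for param in model.bert.embeddings.parameters():
    param.requires_grad = False
for layer in model.bert.encoder.layer[:8]:
    for param in layer.parameters():
        param.requires_grad = False
```

Parameters with `requires_grad = False` receive no gradients, so the frozen layers keep their pre-trained weights during fine-tuning.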
