T5 support #1

Open. Wants to merge 15 commits into base: main



davidkaczer
Owner

What does this PR do?

This PR adds support for pretrained LongT5 encoder-decoder models from HuggingFace.

General Changes

  • Add HuggingFacePretrainedEncoderDecoderModel class for loading LongT5
  • Add span masking collator ported from official Google implementation
  • Add test for span masking collator
  • Add example training configs for LongT5
  • Add logic to distinguish between decoder-only and encoder-decoder models when passing inputs
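
For reviewers unfamiliar with the span denoising objective, here is a minimal sketch of what a span masking collator produces for a single sequence. It is not the collator added in this PR and not the Google implementation: the function name `corrupt_spans` is hypothetical, the spans are passed in explicitly instead of being sampled, and the sentinel and EOS ids are parameters the caller must supply (the test values below assume the standard T5 vocabulary, where sentinel ids count downwards).

```python
from typing import List, Sequence, Tuple


def corrupt_spans(
    tokens: Sequence[int],
    spans: Sequence[Tuple[int, int]],
    first_sentinel_id: int,
    eos_id: int,
) -> Tuple[List[int], List[int]]:
    """Replace each [start, end) span with a sentinel and move the span to the targets.

    Hypothetical helper for illustration only; real collators sample the spans randomly.
    """
    inputs: List[int] = []
    targets: List[int] = []
    cursor = 0
    for i, (start, end) in enumerate(sorted(spans)):
        if start < cursor or end <= start or end > len(tokens):
            raise ValueError(f"invalid or overlapping span: {(start, end)}")
        sentinel = first_sentinel_id - i  # assumption: sentinel ids decrease per span
        inputs.extend(tokens[cursor:start])  # keep the uncorrupted tokens before the span
        inputs.append(sentinel)  # the encoder sees one sentinel in place of the span
        targets.append(sentinel)  # the decoder target repeats the sentinel ...
        targets.extend(tokens[start:end])  # ... followed by the tokens that were removed
        cursor = end
    inputs.extend(tokens[cursor:])  # keep whatever follows the last span
    inputs.append(eos_id)
    targets.append(eos_id)
    return inputs, targets
```

The encoder input keeps the surrounding context with one sentinel per removed span, and the decoder target lists each sentinel followed by the tokens it replaced.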

Breaking Changes

  • None intended

Checklist before submitting final PR

  • My PR is minimal and addresses one issue in isolation
  • I have merged the latest version of the target branch into this feature branch
  • I have reviewed my own code w.r.t. correct implementation, missing type hints, proper documentation, etc.
  • I have run a sample config for model training
  • I have checked that all tests run through (python tests/tests.py). Note: some tests appear to have been failing in upstream already, and multi-GPU was not tested.
  • I have updated the internal changelog (CHANGELOG_DEV.md)

davidkaczer and others added 15 commits August 8, 2024 10:53
- add SpanMaskingCollateFn for span denoising objective
- support loading T5 checkpoint from huggingface
- support passing decoder inputs to T5 model
- add example config for pretraining T5 checkpoint from HF
…ew huggingface model wrapper for encoder decoder architecture
- add logic to model_predict_batch
- fix example config
- cleanup HF model implementation
- remove dependency in span masking collator
- refactor t5 config
- refactor collator test
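
The input handling touched by these commits can be pictured roughly as follows. This is a sketch under assumptions, not the code in `model_predict_batch`: the helper name `build_model_inputs` and the batch keys are hypothetical, chosen to mirror the keyword arguments that HuggingFace T5-style models accept in `forward`.

```python
from typing import Any, Dict


def build_model_inputs(batch: Dict[str, Any], is_encoder_decoder: bool) -> Dict[str, Any]:
    """Select the forward() keyword arguments for a batch (hypothetical helper)."""
    # Both model families consume the source token ids and their attention mask.
    inputs = {
        "input_ids": batch["input_ids"],
        "attention_mask": batch["attention_mask"],
    }
    if is_encoder_decoder:
        # Encoder-decoder models additionally take decoder-side tensors when present.
        for key in ("decoder_input_ids", "decoder_attention_mask"):
            if key in batch:
                inputs[key] = batch[key]
    return inputs
```

The flag could come from `model.config.is_encoder_decoder`, which HuggingFace model configs expose; whether this PR reads it from there is an assumption.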