-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HuggingFace improvements #649
Conversation
- If `None` (default) will not log any artifact. - If `all` will call log_artifact with `output_dir` at each `on_save` call. - If `last` will save the model `on_train_end` and call `log_artifact` with type=model and copy=True.
src/dvclive/huggingface.py
Outdated
self.live.log_artifact(args.output_dir) | ||
output_dir = os.path.join(args.output_dir, "last") | ||
fake_trainer.save_model(output_dir) | ||
self.live.log_artifact(output_dir, type="model", copy=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cross-framework consistency isn't our highest priority, but should we agree on some common principles for the final artifact, like naming and whether to copy it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like for all the integrations to have just 2 options:
-
all/checkpoints
: resuming scenarios.
Log the entire checkpoint folder -
best
: model registry
Log the best checkpoint on end withcopy=True, name="best", type="model"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fine with me. Do you want to update the lightning logger to use copy=True
? AFAIK the rest is consistent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will update this and lightning to use that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, are you also suggesting to change the behavior of log_model=True
in lightning to track only the copied best artifact and not the whole directory? That's fine, just want to make sure I understand what you mean.
For HF, how should we handle the last/best checkpoint? If args.load_best_model_at_end
, we could add name=best
? WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, are you also suggesting to change the behavior of log_model=True in lightning to track only the copied best artifact and not the whole directory
I think I would suggest dropping the boolean value.
For HF, how should we handle the last/best checkpoint? If args.load_best_model_at_end, we could add name=best? WDYT?
Yes, makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would suggest dropping the boolean value.
I worry doing that and/or not saving the checkpoints dir breaks consistency with mlflow/wandb/etc. in lightning for the sake of consistency across dvclive. I would probably err on the side of sticking with consistency for lightning over consistency for dvclive where they conflict, but we can always make this a follow-up PR if it is taking this off track.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or HF, how should we handle the last/best checkpoint? If args.load_best_model_at_end, we could add name=best? WDYT?
Updated with this behavior
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, dropped last
option in favor of True
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have a few questions where we need to align.
As far as breaking changes, maybe we should go ahead and make the easy ones from the 3.0 checklist and do a release. I think we will need a major version where we transition to log_model
and moving callbacks into frameworks and can't do it all at once anyway. The alternative is likely to branch-based development and only make the breaking changes available on the 3.0 branch. WDYT?
Codecov ReportPatch coverage:
Additional details and impacted files@@ Coverage Diff @@
## main #649 +/- ##
==========================================
+ Coverage 88.06% 88.24% +0.17%
==========================================
Files 43 43
Lines 3042 3088 +46
Branches 260 270 +10
==========================================
+ Hits 2679 2725 +46
+ Misses 324 323 -1
- Partials 39 40 +1
☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for making an example! 🙏Would like to propose some additions if you don´t mind. I can´t comment directly due to the notebook nature, so hopefully this can help. If you are time sensitive, and agree on the proposed additions/changes, happy to submit a contrib.🎺
- Add # Goal/Intro section
Cover the full functionality the example is exploring at the beginning of the notebook
Proposal : Example of fine-tuning sentiment analysis classifier based on imbd and distilbert pretrained model , experiment tracking and results metrics analysis with dvclive and dvc api
-
Add section # Initialize git and dvc in 2nd cell
why? It gives context about the necessary requirements beyond the libraries install that a data scientist needs to run some other HF examples -
Change # Dataset for # Dataset and Tokenization in 3rd cell
The section includes a Tokenization process -
Add # Evaluation metrics section
For consistency / coherence with respect to the process -
Possible discussion L28
Describe in comment whatlog_model
does for user? which seems to be the improvement of the PR? -
Explain # Comparing section
Context : unclear right now whats really going on with naming . I´ve lost a few chapters of this exploring other things . Thinking about sharing in #9709 . When exploring this, a question came into my mind, what is the difference in between dvc.api and dvclive ?
Thanks @SoyGema ! All points make sense to me. I have opened a separate issue to address as a follow-up. I believe it applies to all the examples and not only to the huggingface one. |
I kept the |
a88e9e7
to
07dbe4e
Compare
07dbe4e
to
2c03182
Compare
* dvclive: Add huggingface updates per iterative/dvclive#649 * updates from review
Log
Trainer.args
.Add
log_model
argument.Every framework does its own thing. No strong opinions but I went with the following:
None
(default) will not log any artifact.all
will call log_artifact withoutput_dir
at eachon_save
call.True
will save the modelon_train_end
and calllog_artifact
with type=model and copy=True.Will use
best
as name ifargs.load_best_model_at_end
elselast
.Add Notebook/Colab example
Closes #641
TODO: