Can you share specifics about the loss for either training phase? #44

Open
altair199797 opened this issue Jan 21, 2025 · 0 comments

Dear Authors,

Unfortunately, my training is resulting in poor evaluation results. Could you roughly share what the expected loss is for:

  • (Phase 1) Encoder-Only Knowledge Distillation
  • (Phase 2) Prompt-in-the-Loop Knowledge Distillation

For Phase 1, my loss is 0.0010.
For Phase 2, my loss is 0.1139.

I suspect Phase 2 is causing the problems. Can you confirm?

Thank you in advance!

1 participant