This repository has been archived by the owner on Nov 1, 2024. It is now read-only.

QA about continue training on checkpoint #757

Open
robinzixuan opened this issue Aug 17, 2024 · 0 comments
Labels
question Further information is requested

Comments

@robinzixuan

❓ Questions and Help

Before asking:

  1. search the issues.
  2. search the docs.

What is your question?

  1. Is it possible to share the checkpoints of the OPT-1.3b model from around training steps 10K to 20K?
  2. Also, if we want to continue training from those checkpoints or from the official checkpoint, is it possible to get the full training dataset (or its metadata) used for the OPT models?

Code

What have you tried?

What's your environment?

  • metaseq Version (e.g., 1.0 or master):
  • PyTorch Version (e.g., 1.0)
  • OS (e.g., Linux):
  • How you installed metaseq (pip, source):
  • Build command you used (if compiling from source):
  • Python version:
  • CUDA/cuDNN version:
  • GPU models and configuration:
  • Any other relevant information:
robinzixuan added the "question" label on Aug 17, 2024