
Commit 539f64a

reversed logic to ensure checkpointing is unsharded for single accelerator
peter-sk committed Dec 21, 2024
1 parent 311286c commit 539f64a
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion in olmo/train.py

```diff
@@ -1251,7 +1251,7 @@ def on_trace_ready(p):
                 stop_at = min(stop_at, self.global_step + extra_steps)

             # Maybe save sharded checkpoint.
-            if self.cfg.distributed_strategy != DistributedStrategy.ddp:
+            if self.cfg.distributed_strategy == DistributedStrategy.fsdp:
                 if save_checkpoints and (
                     cancel_initiated
                     or (
```
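The reason for reversing the comparison can be sketched in isolation. With the old guard, any strategy other than DDP, including no distributed strategy at all (a single-accelerator run), passed the check and triggered the sharded-checkpoint path. The new guard saves sharded checkpoints only for FSDP. Below is a minimal, self-contained sketch of that boolean logic; the toy `DistributedStrategy` enum and the assumption that a single-accelerator run uses `None` as its strategy are illustrative stand-ins, not the actual OLMo definitions.

```python
from enum import Enum


class DistributedStrategy(Enum):
    # Toy stand-in for the real enum in the OLMo codebase.
    fsdp = "fsdp"
    ddp = "ddp"


def wants_sharded_checkpoint_old(strategy):
    # Old guard: true for anything that is not DDP, so a run with no
    # distributed strategy (assumed None here) is also treated as sharded.
    return strategy != DistributedStrategy.ddp


def wants_sharded_checkpoint_new(strategy):
    # New guard: sharded checkpoints only when actually running FSDP.
    return strategy == DistributedStrategy.fsdp


# Single-accelerator run: old guard misfires, new guard does not.
print(wants_sharded_checkpoint_old(None))  # True  (the bug)
print(wants_sharded_checkpoint_new(None))  # False (the fix)
```

Both guards agree for the FSDP and DDP cases; the behavior only diverges for the unlisted "no strategy" case, which is exactly what the commit message describes.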
