v0.4.3

@SimonCqk SimonCqk released this 05 Dec 14:36

Feature

  1. implement elastic training protocol (EasyScale) on PyTorch.
  2. fault tolerance driven by AIMaster.

Bugfix

  1. sync determination in elastic training.
  2. gang schedule deadlock due to unexpected configuration.