Releases · FlagOpen/FlagScale
v0.6.0
- Introduced general multi-dimensional heterogeneous parallelism and CPU-based communication between different chips.
- Added comprehensive support for data processing and faster distributed training of LLaVA-OneVision, achieving SOTA results on the Infinity-MM dataset.
- Open-sourced the optimized CFG implementation and accelerated generation and understanding tasks for Emu3 (a minimal sketch of the CFG idea follows this list).
- Implemented the auto-tuning feature to simplify large-scale distributed training, making it more accessible to users with limited distributed-training expertise (a conceptual sketch of the strategy search also follows this list).
- Enhanced the CI/CD system to enable more efficient unit testing across different backends and to run loss checks for the various parallel strategies.
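
For context, classifier-free guidance (CFG) blends conditional and unconditional predictions at inference time. The PyTorch sketch below shows only the core blending step; `model`, the tensor shapes, and the batched forward pass are illustrative assumptions, not FlagScale's optimized implementation.

```python
import torch

def cfg_logits(model, cond_ids, uncond_ids, guidance_scale=3.0):
    # Classifier-free guidance: run the conditional and unconditional
    # branches in a single batched forward pass, then blend the logits as
    # uncond + scale * (cond - uncond). `model` is a placeholder callable
    # returning per-token logits, not FlagScale's optimized kernel.
    batch = torch.cat([cond_ids, uncond_ids], dim=0)
    logits = model(batch)
    cond, uncond = logits.chunk(2, dim=0)
    return uncond + guidance_scale * (cond - uncond)
```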
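The auto-tuner's job can be pictured as a search over parallel strategies. The sketch below illustrates only the enumeration step, with hypothetical names (`candidate_configs`, `max_tp`, `max_pp`); an actual tuner would additionally benchmark or model each candidate and pick the fastest feasible one.

```python
import itertools

def candidate_configs(world_size, max_tp=8, max_pp=8):
    # Enumerate (tensor, pipeline, data) parallel sizes whose product
    # covers all devices; a real tuner would then measure or model the
    # throughput of each candidate configuration.
    for tp, pp in itertools.product(range(1, max_tp + 1), range(1, max_pp + 1)):
        if world_size % (tp * pp) == 0:
            yield {"tensor": tp, "pipeline": pp, "data": world_size // (tp * pp)}
```

For example, `list(candidate_configs(8, max_tp=2, max_pp=2))` includes configurations such as `{"tensor": 2, "pipeline": 2, "data": 2}`.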
v0.3
v0.2
- Provide the actual training scheme used for Aquila2-70B-Expr, including the parallel strategies, optimizations, and hyperparameter settings.
- Support heterogeneous training on chips of different generations that share the same or compatible architectures, including NVIDIA GPUs and Iluvatar CoreX chips (see the sketch after this list).
- Support training on Chinese domestic hardware, including Iluvatar CoreX and Baidu KUNLUN chips.
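
One way to picture heterogeneous training across chips of different generations is an uneven pipeline split, where faster chips take more layers. The helper below (`split_layers`, `device_flops`) is a hypothetical illustration of that idea, not FlagScale's API.

```python
def split_layers(num_layers, device_flops):
    # Assign each pipeline stage a layer count proportional to its chip's
    # throughput, so slower chips receive fewer layers. Purely illustrative.
    total = sum(device_flops)
    counts = [max(1, round(num_layers * f / total)) for f in device_flops]
    counts[-1] += num_layers - sum(counts)  # absorb rounding drift
    assert counts[-1] > 0, "too few layers for this device mix"
    return counts

# e.g. 48 layers over two newer GPUs and two older chips at half throughput:
print(split_layers(48, [2.0, 2.0, 1.0, 1.0]))  # -> [16, 16, 8, 8]
```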