InternEvo-v0.4.0dev20240403
Released by sunpengsdu on 03 Apr 12:29 · 122 commits to develop since this release
What's Changed
- feat(internlm): refactor code structure based on InternTrain by @huangting4201 in #82
- fix(tokenized/packed_dataset.py): fix packed dataset when train_folder is not None by @huangting4201 in #88
- fix(transformers): fix missing whitespace when chatting with fast tokenizer by @x54-729 in #90
- Fix(TrainState): fix trainstate batch sampler by @zigzagcai in #102
- feat: rm grad profiling by @JiaoPL in #100
- fix(train/pipeline.py): fix nan grad norm by @huangting4201 in #103
- Fix(QA): fix check ckpt loss by @li126com in #89
- improve zero grad communication overlap with pp by @mwiacx in #104
- feat(optimizer/hybrid_zero_optim.py): remove two stage compute norm by @huangting4201 in #106
- fix(embedding.py): fix triton apply_rotary to rotary_emb version by @sallyjunjun in #105
- feat(npu): add Ascend 910B support by @SolenoidWGT in #110
- feat(tokenized/dummy_dataset.py): support fixed seqlen for random dataset samples by @huangting4201 in #119
- Fix(QA): fix test optimizer and no_fa_output by @li126com in #124
- feat(initialize/launch.py): support switch use_packed_dataset by @huangting4201 in #117
- fix apply_rotary_torch not inplace problem by @sallyjunjun in #123
- fix(npu): refactor split_half_float_double and remove str key by @SolenoidWGT in #131
- feat(launch.py): update assert info for use_packed_dataset and fix backend accelerator get error by @huangting4201 in #125
- remove global variable internlm_accelerator by @sallyjunjun in #133
- fix(gpc): remove unused num_processes_on_current_node by @SolenoidWGT in #136
- Fix(support npu): fix some minor bugs in npu support by @li126com in #129
- feat(model): extend dim bsz for packed data for standardizing the sp processing dimension by @huangting4201 in #141
- Fix(device name): use a consistent way to get the device by @li126com in #139
- replace is_cuda with get_accelerator_backend by @sallyjunjun in #143
- Feat(npu): change current_time format to adapt npu profiler by @li126com in #147
- fix INTERNLM2_PUBLIC by @sallyjunjun in #150
- feat(eval): optimize evaluation context and remove DtypeTensor by @huangting4201 in #149
- fix(QA): fix test_forward_output_no_fa by @li126com in #151
- fix(QA): re-adapt some QA code for new version by @li126com in #146
- fix(unpack_data): pad -100 on labels by @sunpengsdu in #154
- feat(attn): support npu flash attention by @SolenoidWGT in #145
- fix(dummy_dataset): fixed_random_dataset_seqlen default is true by @sunpengsdu in #156
- fix(npu): fix attn mask move device by @SolenoidWGT in #159
- fix: minor bug by @JiaoPL in #160
- refactor(moe): expose more interfaces for moe by @blankde in #157
- set dummy data fixed length to false in CI by @sunpengsdu in #163
- feat(mlp): support mlp layer fusion by @SolenoidWGT in #161
- feat(deeplink): add deeplink as new backend by @caikun-pjlab in #168
- fix(optimizer): skip params whose requires_grad is False by @huangting4201 in #169
- fix internlm_accelerator by @sallyjunjun in #166
- remove timer_diagnosis and bench_gpu by @sallyjunjun in #170
- feat(model): support npu with packed data by @huangting4201 in #167
- fix(modules/multi_head_attention.py): fix distributed attn argument err in npu by @huangting4201 in #172
- fix(utils/logger.py): remove uniscale logger in public repo by @huangting4201 in #118
- fix(activation_checkpoint.py): fix rng mode in activation ckpt by @huangting4201 in #177
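Several entries above touch dataset "packing" (#88, #117, #119, #156). As background, the general technique is to concatenate tokenized samples into chunks of a fixed sequence length so every batch is fully utilized. The sketch below illustrates that idea only; `pack_samples` and its parameters are hypothetical names, not InternEvo's actual API.

```python
# Hedged sketch of sequence packing: concatenate tokenized samples and split
# into fixed-length chunks, padding only the final partial chunk.
# Illustrates the general technique, not InternEvo's implementation.
from typing import List


def pack_samples(samples: List[List[int]], seq_len: int, pad_id: int = 0) -> List[List[int]]:
    """Flatten token lists, cut into chunks of exactly seq_len,
    and pad the trailing partial chunk with pad_id."""
    flat: List[int] = []
    for sample in samples:
        flat.extend(sample)
    chunks = [flat[i:i + seq_len] for i in range(0, len(flat), seq_len)]
    if chunks and len(chunks[-1]) < seq_len:
        chunks[-1] = chunks[-1] + [pad_id] * (seq_len - len(chunks[-1]))
    return chunks
```

For example, `pack_samples([[1, 2, 3], [4, 5]], seq_len=4)` yields two full-length chunks, with only the last one padded.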
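On the label padding fix in #154: padding labels with -100 is the standard convention because PyTorch's `CrossEntropyLoss` uses `ignore_index=-100` by default, so padded positions contribute nothing to the loss. A minimal sketch of that convention (the `pad_labels` helper is hypothetical, not InternEvo code):

```python
# Hedged sketch: pad a label sequence with -100 so a default-configured
# PyTorch CrossEntropyLoss (ignore_index=-100) skips the padded positions.
def pad_labels(labels: list, seq_len: int, ignore_index: int = -100) -> list:
    """Right-pad labels to seq_len with ignore_index."""
    return labels + [ignore_index] * (seq_len - len(labels))
```

With this, a 3-token label row padded to length 5 becomes `[t1, t2, t3, -100, -100]`, and only the first three positions are scored.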
New Contributors
- @JiaoPL made their first contribution in #100
- @SolenoidWGT made their first contribution in #110
- @caikun-pjlab made their first contribution in #168
Full Changelog: v0.3.3dev20240315...v0.4.0dev20240403