[cherry pick] Update README (#8681) (#8727)
* update

* update readme

* update

* update

* update

* update

* update

* update

* update

* update README(EN)
DrownFish19 authored Jul 8, 2024
1 parent 3b739d6 commit e773524
Showing 2 changed files with 54 additions and 16 deletions.
47 changes: 33 additions & 14 deletions README.md
@@ -31,13 +31,13 @@

## News 📢

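* **2024.06.27 [PaddleNLP v3.0 Beta](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v3.0.0)**: Embrace large models with a fully upgraded experience. A unified large-model toolchain brings end-to-end support for domestic compute chips; full support for industrial-grade large-model workflows, including PaddlePaddle 4D parallel configuration, efficient fine-tuning strategies, efficient alignment algorithms, and high-performance inference; the in-house, fast-converging RsLoRA+ algorithm, the auto-scaling storage mechanism Unified Checkpoint, and generalized support for FastFFN and FusedQKV accelerate large-model training and inference; mainstream models continue to be supported and updated, providing efficient solutions.
* **2024.06.27 [PaddleNLP v3.0 Beta](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v3.0.0-beta0)**: Embrace large models with a fully upgraded experience. A unified large-model toolchain brings end-to-end support for domestic compute chips; full support for industrial-grade large-model workflows, including PaddlePaddle 4D parallel configuration, efficient fine-tuning strategies, efficient alignment algorithms, and high-performance inference; the in-house, fast-converging RsLoRA+ algorithm, the auto-scaling storage mechanism Unified Checkpoint, and generalized support for FastFFN and FusedQKV accelerate large-model training and inference; mainstream models continue to be supported and updated, providing efficient solutions.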

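* **2024.04.24 [PaddleNLP v2.8](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.8.0)**: The in-house, fast-converging RsLoRA+ algorithm greatly improves PEFT training convergence speed and training quality; high-performance generation acceleration is introduced into the RLHF PPO algorithm, removing the generation-speed bottleneck in PPO training and putting PPO training performance well ahead. Generalized support for several large-model training optimizations, such as FastFFN and FusedQKV, makes large-model training faster and more stable.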

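* **2024.01.04 [PaddleNLP v2.7](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.7.1)**: A comprehensive upgrade of the large-model experience, with a unified toolchain entry point. The implementation code for pretraining, fine-tuning, compression, inference, and deployment is consolidated into the `PaddleNLP/llm` directory. The new [large-model toolchain documentation](https://paddlenlp.readthedocs.io/zh/latest/llm/finetune.html) guides users from getting started with large models through to production deployment. The auto-scaling storage mechanism Unified Checkpoint greatly improves the generality of large-model storage. Efficient fine-tuning is upgraded: it can now be combined with LoRA, and algorithms such as QLoRA are supported.
* **2024.01.04 [PaddleNLP v2.7](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.7.1)**: A comprehensive upgrade of the large-model experience, with a unified toolchain entry point. The implementation code for pretraining, fine-tuning, compression, inference, and deployment is consolidated into the `PaddleNLP/llm` directory. The new [large-model toolchain documentation](https://paddlenlp.readthedocs.io/zh/latest/llm/pretraining/index.html) guides users from getting started with large models through to production deployment. The auto-scaling storage mechanism Unified Checkpoint greatly improves the generality of large-model storage. Efficient fine-tuning is upgraded: it can now be combined with LoRA, and algorithms such as QLoRA are supported.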

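* **2023.08.15 [PaddleNLP v2.6](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.6.0)**: Released the [full-workflow large-model toolchain](./llm), covering pretraining, fine-tuning, compression, inference, and deployment, giving users an end-to-end large-model solution and a one-stop development experience; built-in [4D parallel distributed Trainer](./docs/trainer.md), [efficient fine-tuning algorithms LoRA/Prefix Tuning](./llm#33-lora), [in-house INT8/INT4 quantization algorithms](./llm#6-量化), and more; full support for mainstream large models such as [LLaMA 1/2](./llm/llama), [BLOOM](.llm/bloom), [ChatGLM 1/2](./llm/chatglm), [GLM](./llm/glm), and [OPT](./llm/opt).
* **2023.08.15 [PaddleNLP v2.6](https://github.com/PaddlePaddle/PaddleNLP/releases/tag/v2.6.0)**: Released the [full-workflow large-model toolchain](./llm), covering pretraining, fine-tuning, compression, inference, and deployment, giving users an end-to-end large-model solution and a one-stop development experience; built-in [4D parallel distributed Trainer](./docs/trainer.md), [efficient fine-tuning algorithms LoRA/Prefix Tuning](./llm#33-lora), [in-house INT8/INT4 quantization algorithms](./llm#6-量化), and more; full support for mainstream large models such as [LLaMA 1/2](./llm/config/llama), [BLOOM](./llm/config/bloom), [ChatGLM 1/2](./llm/config/chatglm), and [OPT](./llm/config/opt).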


## Features
@@ -63,22 +63,41 @@ In terms of model parameter distribution, the Unified Checkpoint large-model storage format supports dynamic…

## Model Support

| Model | Pretrain | SFT | LoRA | Prefix Tuning | DPO | RLHF | Quantization | Weight convert |
|--------------------------------------------|:--------:|:---:|:----:|:-------------:|:---:|:----:|:------------:|:--------------:|
| [LLaMA](./llm/config/llama) |||||||||
| [Qwen](./llm/config/qwen) |||||| 🚧 | 🚧 ||
| [Mixtral](./llm/config/mixtral) ||||| 🚧 | 🚧 | 🚧 | 🚧 |
| [Baichuan/Baichuan2](./llm/config/llama) |||||| 🚧 |||
| [ChatGLM-6B](./llm/config/chatglm) ||||| 🚧 | 🚧 |||
| [ChatGLM2/ChatGLM3](./llm/config/chatglm2) ||||| 🚧 | 🚧 |||
| [Bloom](./llm/config/bloom) ||||| 🚧 | 🚧 |||
| [GPT-3](./llm/config/gpt-3) ||| 🚧 | 🚧 | 🚧 | 🚧 | 🚧 ||
| [OPT](./llm/config/opt) | 🚧 ||| 🚧 | 🚧 | 🚧 | 🚧 ||
| Model | Pretrain | SFT | LoRA | Prefix Tuning | DPO | RLHF | Quantization | Weight convert |
|---------------------------------------------|:--------:|:---:|:----:|:-------------:|:---:|:----:|:------------:|:--------------:|
| [LLaMA](./llm/config/llama) |||||||||
| [Qwen](./llm/config/qwen) |||||| 🚧 | 🚧 ||
| [Mixtral](./llm/config/mixtral) ||||| 🚧 | 🚧 | 🚧 | 🚧 |
| [Baichuan/Baichuan2](./llm/config/baichuan) |||||| 🚧 |||
| [ChatGLM-6B](./llm/config/chatglm) ||||| 🚧 | 🚧 |||
| [ChatGLM2/ChatGLM3](./llm/config/chatglm2) ||||| 🚧 | 🚧 |||
| [Bloom](./llm/config/bloom) ||||| 🚧 | 🚧 |||
| [GPT-3](./llm/config/gpt-3) ||| 🚧 | 🚧 | 🚧 | 🚧 | 🚧 ||
| [OPT](./llm/config/opt) | 🚧 ||| 🚧 | 🚧 | 🚧 | 🚧 ||

* ✅: Supported
* 🚧: In Progress
* ❌: Not Supported

| Model series | Model names |
|:----------------------------------:|:----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| [LLaMA](./llm/config/llama) | facebook/llama-7b, facebook/llama-13b, facebook/llama-30b, facebook/llama-65b |
| [LLama2](./llm/config/llama) | meta-llama/Llama-2-7b, meta-llama/Llama-2-7b-chat, meta-llama/Llama-2-13b, meta-llama/Llama-2-13b-chat, meta-llama/Llama-2-70b, meta-llama/Llama-2-70b-chat |
| [LLama3](./llm/config/llama) | meta-llama/Meta-Llama-3-8B, meta-llama/Meta-Llama-3-8B-Instruct, meta-llama/Meta-Llama-3-70B, meta-llama/Meta-Llama-3-70B-Instruct |
| [Baichuan](./llm/config/baichuan) | baichuan-inc/Baichuan-7B, baichuan-inc/Baichuan-13B-Base, baichuan-inc/Baichuan-13B-Chat |
| [Baichuan2](./llm/config/baichuan) | baichuan-inc/Baichuan2-7B-Base, baichuan-inc/Baichuan2-7B-Chat, baichuan-inc/Baichuan2-13B-Base, baichuan-inc/Baichuan2-13B-Chat |
| [Bloom](./llm/config/bloom) | bigscience/bloom-560m, bigscience/bloom-560m-bf16, bigscience/bloom-1b1, bigscience/bloom-3b, bigscience/bloom-7b1, bigscience/bloomz-560m, bigscience/bloomz-1b1, bigscience/bloomz-3b, bigscience/bloomz-7b1-mt, bigscience/bloomz-7b1-p3, bigscience/bloomz-7b1, bellegroup/belle-7b-2m |
| [ChatGLM](./llm/config/chatglm/) | THUDM/chatglm-6b, THUDM/chatglm-6b-v1.1 |
| [ChatGLM2](./llm/config/chatglm2) | THUDM/chatglm2-6b |
| [ChatGLM3](./llm/config/chatglm2) | THUDM/chatglm3-6b |
| [Gemma](./llm/config/gemma) | google/gemma-7b, google/gemma-7b-it, google/gemma-2b, google/gemma-2b-it |
| [Mistral](./llm/config/mistral) | mistralai/Mistral-7B-Instruct-v0.3, mistralai/Mistral-7B-v0.1 |
| [Mixtral](./llm/config/mixtral) | mistralai/Mixtral-8x7B-Instruct-v0.1 |
| [OPT](./llm/config/opt) | facebook/opt-125m, facebook/opt-350m, facebook/opt-1.3b, facebook/opt-2.7b, facebook/opt-6.7b, facebook/opt-13b, facebook/opt-30b, facebook/opt-66b, facebook/opt-iml-1.3b, opt-iml-max-1.3b |
| [Qwen](./llm/config/qwen/) | qwen/qwen-7b, qwen/qwen-7b-chat, qwen/qwen-14b, qwen/qwen-14b-chat, qwen/qwen-72b, qwen/qwen-72b-chat |
| [Qwen1.5](./llm/config/qwen/) | Qwen/Qwen1.5-0.5B, Qwen/Qwen1.5-0.5B-Chat, Qwen/Qwen1.5-1.8B, Qwen/Qwen1.5-1.8B-Chat, Qwen/Qwen1.5-4B, Qwen/Qwen1.5-4B-Chat, Qwen/Qwen1.5-7B, Qwen/Qwen1.5-7B-Chat, Qwen/Qwen1.5-14B, Qwen/Qwen1.5-14B-Chat, Qwen/Qwen1.5-32B, Qwen/Qwen1.5-32B-Chat, Qwen/Qwen1.5-72B, Qwen/Qwen1.5-72B-Chat, Qwen/Qwen1.5-110B, Qwen/Qwen1.5-110B-Chat, Qwen/Qwen1.5-MoE-A2.7B, Qwen/Qwen1.5-MoE-A2.7B-Chat |
| [Qwen2](./llm/config/qwen/) | Qwen/Qwen2-0.5B, Qwen/Qwen2-0.5B-Instruct, Qwen/Qwen2-1.5B, Qwen/Qwen2-1.5B-Instruct, Qwen/Qwen2-7B, Qwen/Qwen2-7B-Instruct, Qwen/Qwen2-72B, Qwen/Qwen2-72B-Instruct, Qwen/Qwen2-57B-A14B, Qwen/Qwen2-57B-A14B-Instruct |

Full list 👉 [Supported model parameters](https://github.com/PaddlePaddle/PaddleNLP/issues/8663)
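
As a rough illustration of how an identifier from the table might be used, the sketch below looks a name up in a small hand-copied subset of the table and then loads it through PaddleNLP's `AutoTokenizer` and `AutoModelForCausalLM`. It assumes PaddleNLP and PaddlePaddle are installed and that the weights can be downloaded. `LISTED_MODELS`, `find_series`, `load_model`, and `demo` are hypothetical helper names introduced here for illustration, not part of PaddleNLP.

```python
# Hypothetical subset of the table above; extend it with other rows as needed.
LISTED_MODELS = {
    "LLama3": ["meta-llama/Meta-Llama-3-8B", "meta-llama/Meta-Llama-3-8B-Instruct"],
    "Gemma": ["google/gemma-2b", "google/gemma-2b-it"],
    "Qwen2": ["Qwen/Qwen2-0.5B", "Qwen/Qwen2-0.5B-Instruct"],
}


def find_series(model_name):
    """Return the series whose list contains model_name, or None if unlisted."""
    for series, names in LISTED_MODELS.items():
        if model_name in names:
            return series
    return None


def load_model(model_name, dtype="float16"):
    """Load the tokenizer and causal LM for a listed identifier.

    Raises ValueError for names outside LISTED_MODELS before touching PaddleNLP.
    """
    if find_series(model_name) is None:
        raise ValueError(f"{model_name!r} is not in LISTED_MODELS")
    # Imported lazily so the lookup helpers work without PaddleNLP installed.
    from paddlenlp.transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name, dtype=dtype)
    return tokenizer, model


def demo():
    """Generate a short reply with the smallest Qwen2 checkpoint (downloads weights)."""
    tokenizer, model = load_model("Qwen/Qwen2-0.5B")
    inputs = tokenizer("Hello! Please introduce yourself.", return_tensors="pd")
    outputs = model.generate(**inputs, max_length=128)
    # generate returns a tuple; the first element holds the generated token ids.
    return tokenizer.batch_decode(outputs[0])
```

Calling `demo()` triggers the download and generation. The lookup helpers can be used on their own to check whether a name appears in the copied subset.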

------------------------------------------------------------------------------------------