Add FP8 numbers for the 13B
epwalsh committed Oct 23, 2024
1 parent 425f7db commit 310866e
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions README.md
@@ -35,14 +35,14 @@ Throughput numbers from these scripts with various different configuration setti
 | Model size | Context length | Precision | Throughput[^1] | Training script | Commandline overrides                                    |
 | :--------: | :------------: | :-------: | -----------: | :----------- | :-------- |
 | **1B** | 4096 | BF16 | 44,000 TPS | `OLMo-1B.py` | |
-| | 256-8192[^2] | BF16 | 49,000 TPS | `OLMo-1B.py` | `--dataset.name=vsl` |
-| | 4096 | FP8 | 51,000 TPS | `OLMo-1B.py` | `--model.float8_config.enabled=true` |
+| | 4096 | BF16/FP8[^2] | 51,000 TPS | `OLMo-1B.py` | `--model.float8_config.enabled=true` |
 | **7B** | 4096 | BF16 | 10,000 TPS | `OLMo-7B.py` | |
-| | | FP8 | 13,000 TPS | `OLMo-7B.py` | `--model.float8_config.enabled=true` |
+| | 4096 | BF16/FP8[^2] | 13,000 TPS | `OLMo-7B.py` | `--model.float8_config.enabled=true` |
 | **13B** | 4096 | BF16 | 4,600 TPS | `OLMo-13B.py` | |
+| | 4096 | BF16/FP8[^2] | 5,500 TPS | `OLMo-13B.py` | `--model.float8_config.enabled=true` |
 
 [^1]: Throughput reported in tokens per second per device.
-[^2]: Denotes variable sequence length (VSL) with the Grow-P2 curriculum from [Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum](https://arxiv.org/abs/2405.13226).
+[^2]: In this setup most GEMMs are computed in `float8`, everything else is in `bfloat16`.
 
 ## Development
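The before/after rows imply the per-scale speedup from enabling float8 GEMMs. A quick sketch of that arithmetic (the throughput numbers are copied from the table above; nothing else is from the repo):

```python
# Per-device throughput in tokens per second (TPS), taken from the README table.
bf16 = {"1B": 44_000, "7B": 10_000, "13B": 4_600}
fp8 = {"1B": 51_000, "7B": 13_000, "13B": 5_500}  # BF16/FP8 rows

for size in bf16:
    # Ratio of BF16/FP8 throughput to plain BF16 throughput at each model size.
    print(f"{size}: {fp8[size] / bf16[size]:.2f}x throughput with float8 GEMMs")
```

So per these numbers the 7B model sees the largest relative gain (1.30x), with the newly added 13B configuration at roughly 1.20x.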
