[Docs] Fix A100 Support for DeepSeek-R1 671B: Clarify FP8 Issue & YAML Inconsistencies #4726

Merged: 12 commits into master from fix-r1-docs on Feb 18, 2025

@andylizf (Collaborator) commented Feb 16, 2025

Resolves #4723

A100 does not support FP8, but the current example incorrectly lists A100 as a valid accelerator.

This PR removes A100 from the default accelerator list and introduces a dedicated job YAML for running DeepSeek-R1 on A100 with BF16 conversion. It also adds a documentation section explaining how to use A100 properly.

Additionally, this PR addresses several documentation inconsistencies and improves wording for clarity.
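
To illustrate why A100 needs a separate BF16 path, here is a minimal sketch of a precision check. It assumes that native FP8 requires CUDA compute capability 8.9 or higher (Ada and Hopper), while A100 is 8.0. The names `KNOWN_CAPABILITIES`, `supports_fp8`, and `pick_dtype` are hypothetical helpers for illustration only; they are not part of SkyPilot or SGLang, and the actual YAML in this PR may select precision differently.

```python
# Assumption: native FP8 needs compute capability >= 8.9 (Ada / Hopper).
FP8_MIN_CAPABILITY = (8, 9)

# Illustrative subset of accelerators and their CUDA compute capabilities.
KNOWN_CAPABILITIES = {
    "A100": (8, 0),
    "L4": (8, 9),
    "H100": (9, 0),
    "H200": (9, 0),
}


def supports_fp8(accelerator: str) -> bool:
    """Return True if the accelerator meets the assumed FP8 threshold."""
    if accelerator not in KNOWN_CAPABILITIES:
        raise KeyError(f"unknown accelerator: {accelerator}")
    # Tuples compare element by element, so (8, 0) < (8, 9) < (9, 0).
    return KNOWN_CAPABILITIES[accelerator] >= FP8_MIN_CAPABILITY


def pick_dtype(accelerator: str) -> str:
    """Choose FP8 where available, otherwise fall back to BF16."""
    return "fp8" if supports_fp8(accelerator) else "bfloat16"


if __name__ == "__main__":
    for name in KNOWN_CAPABILITIES:
        print(name, pick_dtype(name))
```

Under these assumptions, A100 falls back to `bfloat16`, which matches the intent of the dedicated A100 job YAML described above.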

@andylizf requested a review from @Michaelvll on Feb 16, 2025, 08:26

@Michaelvll (Collaborator) left a comment

Thanks @andylizf for updating this! It looks great. Left some comments. Could we also format the model output in the details section, i.e., turn those literal \n sequences into actual new lines, like we did in the distilled model example README?
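
As a rough sketch of the formatting request above, the snippet below converts literal `\n` escape sequences in a captured model response into real line breaks before pasting it into a README. The function name `unescape_newlines` and the sample string are hypothetical; this is one simple way to do it, not necessarily how the README output was actually produced.

```python
def unescape_newlines(text: str) -> str:
    """Replace the two-character sequence backslash + 'n' with a real newline."""
    return text.replace("\\n", "\n")


if __name__ == "__main__":
    # Hypothetical raw output, as it might appear inside a JSON response body.
    raw = "Step 1: think.\\nStep 2: answer."
    print(unescape_newlines(raw))
```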

@Michaelvll (Collaborator) left a comment

Thanks @andylizf

andylizf and others added 2 commits February 17, 2025 01:50
@Michaelvll merged commit 156da6c into master on Feb 18, 2025
18 checks passed
@Michaelvll deleted the fix-r1-docs branch on Feb 18, 2025, 18:47
Successfully merging this pull request may close these issues.

[Examples] A100 listed for DeepSeek-R1 671B, but it doesn’t support FP8 (default in SGLang)