-
Notifications
You must be signed in to change notification settings - Fork 416
Zr te doc edits #1745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Zr te doc edits #1745
Conversation
The documentation is very clean. I just found a few minor style alignment issues. Signed-off-by: Zenobia "Z" Redeaux <[email protected]>
The text reads well. Just a note that our technical documentation style avoids ableist language (such as "see"), Latin terminology, and the use of "please." Also, if we tell the reader something "should" occur, we need to tell them what to do if it doesn't. Signed-off-by: Zenobia "Z" Redeaux <[email protected]>
Signed-off-by: Zenobia "Z" Redeaux <[email protected]>
Missed two verbs that should be in the past tense in the summary. Signed-off-by: Zenobia "Z" Redeaux <[email protected]>
Try to ensure section Head 1s have some sort of lead-in text. Signed-off-by: Zenobia "Z" Redeaux <[email protected]>
Signed-off-by: Zenobia "Z" Redeaux <[email protected]>
"\n", | ||
"<figure align=\"center\">\n", | ||
"<img src=\"delayed_scaling.png\" width=\"80%\">\n", | ||
"<figcaption> Figure 3: Delayed scaling strategy. The FP8 operator uses scaling factor obtained using the history of amaxes (maximums of absolute values) seen in some number of previous iterations and produces both the FP8 output and the current amax, which gets stored in the history.</figcaption>\n", | ||
"</figure>\n", | ||
"\n", | ||
"As one can see in Figure 3, delayed scaling strategy requires both storing the history of amaxes, but also choosing a recipe for converting that history into the scaling factor used in the next iteration." | ||
"In Figure 3, the delayed scaling strategy requires both storing the history of amaxes but also choosing a recipe for converting that history into the scaling factor used in the next iteration." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if the meaning here is exactly the same.
What we wanted to say here is that Figure 3 shows that the delayed scaling strategy requires those things in general. The changed sentence (at least the way I read it) suggests that those things are required only in the context of that figure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe it should say "As shown in Figure 3"?
|
||
Installation (development build) | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
.. warning:: | ||
|
||
While the development build of Transformer Engine could contain new features not available in | ||
the official build yet, it is not supported and so its usage is not recommended for general | ||
the official build, those features are not supported, so their usage is not recommended for general |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We meant to say here that the development build of Transformer Engine is not supported, not that those features are not supported.
Description
Style alignment and minor typo edits to Transformer Engine documentation.
Fixes # (issue)
Type of change
Changes
Please list the changes introduced in this PR:
Checklist: