[ [Back to the common setup](README.md) ]

## Build Nvidia Docker Container (from 3.1 Inference round)

```
cm docker script --tags=build,nvidia,inference,server
```
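Before building, it is worth confirming the prerequisites from the common setup are on the `PATH`. The check below is a minimal sketch, assuming only that `docker` and the `cm` CLI have been installed as described in the linked setup guide:

```shell
# Quick prerequisite check: both docker and the cm CLI must already be
# installed (see the common setup linked above).
missing=0
for tool in docker cm; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: found"
  else
    echo "$tool: missing - install it via the common setup first"
    missing=$((missing + 1))
  fi
done
echo "prerequisite check complete ($missing missing)"
```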
## Run this benchmark via CM

### Do a test run to detect and record the system performance

```
cmr "generate-run-cmds inference _find-performance _all-scenarios" \
--model=gptj-99 --implementation=nvidia-original --device=cuda --backend=tensorrt \
--category=edge --division=open --quiet
```

* Use `--division=closed` to run all scenarios for the closed division.
* Use `--category=datacenter` to run datacenter scenarios.
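The test-run command above can be wrapped in a small script so that the division and category are switchable without retyping the whole line. This is only a sketch; `DIVISION` and `CATEGORY` are hypothetical environment variables used by the wrapper, not CM flags:

```shell
# Hypothetical wrapper around the test-run command above. DIVISION and
# CATEGORY default to the open/edge values used in this guide.
DIVISION="${DIVISION:-open}"
CATEGORY="${CATEGORY:-edge}"

CMD="cmr \"generate-run-cmds inference _find-performance _all-scenarios\" \
  --model=gptj-99 --implementation=nvidia-original --device=cuda --backend=tensorrt \
  --category=$CATEGORY --division=$DIVISION --quiet"

# Echo instead of executing, so the wrapper can be inspected on a machine
# without CM installed; pipe the output to sh to actually run it.
echo "$CMD"
```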
### Do full accuracy and performance runs for all the scenarios

```
cmr "generate-run-cmds inference _submission _all-scenarios" --model=gptj-99 \
--device=cuda --implementation=nvidia-original --backend=tensorrt \
--execution-mode=valid \
--category=edge --division=open --quiet --skip_submission_generation=yes
```

* Use `--power=yes` to measure power. It is ignored for accuracy and compliance runs.
* Use `--division=closed` to run all scenarios for the closed division. There are no compliance runs for gptj.
* `--offline_target_qps`, `--server_target_qps`, and `--singlestream_target_latency` can be used to override the automatically determined performance numbers.
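When overriding the determined performance numbers, a malformed value is easy to pass by accident. The guard below is a sketch of how one might validate an override before appending it to the command line; the `9.5` default is a placeholder for illustration, not a measured number:

```shell
# Validate a hypothetical Offline QPS override before appending it to the
# cmr command line. The default 9.5 is a placeholder, not a measured value.
OFFLINE_TARGET_QPS="${OFFLINE_TARGET_QPS:-9.5}"
case "$OFFLINE_TARGET_QPS" in
  ''|*[!0-9.]*)
    echo "invalid --offline_target_qps value: $OFFLINE_TARGET_QPS"
    ;;
  *)
    echo "will append --offline_target_qps=$OFFLINE_TARGET_QPS"
    ;;
esac
```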
### Populate the README files describing your submission

```
cmr "generate-run-cmds inference _populate-readme _all-scenarios" \
--model=gptj-99 --device=cuda --implementation=nvidia-original --backend=tensorrt \
--execution-mode=valid --results_dir=$HOME/results_dir \
--category=edge --division=open --quiet
```
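Since this step only formats results that already exist, a quick existence check on the results directory can save a failed invocation. A minimal sketch, assuming the same `$HOME/results_dir` path passed via `--results_dir` above:

```shell
# Check that the valid runs above actually produced results before
# regenerating the READMEs. RESULTS_DIR mirrors the --results_dir flag.
RESULTS_DIR="${RESULTS_DIR:-$HOME/results_dir}"
if [ -d "$RESULTS_DIR" ]; then
  echo "results found under $RESULTS_DIR"
else
  echo "no results under $RESULTS_DIR - run the valid runs first"
fi
```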
### Generate and upload MLPerf submission

Follow [this guide](../Submission.md) to generate the submission tree and upload your results.

### Run individual scenarios for testing and optimization

TBD
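While this section is still to be written, a plausible pattern is to drop `_all-scenarios` and name one scenario at a time. This is an unverified assumption: it presumes `generate-run-cmds` accepts a `--scenario` flag, as other CM inference docs suggest; check the CM documentation before relying on it. The loop only echoes the commands as a dry run:

```shell
# ASSUMPTION: --scenario is accepted by generate-run-cmds; verify against
# the CM docs. Echoes the per-scenario commands instead of running them.
for SCENARIO in Offline SingleStream; do
  echo "cmr \"generate-run-cmds inference _find-performance\" --model=gptj-99" \
       "--implementation=nvidia-original --device=cuda --backend=tensorrt" \
       "--scenario=$SCENARIO --category=edge --division=open --quiet"
done
```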
### Questions? Suggestions?

Don't hesitate to get in touch via the [public Discord server](https://discord.gg/JjWNWXKxwT).

### Acknowledgments

* CM automation for Nvidia's MLPerf inference implementation was developed by Arjun Suresh and Grigori Fursin.
* Nvidia's MLPerf inference implementation was developed by Zhihan Jiang, Ethan Cheng, Yiheng Zhang and Jinho Suh.