[ [Back to the common setup](README.md) ]

## Build the Nvidia Docker Container (from the MLPerf Inference 3.1 round)

```
cm docker script --tags=build,nvidia,inference,server
```
## Run this benchmark via CM


### Do a test run to detect and record the system performance

```
cmr "generate-run-cmds inference _find-performance _all-scenarios" \
--model=gptj-99 --implementation=nvidia-original --device=cuda --backend=tensorrt \
--category=edge --division=open --quiet
```
* Use `--division=closed` to run all scenarios for the closed division.
* Use `--category=datacenter` to run the datacenter scenarios (a combined example is shown below).
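
For example, a closed-division datacenter test run combines these flags; this is the same command as above with only the category and division changed:

```
cmr "generate-run-cmds inference _find-performance _all-scenarios" \
--model=gptj-99 --implementation=nvidia-original --device=cuda --backend=tensorrt \
--category=datacenter --division=closed --quiet
```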

### Do full accuracy and performance runs for all the scenarios

```
cmr "generate-run-cmds inference _submission _all-scenarios" --model=gptj-99 \
--device=cuda --implementation=nvidia-original --backend=tensorrt \
--execution-mode=valid \
--category=edge --division=open --quiet --skip_submission_generation=yes
```

* Use `--power=yes` to measure power; it is ignored for accuracy and compliance runs.
* Use `--division=closed` to run all scenarios for the closed division. There are no compliance runs for gptj.
* `--offline_target_qps`, `--server_target_qps`, and `--singlestream_target_latency` can be used to override the automatically determined performance numbers (see the example after this list).
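
For example, the overrides can be appended to the submission command above; the numbers below are placeholders only, so use values measured on your own system:

```
cmr "generate-run-cmds inference _submission _all-scenarios" --model=gptj-99 \
--device=cuda --implementation=nvidia-original --backend=tensorrt \
--execution-mode=valid \
--category=edge --division=open --quiet --skip_submission_generation=yes \
--offline_target_qps=4 --server_target_qps=3.5 --singlestream_target_latency=1000
```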

### Populate the README files describing your submission

```
cmr "generate-run-cmds inference _populate-readme _all-scenarios" \
--model=gptj-99 --device=cuda --implementation=nvidia-original --backend=tensorrt \
--execution-mode=valid --results_dir=$HOME/results_dir \
--category=edge --division=open --quiet
```

### Generate and upload MLPerf submission

Follow [this guide](../Submission.md) to generate the submission tree and upload your results.

### Run individual scenarios for testing and optimization

TBD

### Questions? Suggestions?

Don't hesitate to get in touch via the [public Discord server](https://discord.gg/JjWNWXKxwT).

### Acknowledgments

* CM automation for Nvidia's MLPerf inference implementation was developed by Arjun Suresh and Grigori Fursin.
* Nvidia's MLPerf inference implementation was developed by Zhihan Jiang, Ethan Cheng, Yiheng Zhang and Jinho Suh.
