gh-pages fix

argonne-lcf · Sep 9, 2024 · 41c3064
1 parent 47d0f04
commit 41c3064
Showing 58 changed files with 8 additions and 3,277 deletions.
diff --git a/.gitignore b/.gitignore
diff --git a/Deepspeed-MII/README.md b/Deepspeed-MII/README.md
diff --git a/InferenceGraphPlotter/README.md b/InferenceGraphPlotter/README.md
diff --git a/README.md b/README.md
@@ -1,29 +1,8 @@
-# LLaMA-Inference-Bench
-
-LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators
-
-## Metrix of Evaluated Frameworks and Hardwares :
-
-| Framework/ Hardware | NVIDIA A100 | NVIDIA H100 | NVIDIA GH200 | AMD MI250 | Intel PVC | Habana Gaudi2 | Sambanova SN40L |
-|:-----------------------:|:---------------:|:---------------:|:------------:|:---------:|:---------:|:-------------:|:---------------:|
-|         [vLLM](./vLLM/README.md)        |     [Link]()    |     [Link]()    |      Yes     |    [Link]()   |    [Link]()   |       No      |       N/A       |
-|      [llama.cpp](./llama.cpp/README.md)      |     [Link]()    |     [Link]()    |      Yes     |    [Link]()   |    [Link]()   |      N/A      |       N/A       |
-|     [TensorRT-LLM](./TensorRT-LLM/README.md)    |     [Link]()    |     [Link]()    |     [Link]()     |    N/A    |    N/A    |      N/A      |       N/A       |
-|      [DeepSpeed-MII](./Deepspeed-MII/README.md)      |      No     |      No     |      No      |     No    |     No    |      [Link]()     |       N/A       |
-
-## Key Insights 
-
-
- Cite this work:
- ```
- @INPROCEEDINGS{####,
-  author={Krishna Teja Chitty-Venkata and Siddhisanket Raskar and Bharat Kale and Farah Ferdaus and Aditya Tanikanti and Ken Raffenetti and Valerie Taylor and Murali Emani and Venkatram Vishwanath},
-  booktitle={2024 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS)}, 
-  title={LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators}, 
-  year={2024},
-  volume={},
-  number={},
-  pages={},
-  keywords={Large Language Models, AI Accelerators, Performance Evaluation, Benchmarking },
-  doi={}}
- ```
+# InferenceGraphPlotter
+
+## How to run?
+1. Clone the repo and cd into the repo
+2. Spin up a simple webserver to serve the files. One way is by using python.
+    - for python 2: python -m SimpleHTTPServer
+    - for python 3: python -m http.server
+3. Open a webbrowser and go to http://localhost:8000
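Note on the "How to run?" steps added in this hunk: `python -m http.server` serves the current working directory on port 8000 by default, which is why step 3 points at http://localhost:8000. A rough programmatic equivalent for Python 3.7+ is sketched below. The helper name `make_server` and its parameters are hypothetical and not part of this repository; it only illustrates what the one-liner does.

```python
import functools
import http.server


def make_server(directory=".", port=8000):
    """Build an HTTP server that serves `directory` on 127.0.0.1:`port`.

    Hypothetical helper for illustration; the README itself only uses
    `python -m http.server`.
    """
    # Bind the request handler to the directory that should be served
    # (the `directory` argument requires Python 3.7 or newer).
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Passing port=0 lets the OS pick a free port, which is handy in tests.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


# Usage (blocks until interrupted with Ctrl+C):
#     make_server(".", 8000).serve_forever()
```

Unlike the command-line form, this sketch binds to 127.0.0.1 only, so the files are not exposed to other machines on the network.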
diff --git a/TensorRT-LLM/A100/README.MD b/TensorRT-LLM/A100/README.MD
diff --git a/TensorRT-LLM/GH200/README.MD b/TensorRT-LLM/GH200/README.MD
diff --git a/TensorRT-LLM/H100/README.MD b/TensorRT-LLM/H100/README.MD
diff --git a/TensorRT-LLM/H100/p-llama2-7b.sh b/TensorRT-LLM/H100/p-llama2-7b.sh
diff --git a/TensorRT-LLM/H100/q-llama2-7b.sh b/TensorRT-LLM/H100/q-llama2-7b.sh