Commit

gh-pages fix
sraskar committed Sep 9, 2024
1 parent 47d0f04 commit 41c3064
Showing 58 changed files with 8 additions and 3,277 deletions.
162 changes: 0 additions & 162 deletions .gitignore

This file was deleted.

2 changes: 0 additions & 2 deletions Deepspeed-MII/README.md

This file was deleted.

8 changes: 0 additions & 8 deletions InferenceGraphPlotter/README.md

This file was deleted.

37 changes: 8 additions & 29 deletions README.md
@@ -1,29 +1,8 @@
-# LLaMA-Inference-Bench
-
-LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators
-
-## Metrix of Evaluated Frameworks and Hardwares :
-
-| Framework/ Hardware | NVIDIA A100 | NVIDIA H100 | NVIDIA GH200 | AMD MI250 | Intel PVC | Habana Gaudi2 | Sambanova SN40L |
-|:-----------------------:|:---------------:|:---------------:|:------------:|:---------:|:---------:|:-------------:|:---------------:|
-| [vLLM](./vLLM/README.md) | [Link]() | [Link]() | Yes | [Link]() | [Link]() | No | N/A |
-| [llama.cpp](./llama.cpp/README.md) | [Link]() | [Link]() | Yes | [Link]() | [Link]() | N/A | N/A |
-| [TensorRT-LLM](./TensorRT-LLM/README.md) | [Link]() | [Link]() | [Link]() | N/A | N/A | N/A | N/A |
-| [DeepSpeed-MII](./Deepspeed-MII/README.md) | No | No | No | No | No | [Link]() | N/A |
-
-## Key Insights
-
-
-Cite this work:
-```
-@INPROCEEDINGS{####,
-author={Krishna Teja Chitty-Venkata and Siddhisanket Raskar and Bharat Kale and Farah Ferdaus and Aditya Tanikanti and Ken Raffenetti and Valerie Taylor and Murali Emani and Venkatram Vishwanath},
-booktitle={2024 IEEE/ACM International Workshop on Performance Modeling, Benchmarking and Simulation of High Performance Computer Systems (PMBS)},
-title={LLM-Inference-Bench: Inference Benchmarking of Large Language Models on AI Accelerators},
-year={2024},
-volume={},
-number={},
-pages={},
-keywords={Large Language Models, AI Accelerators, Performance Evaluation, Benchmarking },
-doi={}}
-```
+# InferenceGraphPlotter
+
+## How to run?
+1. Clone the repo and cd into the repo
+2. Spin up a simple webserver to serve the files. One way is by using python.
+- for python 2: python -m SimpleHTTPServer
+- for python 3: python -m http.server
+3. Open a webbrowser and go to http://localhost:8000
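
For reference, step 2 of the new README can also be done from a short script instead of the `python -m http.server` one-liner. The following is a minimal sketch for Python 3.7+, not something taken from the repository: `make_server` is a hypothetical helper name, the served directory is assumed to be the repository root, and the server is bound to the loopback address only (the command-line module binds to all interfaces by default).

```python
import functools
import http.server


def make_server(directory=".", port=8000):
    """Build a static-file HTTP server for `directory` (hypothetical helper).

    Port 8000 matches the URL given in the README; passing port=0 lets the
    operating system pick a free port.
    """
    # SimpleHTTPRequestHandler accepts a `directory` keyword since Python 3.7.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    # Assumption: serve on loopback only, which is enough for local viewing.
    return http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)


if __name__ == "__main__":
    with make_server() as httpd:
        host, port = httpd.server_address[:2]
        print(f"Serving on http://{host}:{port} (Ctrl+C to stop)")
        httpd.serve_forever()
```

Run from the repository root, this serves the same files as the README commands, so opening http://localhost:8000 in a browser should behave the same way.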
Empty file removed TensorRT-LLM/A100/README.MD
Empty file.
Empty file removed TensorRT-LLM/GH200/README.MD
Empty file.
56 changes: 0 additions & 56 deletions TensorRT-LLM/H100/README.MD

This file was deleted.

32 changes: 0 additions & 32 deletions TensorRT-LLM/H100/p-llama2-7b.sh

This file was deleted.

37 changes: 0 additions & 37 deletions TensorRT-LLM/H100/q-llama2-7b.sh

This file was deleted.
