This repo is for benchmarking hardware and software.
For results, see: https://docs.google.com/spreadsheets/d/1kT4or6b0Fedd-W_jMwYpb63e1ZR3aePczz3zlbJW-Y4/edit#gid=1652827441
Some of my benchmarking writeups:
- 2025-01-01: Revisiting llama.cpp speculative decoding w/ Qwen2.5-Coder 32B (AMD vs Nvidia results)
- 2024-12-17: Relative performance in llama.cpp when adjusting power limits for an RTX 3090 (w/ scripts)
- 2024-11-02: llama.cpp Compute and Memory Bandwidth Efficiency w/ Different Devices/Backends
- 2024-11-02: Testing llama.cpp with Intel's Xe2 iGPU (Core Ultra 7 258V w/ Arc Graphics 140V)
- 2024-10-24: Tuning for Efficient Inferencing with vLLM on MI300X
- 2024-06-19: Trainer performance comparison: torchtune vs. axolotl vs. Unsloth
- 2024-01-09: AMD Radeon 7900 XT/XTX Inference Performance Comparisons