-
ModelCloud.ai
- Earth/Epoch 2.0
- https://modelcloud.ai
- @qubitium
Pinned Loading
-
ModelCloud/GPTQModel
ModelCloud/GPTQModel PublicProduction ready LLM model compression/quantization toolkit with accelerated inference support for both cpu/gpu via HF, vLLM, and SGLang.
-
ModelCloud/Device-SMI
ModelCloud/Device-SMI PublicSelf-contained Python lib with zero-dependencies that give you a unified device properties for gpu, cpu, and npu. No more calling separate tools such as nvidia-smi or /proc/cpuinfo and parsing it y…
Python 9
-
sgl-project/sglang
sgl-project/sglang PublicSGLang is a fast serving framework for large language models and vision language models.
-
vllm-project/vllm
vllm-project/vllm PublicA high-throughput and memory-efficient inference and serving engine for LLMs
-
flashinfer-ai/flashinfer
flashinfer-ai/flashinfer PublicFlashInfer: Kernel Library for LLM Serving
-
Dao-AILab/flash-attention
Dao-AILab/flash-attention PublicFast and memory-efficient exact attention
If the problem persists, check the GitHub status page or contact support.