    Repositories list

    • General Information, model certifications, and benchmarks for nm-vllm enterprise distributions
      1810 Updated Dec 26, 2024
    • Fast and memory-efficient exact attention
      C++
      BSD 3-Clause "New" or "Revised" License
      1.4k000 Updated Dec 24, 2024
    • vllm

      Public
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Apache License 2.0
      5k7020 Updated Dec 24, 2024
    • A safetensors extension to efficiently store sparse quantized tensors on disk
      Python
      Apache License 2.0
      457210 Updated Dec 23, 2024
    • yolov5

      Public
      YOLOv5 in PyTorch > ONNX > CoreML > TFLite
      Python
      GNU General Public License v3.0
      17k2002 Updated Dec 23, 2024
    • 🤗 Transformers: State-of-the-art Machine Learning for Pytorch, TensorFlow, and JAX.
      Python
      Apache License 2.0
      27k101 Updated Dec 23, 2024
    • axolotl

      Public
      Go ahead and axolotl questions
      Python
      Apache License 2.0
      898001 Updated Dec 20, 2024
    • Neural Magic GHA
      Python
      Apache License 2.0
      0002 Updated Dec 18, 2024
    • guidellm

      Public
      Evaluate and Enhance Your LLM Deployments for Real-World Inference Needs
      Python
      Apache License 2.0
      14178109 Updated Dec 11, 2024
    • Benchmarking code for running quantized kernels from vLLM and other libraries
      Python
      0410 Updated Dec 3, 2024
    • A framework for few-shot evaluation of language models.
      Python
      MIT License
      2k301 Updated Nov 27, 2024
    • docs

      Public
      Top-level directory for documentation and general content
      MDX
      712004 Updated Nov 25, 2024
    • Fast and memory-efficient exact attention
      C++
      BSD 3-Clause "New" or "Revised" License
      1.4k000 Updated Nov 23, 2024
    • Python
      4000 Updated Nov 21, 2024
    • evalplus

      Public
      NeuralMagic fork of EvalPlus (Rigorous evaluation of LLM-synthesized code - NeurIPS 2023)
      Python
      Apache License 2.0
      114000 Updated Nov 21, 2024
    • graphs

      Public
      Apache License 2.0
      0000 Updated Nov 15, 2024
    • Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM
      Python
      Apache License 2.0
      66000 Updated Nov 14, 2024
    • An automatic evaluator for instruction-following language models. Human-validated, high-quality, cheap, and fast.
      Jupyter Notebook
      Apache License 2.0
      246000 Updated Nov 12, 2024
    • LLM training code for MosaicML foundation models
      Python
      Apache License 2.0
      534000 Updated Oct 24, 2024
    • nm-vllm

      Public archive
      A high-throughput and memory-efficient inference and serving engine for LLMs
      Python
      Other
      5k25400 Updated Oct 11, 2024
    • mteb

      Public
      MTEB: Massive Text Embedding Benchmark
      Jupyter Notebook
      Apache License 2.0
      289001 Updated Oct 2, 2024
    • 🤗Transformers: State-of-the-art Natural Language Processing for Pytorch and TensorFlow 2.0.
      Python
      Apache License 2.0
      27k9013 Updated Oct 1, 2024
    • AutoFP8

      Public
      Python
      Apache License 2.0
      23165103 Updated Oct 1, 2024
    • OmniQuant

      Public
      [ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
      Python
      MIT License
      56001 Updated Sep 27, 2024
    • An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
      Python
      MIT License
      491000 Updated Sep 16, 2024
    • Supercharge Your Model Training
      Python
      Apache License 2.0
      427000 Updated Aug 27, 2024
    • MixEval

      Public
      NM fork of MixEval compatible with SparseAutoModel.
      Python
      37001 Updated Aug 20, 2024
    • mamba

      Public
      Mamba SSM architecture
      Python
      Apache License 2.0
      1.2k000 Updated Aug 12, 2024
    • Causal depthwise conv1d in CUDA, with a PyTorch interface
      Cuda
      BSD 3-Clause "New" or "Revised" License
      69000 Updated Aug 8, 2024
    • sparseml

      Public
      Libraries for applying sparsification recipes to neural networks with a few lines of code, enabling faster and smaller models
      Python
      Apache License 2.0
      1482.1k760 Updated Aug 1, 2024