# rocm

Here are 139 public repositories matching this topic...

vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

amd cuda inference pytorch transformer llama gpt rocm model-serving tpu hpu mlops xpu llm inferentia llmops llm-serving trainium

Updated Nov 22, 2024
Python

apache / tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators

javascript machine-learning performance deep-learning metal compiler gpu vulkan opencl tensor spirv rocm tvm

Updated Nov 22, 2024
Python

cupy / cupy

NumPy & SciPy for GPU

python gpu numpy cuda cublas scipy tensor cudnn rocm cupy cusolver nccl curand cusparse nvrtc cutensor nvtx cusparselt

Updated Nov 22, 2024
Python

lshqqytiger / stable-diffusion-webui-amdgpu

Stable Diffusion web UI

web ai deep-learning amd torch image-generation hip amdgpu rocm radeon text2image image2image img2img ai-art directml txt2img stable-diffusion

Updated Nov 14, 2024
Python

dmlc / nnvm

deep-learning deployment metal optimization opencl cuda computation-graph rocm nnvm tvm

Updated Sep 11, 2018
C++

deepmodeling / deepmd-kit

A deep learning package for many-body potential energy representation and molecular dynamics

nodejs python c deep-learning cpp tensorflow cuda molecular-dynamics pytorch computational-chemistry lammps materials-science ipi rocm ase jax potential-energy deepmd

Updated Nov 22, 2024
C++

stotko / stdgpu

stdgpu: Efficient STL-like Data Structures on the GPU

cpp gpu modern-cpp openmp cuda stl data-structures gpgpu gpu-acceleration cpp17 stl-containers hip gpu-computing rocm cpp20 stl-like

Updated Nov 20, 2024
C++

PygmalionAI / aphrodite-engine

Large-scale LLM inference engine

machine-learning cuda intel api-rest lora rocm inference-engine tpu inferentia speculative-decoding

Updated Nov 22, 2024
Python

ROCm / ROCm-docker

Dockerfiles for the various software layers defined in the ROCm software platform

Updated Aug 21, 2024
Shell

alpaka-group / alpaka

Abstraction Library for Parallel Kernel Acceleration 🦙

cpp hpc gpu openmp cuda header-only cpp17 hip heterogeneous-parallel-programming tbb openacc rocm

Updated Nov 22, 2024
C++

ROCm / rocBLAS

Next generation BLAS implementation for ROCm platform

Updated Nov 21, 2024
C++

agenium-scale / nsimd

Agenium Scale vectorization library for CPUs and GPUs

hpc neon cuda avx simd avx2 sse2 simd-programming aarch64 avx512 simd-instructions simd-library sse42 rocm cpp20 sve neon128 cpp20-library vectorization-library

Updated Oct 21, 2021
C

JuliaGPU / AMDGPU.jl

AMD GPU (ROCm) programming in Julia

julia amdgpu rocm

Updated Nov 21, 2024
Julia

ROCm / k8s-device-plugin

Kubernetes (k8s) device plugin to enable registration of AMD GPU to a container cluster

kubernetes k8s rocm kubernetes-device-plugins

Updated Nov 21, 2024
Go

ROCm / pytorch

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Updated Nov 22, 2024
Python

LLNL / hiop

HPC solver for nonlinear optimization problems

Updated Nov 22, 2024
C++

ROCm / aomp

AOMP is an open source Clang/LLVM based compiler with added support for the OpenMP® API on Radeon™ GPUs. Use this repository for releases, issues, documentation, packaging, and examples.

amd llvm openmp clang fortran-compiler rocm

Updated Nov 21, 2024
Fortran

eth-cscs / COSMA

Distributed Communication-Optimal Matrix-Matrix Multiplication Algorithm

linear-algebra mpi cuda scalapack matrix-multiplication gpu-acceleration rocm matmul communication-optimal pdgemm

Updated Nov 6, 2024
C++

ROCm / MIVisionX

MIVisionX toolkit is a set of comprehensive computer vision and machine intelligence libraries, utilities, and applications bundled into a single toolkit. AMD MIVisionX also delivers a highly optimized open-source implementation of the Khronos OpenVX™ and OpenVX™ Extensions.

Updated Nov 21, 2024
C++

supranational / sppark

Zero-knowledge template library

cuda rocm zero-knowledge zk-snarks ntt zk-starks zero-knowledge-proofs bls12-381 bls12-377 pasta-curves

Updated Nov 5, 2024
Cuda
