Actions: EricLBuehler/mistral.rs

Analysis

1,629 workflow runs

Support fp8 on Metal
Analysis #1629: Pull request #930 opened by EricLBuehler
November 24, 2024 10:22 1m 14s
VLlama vision model ISQ support
Analysis #1628: Pull request #928 synchronize by EricLBuehler
November 23, 2024 11:00 1m 2s
VLlama vision model ISQ support
Analysis #1627: Pull request #928 opened by EricLBuehler
November 23, 2024 03:38 1m 7s
Default to SDPA for faster VLlama PP T/s
Analysis #1626: Pull request #927 synchronize by EricLBuehler
November 23, 2024 03:33 59s
Default to SDPA for faster VLlama PP T/s
Analysis #1625: Pull request #927 opened by EricLBuehler
November 22, 2024 23:41 59s
Paged Attention alibi support
Analysis #1624: Pull request #926 opened by EricLBuehler
November 22, 2024 13:43 1m 2s
Faster CUDA prompt speeds
Analysis #1623: Pull request #925 opened by EricLBuehler
November 21, 2024 20:10 17m 33s
Faster CUDA prompt speeds
Analysis #1622: Pull request #924 opened by EricLBuehler
November 21, 2024 20:09 12m 36s
Expand attnmask on cuda
Analysis #1621: Pull request #923 opened by EricLBuehler
November 21, 2024 18:45 1m 12s
add serde serialization to text chat types
Analysis #1620: Pull request #921 synchronize by rozgo
November 20, 2024 23:30 59s
add serde serialization to text chat types
Analysis #1619: Pull request #921 opened by rozgo
November 20, 2024 20:35 1m 1s
Dont always compile with fp8, bf16 for cuda
Analysis #1618: Pull request #920 opened by EricLBuehler
November 20, 2024 19:36 1m 11s
Experiment with vendoring the Candle fork
Analysis #1617: Pull request #919 opened by EricLBuehler
November 20, 2024 00:24 1m 0s
fix(docker): workaround for missing nvidia-smi, bump cuda to current
Analysis #1616: Pull request #913 synchronize by pull bot
November 19, 2024 00:45 1m 16s
Fixes for kv cache grow
Analysis #1615: Pull request #917 opened by EricLBuehler
November 18, 2024 22:12 1m 2s
Preallocated KV cache
Analysis #1614: Pull request #916 synchronize by EricLBuehler
November 18, 2024 18:49 1m 8s
Preallocated KV cache
Analysis #1613: Pull request #916 synchronize by EricLBuehler
November 18, 2024 18:31 1m 6s
Preallocated KV cache
Analysis #1612: Pull request #916 synchronize by EricLBuehler
November 18, 2024 14:45 1m 1s
Preallocated KV cache
Analysis #1611: Pull request #916 synchronize by EricLBuehler
November 18, 2024 14:44 1m 13s
Preallocated KV cache
Analysis #1610: Pull request #916 opened by EricLBuehler
November 18, 2024 14:09 1m 8s
Metal: Use mtl resource shared to avoid one copy
Analysis #1609: Pull request #914 opened by EricLBuehler
November 17, 2024 01:59 1m 7s
fix(docker): workaround for missing nvidia-smi, bump cuda to current
Analysis #1608: Pull request #913 opened by sammcj
November 16, 2024 22:04 1m 1s
Support --dtype in mistralrs bench
Analysis #1607: Pull request #911 opened by EricLBuehler
November 14, 2024 23:44 59s
Metal qmatmul mat-mat product (5.4x performance increase)
Analysis #1606: Pull request #909 synchronize by EricLBuehler
November 14, 2024 16:43 1m 13s
Attention-fused softmax for Metal
Analysis #1605: Pull request #908 synchronize by EricLBuehler
November 14, 2024 16:39 1m 1s