Release 0.4.1

jhen0409 released this 17 Nov 06:12

0.4.1 (2024-11-17)

Bug Fixes

  • cpp, ios: add a prefix to iq2/iq3 functions to avoid redefinition conflicts with other libraries that use ggml (8ce5216)
  • cpp: add missing messages check to validateModelChatTemplate (c1d15a3)
  • ts: remove file:// prefix for lora param (da3e9a7)
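
The file:// fix above concerns path normalization: a URI-style path has its scheme removed before it is used as a plain filesystem path. A minimal sketch of that idea follows. The helper name stripFilePrefix and the sample path are hypothetical, chosen for illustration only, and are not taken from the library's source.

```typescript
// Hypothetical helper (an assumption, not the library's actual code):
// drops a leading "file://" scheme and leaves any other path unchanged.
function stripFilePrefix(path: string): string {
  const prefix = 'file://';
  // Compare the leading characters against the prefix.
  return path.slice(0, prefix.length) === prefix
    ? path.slice(prefix.length)
    : path;
}

// Illustrative path only; a real adapter path would come from the app.
const loraPath: string = stripFilePrefix('file:///data/adapters/example.gguf');
console.log(loraPath); // prints "/data/adapters/example.gguf"
```

Paths that carry no scheme pass through untouched, so callers can apply the helper unconditionally.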

Features

  • add static method to read model info from gguf (#87) (8bf9dd8)
  • cpp: remove unused json and json-schema-to-grammar in common (#90) (0a37cda)
  • expose flash_attn / cache_type_k / cache_type_v (4ce8ff8)
  • no longer need to disable mmap for lora (38fa660)
  • sync llama.cpp (#89) (5e1b30a)
  • ts: expose template for getFormattedChat (c473995), closes #84
  • update embedding method (#88) (6190f57)