Releases: daulet/tokenizers
v1.20.2
What's Changed
- feat: better error message when tokenizers lib mismatch by @daulet in #28
- feat: FromPretrained to load tokenizer directly from HF by @berkayersoyy in #27
New Contributors
- @berkayersoyy made their first contribution in #27
Full Changelog: v0.9.0...v1.20.2
v0.9.0
What's Changed
- feat: add option to retrieve offsets from tokenizer by @riccardopinosio in #21
- Update to huggingface/tokenizers v0.20.0 by @daulet in #23
New Contributors
- @riccardopinosio made their first contribution in #21
Full Changelog: v0.8.0...v0.9.0
v0.8.0
Breaking change:
The path to the compiled Rust library now needs to be specified via -ldflags. I found it most convenient to use the CGO_LDFLAGS env variable to avoid setting it on every invocation. See #18 for more details.
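As a rough sketch of the two approaches mentioned above, assuming the compiled library lives in a directory of your choosing (the path below is a placeholder, not a real location), the setup might look like this:

```shell
# One-off alternative: pass the linker search path on each invocation.
# (Placeholder path; point it at the directory holding the compiled library.)
#   go build -ldflags="-extldflags '-L/path/to/libtokenizers/directory'" ./...

# Or set it once per shell session so every cgo build picks it up.
export CGO_LDFLAGS="-L/path/to/libtokenizers/directory"

# Subsequent commands such as `go build ./...` or `go test ./...`
# then link against the library without extra flags.
```

Exporting the variable from a shell profile or a project-level env file avoids repeating the flag, at the cost of making the build depend on the environment.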
Full Changelog: v0.7.1...v0.8.0
v0.7.1
- Update the core tokenizers library to the latest version, v0.15.2
- Expose an init-time parameter that controls whether special tokens are encoded
Full Changelog: v0.7.0...v0.7.1
v0.7.0
What's Changed
- support more attributes from the Encoding structure by @clems4ever in #5
Full Changelog: v0.6.1...v0.7.0