This PR contains the following updates:
1.38.0 -> 1.40.0
1.44.0 -> 1.47.0
0.8.0 -> 0.8.1
0.12.5 -> 0.12.7
0.10 -> 0.13
0.10 -> 0.13
1.0.208 -> 1.0.209
1.0.122 -> 1.0.127
1.39.2 -> 1.40.0
2.1.6 -> 2.2.0
Release Notes
robertknight/ocrs (ocrs)
v0.8.1
Compare Source
Added ability to customize the alphabet used by the recognition model
(https://github.com/robertknight/ocrs/pull/100). Thanks @Phaired.
Updated rten to v0.13.1. This enables running custom models in the V2
rten model format
seanmonstar/reqwest (reqwest)
v0.12.7
Compare Source
impl Service<http::Request<_>> for Client.
v0.12.6
Compare Source
danger_accept_invalid_hostnames for rustls.
impl Service<http::Request<Body>> for Client and &'_ Client.
!Sync bodies in Body::wrap_stream().
hickory-dns is used.
Proxy so that HTTP(S)_PROXY values take precedence over ALL_PROXY.
blocking::RequestBuilder::header() from unsetting sensitive on passed header values.
robertknight/rten (rten)
v0.13.1
Compare Source
rten
New features
Added speech detection example using Silero VAD
(https://github.com/robertknight/rten/pull/338)
Support int tensors in ArgMin and ArgMax ops
(https://github.com/robertknight/rten/pull/329)
Support "reflect" padding mode (https://github.com/robertknight/rten/pull/326)
Bug fixes
Fixed panic with certain combinations of input, kernel size and padding in
depthwise convolution (https://github.com/robertknight/rten/pull/336)
Fixed attempted out-of-bounds slice in depthwise convolution when input tensor
has a row stride that exceeds the row length
(https://github.com/robertknight/rten/pull/335)
Fixed conversion of auto_pad attribute for Conv operator
(https://github.com/robertknight/rten/pull/333)
Round timings to microseconds in verbose log
(https://github.com/robertknight/rten/pull/331)
Fixed panic when slicing empty tensors
(https://github.com/robertknight/rten/pull/325)
Fixed 1D convolution failing with non-contiguous inputs
(https://github.com/robertknight/rten/pull/324)
Fixed conversion of shape information for scalar tensors
(https://github.com/robertknight/rten/pull/323)
Fixed panic in softmax if the size of the normalized axis is zero
(https://github.com/robertknight/rten/pull/322)
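To illustrate the softmax fix above, here is a minimal sketch of a softmax that tolerates a zero-length axis. It assumes a single 1-D lane of f32 values; the function name and the early-return strategy are assumptions for illustration, since the release note does not say how rten handles this case internally.

```rust
/// Numerically stable softmax over one lane (a 1-D slice).
/// Hypothetical helper, not rten's implementation.
fn softmax(lane: &[f32]) -> Vec<f32> {
    // An empty lane has nothing to normalize. Returning early keeps the
    // max and sum reductions below from ever running on zero elements.
    if lane.is_empty() {
        return Vec::new();
    }
    // Subtract the maximum before exponentiating to avoid overflow.
    let max = lane.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = lane.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.into_iter().map(|e| e / sum).collect()
}
```

With this guard an empty input yields an empty output, and non-empty inputs still sum to 1.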
rten-cli
--mmap flag to load model using memory mapping instead of reading
whole file into a buffer (https://github.com/robertknight/rten/pull/330)
v0.13.0
Compare Source
This release adds the infrastructure to support subgraphs, which are used in
control flow operators like If, plus an implementation of the If operator
and a TrOCR example which uses it.
rten
Added Model::load_static_slice API which can be used to load models embedded
in the binary with include_bytes!. Thanks @hsfzxjy.
Added TrOCR example (https://github.com/robertknight/rten/pull/304)
Support If operator (https://github.com/robertknight/rten/pull/306)
Added full support for Einsum operator
(https://github.com/robertknight/rten/pull/297,
https://github.com/robertknight/rten/pull/299,
https://github.com/robertknight/rten/pull/300,
https://github.com/robertknight/rten/pull/302,
https://github.com/robertknight/rten/pull/303)
rten-cli
Added --quiet flag (https://github.com/robertknight/rten/pull/313)
Inputs named use_cache_branch now get a default value of 0 (ddf4109)
rten-generate
Support models with cross-attention KV caches that are computed on the first
run of the decoder (https://github.com/robertknight/rten/pull/318). This
is used by Hugging Face models for encoder-decoder systems.
Support models without a KV cache (https://github.com/robertknight/rten/pull/305)
rten-tensor
Tensor::remove_axis (b823d46)
Tensor::from_storage_and_layout (54d2941)
rten-text
vocabulary which are never generated by merges and are not added special
tokens (18e9b2a)
v0.12.0
Compare Source
rten
Breaking changes
The rten-convert tool now generates models in the V2 format by default
(https://github.com/robertknight/rten/pull/272).
These models can only be loaded by RTen version 0.11.0 or later. The V1
format can be generated by specifying the --v1 flag. The rten crate can
load both V1 and V2 format models.
See the .rten file format documentation for more details.
The reduce_{max, min, sum} tensor methods have moved from the FloatOperators
trait to the Operators trait (https://github.com/robertknight/rten/pull/274).
Examples and documentation
Added Segment Anything example (https://github.com/robertknight/rten/pull/295).
This supports the original SAM models plus several derivatives with
lighter-weight image encoders.
Added chatbot example using Qwen2 (https://github.com/robertknight/rten/pull/282).
This also works with SmolLM.
Model::load_mmap docs now have a better explanation of the memory and
performance impact (ce0b717)
New features
Einsum operator (https://github.com/robertknight/rten/pull/295).
Performance improvements
Avoid allocations in most cases when broadcasting tensor shapes (c4b5f26).
Strides of size-1 dimensions are ignored when determining whether a tensor is
contiguous (https://github.com/robertknight/rten/pull/292). This allows more
operations to use fast paths for contiguous tensors.
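As an illustration of the contiguity rule above, the following is a minimal sketch of a check that skips size-1 dimensions. It assumes a row-major layout with strides measured in elements; is_contiguous is a hypothetical helper and not rten's implementation.

```rust
/// Returns true if `strides` describe a row-major contiguous layout for
/// `shape`. Strides of size-1 dimensions are ignored, because the index
/// along such a dimension is always 0 and its stride is never used.
/// Hypothetical helper for illustration only.
fn is_contiguous(shape: &[usize], strides: &[usize]) -> bool {
    assert_eq!(shape.len(), strides.len(), "shape and strides must have the same rank");
    // A tensor with no elements is trivially contiguous.
    if shape.contains(&0) {
        return true;
    }
    // Walk from the innermost dimension outwards, tracking the stride a
    // contiguous layout would have at each step.
    let mut expected = 1;
    for (&size, &stride) in shape.iter().zip(strides).rev() {
        if size == 1 {
            continue; // stride is irrelevant for a size-1 dimension
        }
        if stride != expected {
            return false;
        }
        expected *= size;
    }
    true
}
```

For example, a shape of [2, 1, 3] with strides [3, 100, 1] counts as contiguous here, because the unusual stride belongs to the size-1 dimension.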
Optimized LayerNormalization and ReduceMean
(https://github.com/robertknight/rten/pull/291)
Added fast-path for Resize operator when input scale is 1
(https://github.com/robertknight/rten/pull/290)
Return input buffer to pool in Cast operator if input needs to be copied
(https://github.com/robertknight/rten/pull/289).
Implemented LayerNormalization fusion
(https://github.com/robertknight/rten/pull/280)
Implemented GELU fusion (https://github.com/robertknight/rten/pull/277)
rten-cli
*_ids now use zero as the auto-generated input value (78cd621)
rten-generate
TopKSampler now supports specifying a temperature (65b837b)
Added Generator::append_prompt to append to prompt after initial generation.
This is useful for chat-like applications (5ef3cb2)
Fixed an issue where attention_mask input had the wrong size (cae6134)
rten-tensor
Breaking changes
tensor and ndtensor macros have been deprecated in favor of Tensor::from
and NdTensor::from (https://github.com/robertknight/rten/pull/286).
Other changes
Tensor::from now supports creating tensors from scalar values (d2ca876)
Tensor::lanes iterator performance was improved by making them exact-sized
and fused (9e31556)
rten-text
Token IDs are now represented as u32 rather than usize, for consistency
with rten-generate (https://github.com/robertknight/rten/pull/288).
The vocab mapping in tokenizer.json files is now used to determine token
IDs when decoding (https://github.com/robertknight/rten/pull/287).
v0.11.1
Compare Source
rten
Instant::now (https://github.com/robertknight/rten/pull/283).
v0.11.0
Compare Source
rten
Breaking changes
The inputs argument to Model::run now accepts a Vec<(NodeId, InputOrOutput)>
instead of &[(NodeId, Input)], where InputOrOutput is an enum that is either
an owned Tensor or a TensorView. This enables passing ownership of an input
to Model::run, which in turn enables efficient in-place updates to
cache-like inputs.
The InputOrOutput type implements From for tensors and tensor views, so
existing inputs can be converted with .into() when building the Vec.
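The ownership idea behind this change can be sketched without rten. The following is a minimal, standard-library-only illustration: Tensor, TensorView, the enum variants and the run function are hypothetical stand-ins chosen for this example, not rten's actual definitions or signatures.

```rust
/// Hypothetical stand-in for an owned tensor.
struct Tensor(Vec<f32>);

/// Hypothetical stand-in for a borrowed tensor view.
struct TensorView<'a>(&'a [f32]);

/// Either an owned tensor or a borrowed view (variant names are assumed).
enum InputOrOutput<'a> {
    Owned(Tensor),
    Borrowed(TensorView<'a>),
}

impl<'a> From<Tensor> for InputOrOutput<'a> {
    fn from(t: Tensor) -> Self {
        InputOrOutput::Owned(t)
    }
}

impl<'a> From<TensorView<'a>> for InputOrOutput<'a> {
    fn from(v: TensorView<'a>) -> Self {
        InputOrOutput::Borrowed(v)
    }
}

/// Toy "model" that adds 1.0 to every element of every input.
/// Owned inputs are updated in place and their buffer is reused;
/// borrowed inputs have to be copied into a new buffer.
fn run(inputs: Vec<(usize, InputOrOutput<'_>)>) -> Vec<Tensor> {
    inputs
        .into_iter()
        .map(|(_id, input)| match input {
            InputOrOutput::Owned(mut t) => {
                for x in t.0.iter_mut() {
                    *x += 1.0;
                }
                t
            }
            InputOrOutput::Borrowed(v) => Tensor(v.0.iter().map(|x| x + 1.0).collect()),
        })
        .collect()
}
```

Taking a Vec by value is what lets the callee consume owned inputs. A slice of borrowed inputs would force a copy before any in-place update.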
New features
Add a new version of the .rten file format which supports models over 2GB
in size. The rten-convert tool still generates V1 models by default but
will generate the V2 format if the --v2 flag is provided
(https://github.com/robertknight/rten/pull/260).
Support Gelu operator (https://github.com/robertknight/rten/pull/248)
Bug fixes
Prevent Model::partial_run from propagating values through randomized
operators (https://github.com/robertknight/rten/pull/240).
Improved accuracy of timing metrics and eliminated unaccounted for
("[Other]") time (https://github.com/robertknight/rten/pull/254).
Performance improvements
This release adds a new graph optimization step as part of loading models. This
performs fusions and other optimizations to speed up inference. These
optimizations are enabled by default, but can be disabled via options in
ModelOptions.
Improved parallelism in the Softmax operator
(https://github.com/robertknight/rten/pull/258)
Made Tensor::inner_iter faster (https://github.com/robertknight/rten/pull/259)
Made Gather, Concat and Unsqueeze operators faster for small inputs.
These operations are common in subgraphs that operate on tensor shapes
(https://github.com/robertknight/rten/pull/255,
https://github.com/robertknight/rten/pull/256,
https://github.com/robertknight/rten/pull/257).
Optimized vector-matrix multiplication (https://github.com/robertknight/rten/pull/250,
https://github.com/robertknight/rten/pull/253). This benefits transformer
decoder inference when the batch size is 1.
Fuse Mul(X, Sigmoid(X)) subgraphs into a Silu operation. This speeds up
YOLOv8 by 8% (https://github.com/robertknight/rten/pull/246).
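To show what the Silu fusion above amounts to, here is a minimal sketch comparing the two-operator form with a fused single pass. The function names are assumptions for illustration; this is not rten's kernel, and it says nothing about where the quoted 8% comes from.

```rust
fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

/// Unfused form: Sigmoid writes a temporary buffer, then Mul reads the
/// input and the temporary to produce the output.
fn mul_sigmoid_unfused(x: &[f32]) -> Vec<f32> {
    let sig: Vec<f32> = x.iter().map(|&v| sigmoid(v)).collect();
    x.iter().zip(&sig).map(|(&a, &s)| a * s).collect()
}

/// Fused form: one pass over the input and no temporary buffer.
fn silu(x: &[f32]) -> Vec<f32> {
    x.iter().map(|&v| v * sigmoid(v)).collect()
}
```

Both functions compute x * sigmoid(x) per element. The fused version saves one allocation and one extra pass over memory.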
Further reduce small allocations during graph execution
(https://github.com/robertknight/rten/pull/243,
https://github.com/robertknight/rten/pull/245).
Fuse MatMul(Transpose(X), Y) subgraphs to avoid materializing the transposed
matrix (https://github.com/robertknight/rten/pull/242).
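The idea of skipping the materialized transpose can be sketched as below: a naive product that reads X through transposed indices instead of building transpose(X) first. It assumes row-major f32 matrices; matmul_transposed_lhs and its parameters are hypothetical and unrelated to rten's optimized kernels.

```rust
/// Computes transpose(X) * Y without building the transposed matrix.
///
/// `x` is a row-major k-by-m matrix and `y` is a row-major k-by-n matrix.
/// The result is a row-major m-by-n matrix.
/// Hypothetical helper for illustration only.
fn matmul_transposed_lhs(x: &[f32], y: &[f32], k: usize, m: usize, n: usize) -> Vec<f32> {
    assert_eq!(x.len(), k * m, "x must have k * m elements");
    assert_eq!(y.len(), k * n, "y must have k * n elements");
    let mut out = vec![0.0; m * n];
    for p in 0..k {
        for i in 0..m {
            // Element (i, p) of transpose(X) is element (p, i) of X.
            let xv = x[p * m + i];
            for j in 0..n {
                out[i * n + j] += xv * y[p * n + j];
            }
        }
    }
    out
}
```

Only the index arithmetic changes relative to a plain matrix product, so no extra buffer is needed for the transpose.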
Perform constant propagation when loading models
(https://github.com/robertknight/rten/pull/241).
Enabled Concat operator to run in-place if the caller has specifically
reserved space in the first input's buffer
(https://github.com/robertknight/rten/pull/239).
Cache the last-used execution plan. This avoids recomputing the sequence of
execution steps when a model is run in a loop
(https://github.com/robertknight/rten/pull/234).
Improved performance of unary operators for non-contiguous inputs
(https://github.com/robertknight/rten/pull/223)
Optimized Where operator for non-contiguous inputs
(https://github.com/robertknight/rten/pull/213)
Optimized variadic operators (https://github.com/robertknight/rten/pull/212)
Optimized Pow operator (https://github.com/robertknight/rten/pull/219)
rten-examples
rten-generate
This is a new crate which provides a convenient Iterator-based interface for
running auto-regressive decoder models. See the gpt2 and distilvit examples
in the rten-examples crate for code samples.
rten-tensor
NdTensor::from (https://github.com/robertknight/rten/pull/226).
rten-text
serde-rs/serde (serde)
v1.0.209
Compare Source
serde-rs/json (serde_json)
v1.0.127
Compare Source
v1.0.126
Compare Source
v1.0.125
Compare Source
v1.0.124
Compare Source
v1.0.123
Compare Source
tokio-rs/tokio (tokio)
v1.40.0: Tokio v1.40.0
Compare Source
1.40.0 (August 30th, 2024)
Added
util::SimplexStream (#6589)
Command::process_group (#6731)
{TrySendError,SendTimeoutError}::into_inner (#6755)
JoinSet::join_all (#6784)
Added (unstable)
Builder::{on_task_spawn, on_task_terminate} (#6742)
Changed
write_all_buf when possible (#6724)
UnwindSafe (#6783)
Sleep and BatchSemaphore instrumentation explicit roots (#6727)
NonZeroU64 for task::Id (#6733)
JoinError (#6753)
#[must_use] to JoinHandle::abort_handle (#6762)
Documented
[build] section doesn't go in Cargo.toml (#6728)
select! (#6774)
v1.39.3: Tokio v1.39.3
Compare Source
1.39.3 (August 17th, 2024)
This release fixes a regression where the unix socket api stopped accepting the abstract socket namespace. (#6772)
zip-rs/zip2 (zip)
v2.2.0
Compare Source
🚀 Features
ZipArchive::central_directory_start (#232)
Configuration
📅 Schedule: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).
🚦 Automerge: Disabled by config. Please merge this manually once you are satisfied.
♻ Rebasing: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
👻 Immortal: This PR will be recreated if closed unmerged. Get config help if that's undesired.
This PR was generated by Mend Renovate. View the repository job log.