Goal: speed things up for StaticArrays without improving StaticArrays
Ideas:
- replace `reduce` + `map` with `mapreduce`
- replace `stack(t)` with `hcat(t...)` because `t` will always be a short `NTuple`
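A minimal sketch of the two rewrites, using only Base (Julia 1.9 or later for `stack`). The tuple contents and the function `f` are hypothetical stand-ins, not taken from the package. Note that `stack(t)` and `hcat(t...)` agree when the elements of `t` are vectors; for matrices, `stack` adds a trailing dimension while `hcat` concatenates columns, so the two only match after applying something like `vec`.

```julia
# Hypothetical inputs: a short tuple of plain vectors.
t = ntuple(i -> fill(Float64(i), 3), 4)
f(x) = 2 .* x  # hypothetical per-element function

# Idea 1: `reduce` + `map` versus a single `mapreduce` call.
two_step = reduce(hcat, map(f, t))
fused = mapreduce(f, hcat, t)
@assert fused == two_step

# Idea 2: for a tuple of vectors, `stack(t)` and `hcat(t...)` give the same matrix.
@assert stack(t) == hcat(t...)
@assert size(hcat(t...)) == (3, 4)

# For a tuple of matrices the results differ in shape.
tm = ntuple(i -> fill(Float64(i), 2, 2), 3)
@assert size(stack(tm)) == (2, 2, 3)
@assert size(hcat(tm...)) == (2, 6)

# They agree again once each matrix is flattened with `vec`.
@assert stack(vec, tm) == hcat(map(vec, tm)...)
```

Splatting `t...` is only reasonable here because `t` is a short tuple whose length is known to the compiler; the same pattern on a long `Vector` of arrays would be a poor choice.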
Related:
- `hessian`
- fix #561 where my first attempt failed
Benchmarks:
`stack(t)` is faster than `hcat(t...)` on `Array` but slower on `SArray`. The solution is to fix `stack` for `SArray`, at least in simple cases.
    using BenchmarkTools, DataFrames, StaticArrays

    badstack(t) = stack(t);
    goodstack(t) = hcat(t...);

    badstack(f::F, t) where {F} = stack(f, t);
    goodstack(f::F, t) where {F} = hcat(map(f, t)...);

    tv = ntuple(i -> rand(1000), 10);
    tm = ntuple(i -> rand(100, 100), 10);
    tsv = ntuple(i -> @SVector(ones(4)), 10);
    tsm = ntuple(i -> @SMatrix(ones(4, 4)), 10);

    data_nofunction = DataFrame()
    data_function = DataFrame()

    for t in [tv, tm, tsv, tsm]
        @info "Benchmarking $(typeof(t))"
        # without function
        bad = @benchmark badstack($t)
        good = @benchmark goodstack($t)
        push!(
            data_nofunction,
            (;
                input_type=typeof(t),
                bad_time=minimum(bad.times),
                good_time=minimum(good.times),
                bad_alloc=minimum(bad.allocs),
                good_alloc=minimum(good.allocs),
            ),
        )
        # with function
        bad = @benchmark badstack(vec, $t)
        good = @benchmark goodstack(vec, $t)
        push!(
            data_function,
            (;
                input_type=typeof(t),
                bad_time=minimum(bad.times),
                good_time=minimum(good.times),
                bad_alloc=minimum(bad.allocs),
                good_alloc=minimum(good.allocs),
            ),
        )
    end
    julia> data_nofunction
    4×5 DataFrame
     Row │ input_type                         bad_time  good_time  bad_alloc  good_alloc
         │ DataType                           Float64   Float64    Int64      Int64
    ─────┼───────────────────────────────────────────────────────────────────────────────
       1 │ NTuple{10, Vector{Float64}}         4682.83    16404.0          2           2
       2 │ NTuple{10, Matrix{Float64}}         43849.0    44823.0          2           2
       3 │ NTuple{10, SVector{4, Float64}}     166.013      5.478          1           0
       4 │ NTuple{10, SMatrix{4, 4, Float64…   258.992       27.8          1           0

    julia> data_function
    4×5 DataFrame
     Row │ input_type                         bad_time  good_time  bad_alloc  good_alloc
         │ DataType                           Float64   Float64    Int64      Int64
    ─────┼───────────────────────────────────────────────────────────────────────────────
       1 │ NTuple{10, Vector{Float64}}         5293.17    18105.0          2           2
       2 │ NTuple{10, Matrix{Float64}}         45230.0   184490.0         22          22
       3 │ NTuple{10, SVector{4, Float64}}     146.347      5.321          1           0
       4 │ NTuple{10, SMatrix{4, 4, Float64…   226.041    27.4312          1           0
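Given these timings, one way "fixing `stack` for `SArray` in simple cases" could look is a small dispatch layer: keep `stack` as the default and switch to `hcat(t...)` only for tuples of static vectors. This is a sketch under assumptions, not the package's actual fix; the name `fast_stack` is hypothetical, and only the static-vector case is handled.

```julia
using StaticArrays

# Hypothetical helper, not part of StaticArrays or any other package.
# Default: defer to Base.stack, which the benchmarks favor for plain arrays.
fast_stack(t) = stack(t)

# Special case: a non-empty tuple of static vectors.
# Splatting into hcat lets StaticArrays return an SMatrix.
fast_stack(t::Tuple{StaticVector,Vararg{StaticVector}}) = hcat(t...)

# Small check on hypothetical inputs.
tsv = ntuple(i -> SVector(1.0, 2.0, 3.0, 4.0) * i, 3)
A = fast_stack(tsv)
@assert A isa SMatrix{4,3,Float64}
@assert A == stack(tsv)

# Plain vectors still go through the default method.
tv = ntuple(i -> fill(Float64(i), 4), 3)
@assert fast_stack(tv) isa Matrix{Float64}
```

Static matrices are left on the default path here because `stack` and `hcat` produce different shapes for matrix inputs, so that case would need a separate decision.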