Confusion regarding `.~`
#722
Comments
How significant are the perf benefits? Could we benchmark this? Either of the two proposed solutions would also simplify the tilde-pipeline code a lot, improving maintainability; it's a source of a lot of corner cases and code complexity.

We could ask on Slack/Discourse whether our users are making heavy use of `.~`.

I lean towards option 2, or something similar, if it simplifies the implementation. The tilde pipeline is becoming a bit too (deep and) complex to reason about.
I tried to make a simple benchmark to see the effect of `.~` versus an explicit loop:

```julia
julia> module MWE

using Turing
import ReverseDiff

adbackend = AutoReverseDiff()
alg = NUTS(; adtype=adbackend)
n_samples = 20
y = randn(1000)

@model function test_tilde(y, ::Type{T}=Float64) where {T}
    num_values = length(y)
    x1 = Vector{T}(undef, num_values)
    x1 .~ Normal()
    x2 = Vector{T}(undef, num_values)
    x2 .~ Gamma()
    m = sum(x1 .* x2)
    y .~ Normal(m)
end

m_tilde = test_tilde(y)
@time sample(m_tilde, alg, n_samples)

@model function test_loop(y, ::Type{T}=Float64) where {T}
    num_values = length(y)
    x1 = Vector{T}(undef, num_values)
    for i in eachindex(x1)
        x1[i] ~ Normal()
    end
    x2 = Vector{T}(undef, num_values)
    for i in eachindex(x2)
        x2[i] ~ Gamma()
    end
    m = sum(x1 .* x2)
    for i in eachindex(y)
        y[i] ~ Normal(m)
    end
end

m_loop = test_loop(y)
@time sample(m_loop, alg, n_samples)

end
```

```
┌ Info: Found initial step size
└ ϵ = 0.000390625
Sampling 100%|█████████████████████████████████████████████████████████████████████████████████| Time: 0:03:03
186.599898 seconds (3.79 G allocations: 166.483 GiB, 9.57% gc time, 1.02% compilation time)
┌ Info: Found initial step size
└ ϵ = 0.000390625
Sampling 100%|█████████████████████████████████████████████████████████████████████████████████| Time: 0:03:33
216.282827 seconds (4.17 G allocations: 181.365 GiB, 8.79% gc time, 0.91% compilation time)
Main.MWE
```

Is there some case where it makes a more dramatic difference? If so, can someone write an example? Unless it makes a very substantial difference in some common use case, I'm in favour of keeping the syntax for backwards compatibility, but turning every `.~` into a loop over `eachindex(lhs)` at macro time and internally dealing with only `~`. I haven't thought through how that would work for more complex broadcasting than the LHS being a vector and the RHS being a scalar.
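For the simple case mentioned above (vector LHS, scalar-distribution RHS), the proposed macro-time rewrite could look roughly like the sketch below. This is hypothetical, not DynamicPPL's actual `@model` expansion, and `expand_dot_tilde` is an invented helper name:

```julia
# Hypothetical sketch: lower `lhs .~ rhs` into a loop of scalar `~`
# statements at macro-expansion time. Only handles the simple case
# discussed above (vector LHS, scalar-distribution RHS).

expand_dot_tilde(x) = x  # non-Expr leaves pass through unchanged

function expand_dot_tilde(expr::Expr)
    # `a .~ b` parses as Expr(:call, Symbol(".~"), a, b)
    if expr.head == :call && expr.args[1] == Symbol(".~") && length(expr.args) == 3
        lhs, rhs = expr.args[2], expr.args[3]
        i = gensym(:i)
        return quote
            for $i in eachindex($lhs)
                $lhs[$i] ~ $rhs
            end
        end
    end
    # otherwise recurse into sub-expressions
    return Expr(expr.head, map(expand_dot_tilde, expr.args)...)
end

expand_dot_tilde(:(x .~ Normal()))
# yields a `for i in eachindex(x); x[i] ~ Normal(); end` expression
```

A real implementation would also have to decide what to do when the RHS is itself an array of distributions, or when the LHS is a matrix — exactly the broadcasting cases the comment above says remain unaddressed.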
Ref: TuringLang/Turing.jl#2390 (comment)
MWE:

Even if you use `condition`, you still run into a similar issue when specifying components. The reason why we do this is because of performance: we want `.~` to be faster than just iterating over the entire broadcast expression and calling `tilde_assume!!` and `tilde_observe!!` separately. However, this is very, very confusing for users, and I think it's very understandable. Sooo, the question is how to fix this. A few immediate thoughts:

- Do we even need `.~`? Does it just make things confusing?
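For concreteness, a hedged sketch of the situation described above: conditioning a single component of a variable declared with `.~`. The model and values are illustrative only; whether `x[1]` is actually treated as observed here is precisely the confusing, version-dependent behavior this issue is about, so no output is claimed.

```julia
using Turing

@model function demo()
    x = Vector{Float64}(undef, 3)
    x .~ Normal()   # the whole vector declared via broadcasted tilde
    return x
end

# Attempt to fix a single component. Per the discussion above, this kind of
# per-component conditioning interacts confusingly with `.~`: it is not
# obvious to users whether `x[1]` ends up observed or still sampled.
conditioned_model = condition(demo(), @varname(x[1]) => 1.0)
```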