Confusion regarding .~ #722

Open
torfjelde opened this issue Nov 27, 2024 · 4 comments
Comments

@torfjelde
Member

Ref: TuringLang/Turing.jl#2390 (comment)

MWE:

julia> @model demo(x) = x .~ Normal()
demo (generic function with 2 methods)

julia> demo(missing)()
ERROR: MethodError: no method matching dot_assume(::Random.TaskLocalRNG, ::SampleFromPrior, ::Normal{…}, ::VarName{…}, ::Missing, ::UntypedVarInfo{…})
[...]

julia> demo([missing])()
ERROR: MethodError: no method matching loglikelihood(::Normal{Float64}, ::Vector{Union{Missing, Float64}})
[...]

Even if you use condition, you still run into a similar issue when conditioning on individual components, e.g.

julia> @model function demo()
           x = Vector{Float64}(undef, 2)
           x .~ Normal()
       end
demo (generic function with 4 methods)

julia> (demo() | (@varname(x[1]) => 1.0,))()  # not conditioned
2-element Vector{Float64}:
 1.6841936984140546
 0.3645020534551503
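
For comparison, here is a sketch of the same model written with an explicit loop, where each component is a separate variable and per-component conditioning should go through the ordinary ~ pipeline (untested; output omitted):

julia> @model function demo_loop()
           x = Vector{Float64}(undef, 2)
           for i in eachindex(x)
               x[i] ~ Normal()
           end
           return x
       end

julia> (demo_loop() | (@varname(x[1]) => 1.0,))()  # x[1] should now be conditioned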

The reason we do this is performance: we want .~ to be faster than iterating over the broadcast expression and calling tilde_assume!! or tilde_observe!! element by element. However, this is very confusing for users, and understandably so.

Sooo the question is how to fix this. A few immediate thoughts:

  1. Do we need .~? Does it just make things confusing?
  2. Should we just convert it into a loop over the broadcasted expression and drop the perf benefits?
@mhauru
Member

mhauru commented Nov 27, 2024

How significant are the perf benefits? Could we benchmark this?

Either of the proposed two solutions would also simplify the tilde pipeline code a lot, improving maintainability. It's a source of a lot of corner cases and code complexity.

@mhauru
Member

mhauru commented Nov 27, 2024

We could ask on Slack/Discourse whether our users are making heavy use of .~, to gauge how much effort we should put into keeping it around and performant.

@yebai
Member

yebai commented Nov 29, 2024

  1. Do we need .~? Does it just make things confusing?
  2. Should we just convert it into a loop over the broadcasted expression and drop the perf benefits?

I lean towards option 2 or something similar if it simplifies the implementation. The tilde pipeline is becoming a bit too (deep and) complex to reason about.

@mhauru
Member

mhauru commented Jan 28, 2025

I tried to make a simple benchmark to see the effect of .~. The results came out quite underwhelming:

julia> module MWE

       using Turing
       import ReverseDiff

       adbackend = AutoReverseDiff()
       alg = NUTS(; adtype=adbackend)
       n_samples = 20
       y = randn(1000)

       @model function test_tilde(y, ::Type{T}=Float64) where {T}
           num_values = length(y)
           x1 = Vector{T}(undef, num_values)
           x1 .~ Normal()
           x2 = Vector{T}(undef, num_values)
           x2 .~ Gamma()
           m = sum(x1 .* x2)
           y .~ Normal(m)
       end

       m_tilde = test_tilde(y)
       @time sample(m_tilde, alg, n_samples)

       @model function test_loop(y, ::Type{T}=Float64) where {T}
           num_values = length(y)
           x1 = Vector{T}(undef, num_values)
           for i in eachindex(x1)
               x1[i] ~ Normal()
           end
           x2 = Vector{T}(undef, num_values)
           for i in eachindex(x2)
               x2[i] ~ Gamma()
           end
           m = sum(x1 .* x2)
           for i in eachindex(y)
               y[i] ~ Normal(m)
           end
       end

       m_loop = test_loop(y)
       @time sample(m_loop, alg, n_samples)

       end
┌ Info: Found initial step size
└   ϵ = 0.000390625
Sampling 100%|█████████████████████████████████████████████████████████████████████████████████| Time: 0:03:03
186.599898 seconds (3.79 G allocations: 166.483 GiB, 9.57% gc time, 1.02% compilation time)
┌ Info: Found initial step size
└   ϵ = 0.000390625
Sampling 100%|█████████████████████████████████████████████████████████████████████████████████| Time: 0:03:33
216.282827 seconds (4.17 G allocations: 181.365 GiB, 8.79% gc time, 0.91% compilation time)
Main.MWE

Is there some case where it makes a more dramatic difference? If so, can someone write an example?

Unless it makes a very substantial difference in some common use case, I'm in favour of keeping the syntax for backwards compatibility, but turning every .~ into a loop over eachindex(lhs) at macro time and internally dealing with only ~. I haven't thought through how that would work for more complex broadcasting than the LHS being a vector and RHS being a scalar.
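
As a rough sketch, assuming only the simple case of a vector LHS and scalar-distribution RHS, the macro-time rewrite could look like (hypothetical expansion, not the current @model output):

# Hypothetical expansion of `x .~ Normal()` inside @model:
for i in eachindex(x)
    x[i] ~ Normal()   # each element goes through the usual single-variable tilde pipeline
end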
