
specialize loops inside functions #5654

Open
johnmyleswhite opened this issue Feb 3, 2014 · 13 comments
Labels
performance Must go faster

Comments

johnmyleswhite (Member) commented Feb 3, 2014

While trying to optimize the performance of operations on DataFrames, I've found that a recurring pattern underlies many of the performance bottlenecks people encounter: functions that take DataFrames as input don't have enough type information to specialize their implementations for the specific types of the columns stored in a DataFrame. This means that a two-part function that (a) selects a subset of columns to compute on and then (b) performs an intensive computation on those columns can't take advantage of the type information available at the end of part (a) to improve the performance of part (b).

It's possible to articulate this pattern without any reference to DataFrames: you get exactly the same effect if you pull whole columns out of a Vector{Any} and then compute on them. The only workaround I've found is to use an inner function defined inside the two-part function. This inner function defers type specialization until a point at which it can be carried out effectively.

The three examples below highlight this problem:

function dot1(vals::Vector{Any})
    a, b = vals[1], vals[2]
    x = 0.0
    for i in 1:length(a)
        x += a[i] * b[i]
    end
    return x
end

function dot2(a::Vector, b::Vector)
    x = 0.0
    for i in 1:length(a)
        x += a[i] * b[i]
    end
    return x
end

function dot3(vals::Vector{Any})
    a, b = vals[1], vals[2]
    function inner(a, b)
        x = 0.0
        for i in 1:length(a)
            x += a[i] * b[i]
        end
        return x
    end
    return inner(a, b)
end

If you run these functions with the following data, you get very substantial performance differences between the initial naive implementation and the other two:

n = 1_000_000
a, b = randn(n), randn(n)
vals = {a, b}
dot1(vals)
dot2(a, b)
dot3(vals)

@elapsed dot1(vals)
@elapsed dot2(a, b)
@elapsed dot3(vals)

On my machine, I get relative timings of the following orders of magnitude:

  • dot1: 230x slower
  • dot2: 1x
  • dot3: 1x

It would be a huge gain to future users of DataFrames if there were a way to make the compiler recognize that a little bit of second-stage type inference could give major performance gains.
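[Editorial note: the inner-function trick in dot3 is what the Julia manual later came to call a "function barrier". As a self-contained sketch (the names `kernel` and `dot_barrier` are illustrative, not from the thread): the outer function does the untyped extraction, and the call into the kernel is a single dynamic dispatch that specializes the loop on the runtime element types.]

```julia
# Function-barrier sketch: `kernel` is compiled separately for the
# concrete types of `a` and `b`, even though `vals` is a Vector{Any}.
function kernel(a, b)
    x = 0.0
    for i in 1:length(a)
        x += a[i] * b[i]
    end
    return x
end

function dot_barrier(vals::Vector{Any})
    a, b = vals[1], vals[2]  # `a` and `b` are inferred as Any here
    return kernel(a, b)      # one dynamic dispatch; the loop itself is specialized
end

dot_barrier(Any[[1.0, 2.0], [3.0, 4.0]])  # 1.0*3.0 + 2.0*4.0 = 11.0
```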

cdsousa (Contributor) commented Feb 3, 2014

The dot3 warm-up call must be dot3(vals), not dot3(a, b).
On my system, dot3 is as performant as dot2; you probably got that timing for dot3 because it wasn't warmed up correctly...

johnmyleswhite (Member, Author)

You're right. Thanks for catching that!

quinnj (Member) commented Aug 14, 2014

Superseded by #7311.

quinnj closed this as completed Aug 14, 2014
simonster (Member)

@quinnj I'm not sure how staged functions solve this. Can you elaborate?

JeffBezanson (Member)

Yeah, this is not the same issue.

JeffBezanson reopened this Aug 14, 2014
quinnj (Member) commented Aug 14, 2014

Sorry. I thought that

if there were a way to make the compiler recognize that a little bit of second-stage type inference could give major performance gains

was exactly what staged functions were getting us.

JeffBezanson (Member)

The work item here is to identify poorly-typed loops, and carve them out into separate functions that can be specialized at run time. It's an extra compiler optimization pass.

timholy (Member) commented Aug 14, 2014

@quinnj, a staged function will receive that ::Vector{Any} as its input, so it won't have any more type information than the parent.

Keno (Member) commented Aug 14, 2014

While that is true, a staged function can actually throw an error in that case, and it will then be called again at runtime with the correct types. That's part of the magic :).

timholy (Member) commented Aug 14, 2014

Not sure I understand. What if I explicitly feed it an Any[[1,2,3],[1.0,2.0,3.0]]? At no point will you get good type information about this object.

Now, if you wrote it as

function dot4{T}(vals::T)
...

and passed it ([1,2,3],[1.0,2.0,3.0]) then it would be a different story. But you don't need a staged function for that.
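[Editorial note: spelled out in full, timholy's tuple version might look like the sketch below (`dot4` here is illustrative, not code from the thread). The element types are part of the tuple's concrete type, so the method specializes without an inner function.]

```julia
# Tuple version: Tuple{Vector{Int}, Vector{Float64}} is a concrete type,
# so this method is specialized on the element types at compile time.
function dot4(vals::Tuple)
    a, b = vals
    x = 0.0
    for i in 1:length(a)
        x += a[i] * b[i]
    end
    return x
end

dot4(([1, 2, 3], [1.0, 2.0, 3.0]))  # 1*1.0 + 2*2.0 + 3*3.0 = 14.0
```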

Keno (Member) commented Aug 14, 2014

I think the issue is a case where the runtime type information is there but the compile-time information isn't, so the loop won't be optimized. If there's never good type information, then we can't do anything anyway.

barucden (Contributor) commented Mar 21, 2022

Is this still an issue? On Julia 1.7.2, I see this result on my machine:

julia> using BenchmarkTools

julia> n = 1_000_000
1000000

julia> a, b = randn(n), randn(n);

julia> vals = Any[a, b];

julia> function dot1(vals::Vector)
           a, b = vals[1], vals[2]
           x = 0.0
           for i in 1:length(a)
               x += a[i] * b[i]
           end
           return x
       end
dot1 (generic function with 1 method)

julia> function dot2(a::Vector, b::Vector)
           x = 0.0
           for i in 1:length(a)
               x += a[i] * b[i]
           end
           return x
       end
dot2 (generic function with 1 method)

julia> function dot3(vals::Vector)
           a, b = vals[1], vals[2]
           function inner(a, b)
               x = 0.0
               for i in 1:length(a)
                   x += a[i] * b[i]
               end
               return x
           end
           return inner(a, b)
       end
dot3 (generic function with 1 method)

julia> @btime dot1(vals);
  158.501 ms (6998980 allocations: 122.05 MiB)

julia> @btime dot2(a, b);
  1.528 ms (1 allocation: 16 bytes)

julia> @btime dot3(vals);
  1.529 ms (1 allocation: 16 bytes)

EDIT: Updated the measurement according to @Keno's remark. The issue still stands.

Keno (Member) commented Mar 21, 2022

The issue is about vals = Any[a, b]. The {} syntax was removed many years ago; {a, b} corresponded to Any[a, b].
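[Editorial note: in current syntax, the setup from the original post would therefore be written as follows.]

```julia
# The original post's `vals = {a, b}`, in modern syntax:
n = 1_000_000
a, b = randn(n), randn(n)
vals = Any[a, b]  # a Vector{Any} holding the two arrays
```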
