Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

broadcasting over arrays of small unions #27106

Open
JeffreySarnoff opened this issue May 15, 2018 · 6 comments
Open

broadcasting over arrays of small unions #27106

JeffreySarnoff opened this issue May 15, 2018 · 6 comments
Labels
broadcast Applying a function over a collection compiler:inference Type inference missing data Base.missing and related functionality

Comments

@JeffreySarnoff
Copy link
Contributor

@KristofferC noticed when the number of elements in the union goes over two, inference falls back to Any.

julia> Base._return_type(square, Tuple{Union{Missing, Float64}})
Union{Missing, Float64}
julia> Base._return_type(square, Tuple{Union{Missing, Float64, Float64a}})
Any

Working with vectors, matrices and arrays of elements typed as a Union of 2, 3, or 4 leaf types has become very fast thanks to much work. The fact that, at the moment, broadcasting fails to autogenerate arrays that match the source's type (array of 3 or 4 unioned leaf types) leaves a lot of low hanging fruit on the vine.
For example, a data vectors designed to handle Float64 and Int64 values (perhaps with sort of source) which support missing values requires Union{Missing, Int64, Float64}. All the machinery seems present.

@KristofferC
Copy link
Member

KristofferC commented May 15, 2018

@nalimilan nalimilan added compiler:inference Type inference missing data Base.missing and related functionality labels May 15, 2018
@nalimilan
Copy link
Member

This mixes two different issues:

  • what's the element type of the result of broadcast
  • what's the inferred type of broadcasted functions

The former is chosen using Base.promote_typejoin, which special-cases Missing and Nothing (#25553). We have discussed using a more general system (see links in that PR), but it's hard to decide when a Union should be preserved and when it contains so many different types that typejoin should be used instead: should we stop at 3, 4, 5 types? You should be able to override promote_typejoin for very specific types, but it's not really recommended in general.

Whether inference should be more precise is a completely different question, which fortunately can be changed at any point without breaking code: it's just a matter of finding the best threshold for performance.

@KristofferC
Copy link
Member

KristofferC commented May 15, 2018

Isn't

combine_eltypes(f, args::Tuple) = Base._return_type(f, eltypes(args))

determining the element type of the result?

Edit: Oh, that's only when the returnt ype is concrete?

@nalimilan
Copy link
Member

AFAIK inference is supposed to be used only when the result is empty. But indeed the result of combine_eltypes now seems to be used also when it isn't concrete, which I wouldn't have expected. I guess @mbauman can clarify.

@JeffBezanson
Copy link
Member

@nalimilan is absolutely right! This is a perfect example of the problem with return_type. Inference needs to be able to widen unions at some point, or the compiler will be too slow.

@JeffBezanson JeffBezanson added the broadcast Applying a function over a collection label May 15, 2018
@mbauman
Copy link
Member

mbauman commented May 15, 2018

But indeed the result of combine_eltypes now seems to be used also when it isn't concrete, which I wouldn't have expected.

Where do you see or observe this? We only use the return type if it is concrete or if the result is empty. We are still doing it poorly in the sparse broadcast code, but that's the only place I'm aware broadcast does this incorrectly.

julia/base/broadcast.jl

Lines 762 to 766 in 1b92f51

ElType = combine_eltypes(bc.f, bc.args)
if Base.isconcretetype(ElType)
# We can trust it and defer to the simpler `copyto!`
return copyto!(broadcast_similar(Style(), ElType, axes(bc), bc), bc)
end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
broadcast Applying a function over a collection compiler:inference Type inference missing data Base.missing and related functionality
Projects
None yet
Development

No branches or pull requests

5 participants