-
Notifications
You must be signed in to change notification settings - Fork 423
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change to ValueSupport interface #951
Conversation
…ypes for countable and continuous support respectively, and discontinuous distributions in general.
Co-Authored-By: Mathieu Besançon <mathieu.besancon@gmail.com>
Co-Authored-By: Mathieu Besançon <mathieu.besancon@gmail.com>
…amework. Minor bugfix for MixtureModel with non-fully parameterised types and add testing.
Introduce Integer*Distribution, and parameterise Continuous*Distribution{T} for uncountable T. Fix lots of bugs where distributions were incorrectly reporting their types.
…where the support is over contiguous integers (as most are).
Fix most countable distributions to use ContiguousSupport.
More ContiguousSupport fixes, and updating testing.
Codecov Report
@@ Coverage Diff @@
## master #951 +/- ##
==========================================
+ Coverage 78.05% 78.22% +0.16%
==========================================
Files 112 112
Lines 5382 5410 +28
==========================================
+ Hits 4201 4232 +31
+ Misses 1181 1178 -3
Continue to review full report at Codecov.
|
This PR is no longer breaking, because code has been added in to |
Thank you, this is already nicer. This is also much closer in spirit to what I have in mind -- but again we do not need to be breaking: we can just avoid making the As I keep arguing, |
Can I stress that this is not breaking, and also that it has been here for weeks - #945 was just the type hierarchy split out. I'll look at the other issues later when I have time. |
As far as I can see, your example problems above are cases where people have previously made exactly the class of errors I am proposing to eliminate (#896) - I have fixed 10s of these bugs in this PR - or are fixing times when people have got their types confused. I certainly don't see anything I'm doing as needlessly enforcing type restrictions - I don't believe I add any more restrictions, though it is possible I have inadvertently - I am just correctly parameterising on the types already used to avoid this class of error arising. There's a lot more work to be done here incidentally. What I'd like to be able to do down the line is to seamlessly handle any non- |
@richardreeve can you get the last bit of coverage up? I'm not always a fan of it but it helps to keep healthy habits for such large code base |
src/utils.jl
Outdated
@@ -9,6 +9,38 @@ macro check_args(D, cond) | |||
end | |||
end | |||
|
|||
isuncountable(x) = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would be for adding only one of isuncountable
, iscountable
, given that one is the negation of the other.
iscountablereal
should also not be necessary, and can rely on iscountable
and type dispatch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, there's a problem there.
Nearly all distributions cannot currently handle not-Float64
eltypes, but will be able to handle <: Real
eltypes with only a modest amount of work. However, it's much harder to get them to handle (un)countable non-real <: Number
eltypes, such as Unitful types. Nonetheless, the aspiration should be to handle all of these types, so I want to be able to move easily from the iscountablereal(T)
to iscountable(T)
checks which have been created (@check_countable()
, etc.).
Similarly, iscountable(T)
is not just !isuncountable(T)
. Non-Number
types are rejected by both for now, as they are well beyond the scope of the current series of PRs. I could call them iscountablenumbner()
and isuncountablenumber()
to make it clear that they are not inverses?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PS The other option I thought of (as you suggest) was to have type dispatch on the macro for the time being (which is the only user of isuncountablereal()
, and this function is also not exported incidentally), but I assume that's still not possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I've got rid of this conflict - I've added three level logic - missing
- for the cases where we haven't yet resolved whether it's countable or uncountable, then the two are inverses of each other.
The overall coverage does go up. The problem with the patch coverage is that there is code in the PR that isn’t used yet, because the actual distributions that use it are in the next PR! The easiest way of increasing coverage for that is to add in the new distributions :) |
@richardreeve can you fix the two conflicts? The PR that created them was tiny boilerplate, so it shouldn't be a big deal |
Nevermind, I fixed the two conflicts |
src/deprecates.jl
Outdated
const DiscreteMultivariateDistribution = Distribution{Multivariate, Discrete} | ||
const ContinuousMultivariateDistribution = Distribution{Multivariate, Continuous} | ||
const DiscreteMatrixDistribution = Distribution{Matrixvariate, Discrete} | ||
const ContinuousMatrixDistribution = Distribution{Matrixvariate, Continuous} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use Base.@deprecate_binding
for these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea thanks, it preserves compatibility but signals what is changing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool. I had no idea that existed - I see it isn't exported in fact, so I don't feel too bad. Anyway, I'll definitely do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I notice that it doesn't handle parametric types, so I can't do it for all of the deprecated types:
const DiscreteDistribution{F<:VariateForm} = Distribution{F,Discrete}
const ContinuousDistribution{F<:VariateForm} = Distribution{F,Continuous}
but I've covered most of them now. However, testing them is another matter:
julia> @test_deprecated DiscreteMatrixDistribution
WARNING: Distributions.DiscreteMatrixDistribution is deprecated, use Distribution{Matrixvariate, Discrete} instead.
likely near REPL[6]:1
Log Test Failed at REPL[6]:1
Expression: $(Expr(:escape, :DiscreteMatrixDistribution))
Log Pattern: (:warn, r"deprecated"i, Ignored(), :depwarn) match_mode = :any
Captured Logs:
ERROR: There was an error during testing
Is the message returned by @deprecate_binding
not a deprecation warning?
OK @matbesancon. We’re now ahead on all metrics and These issues are dealt with... |
abstract type ValueSupport end | ||
struct Discrete <: ValueSupport end | ||
struct Continuous <: ValueSupport end | ||
`S <: Support{T}` specifies the support of sample elements as T, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
support
of a measure has a specific conflicting definition (https://en.wikipedia.org/wiki/Support_(measure_theory)) we might want to call this differently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What definition are we going to use. What is the support of a Bernoulli(1.0)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we go with T
being the eltype
of samples (see below), we should specify this here.
struct ContinuousSupport{N <: Number} <: Support{N} end | ||
abstract type CountableSupport{C} <: Support{C} end | ||
struct ContiguousSupport{C <: Integer} <: CountableSupport{C} end | ||
struct UnionSupport{N1, N2, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Document
src/common.jl
Outdated
|
||
The default element type of a sample. This is the type of elements of the samples generated | ||
by the `rand` method. However, one can provide an array of different element types to | ||
store the samples using `rand!`. | ||
""" | ||
Base.eltype(s::Sampleable{F,Discrete}) where {F} = Int | ||
Base.eltype(s::Sampleable{F,Continuous}) where {F} = Float64 | ||
Base.eltype(::Sampleable{F, <: Support{N}}) where {F, N} = N |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this supposed to be the type of rand(d)
or the eltype
of rand(d)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below it is eltype
of rand(d)
.
value_support(::Type{Distribution{VF,VS}}) where {VF<:VariateForm,VS<:ValueSupport} = VS | ||
value_support(::Type{T}) where {T<:Distribution} = value_support(supertype(T)) | ||
variate_form(::Type{<:Sampleable{VF, <:Support}}) where {VF<:VariateForm} = VF | ||
value_support(::Type{<:Sampleable{<:VariateForm,VS}}) where {VS<:Support} = VS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be called support
now/later?
@@ -116,18 +114,18 @@ const DistributionType{D<:Distribution} = Type{D} | |||
const IncompleteFormulation = Union{DistributionType,IncompleteDistribution} | |||
|
|||
""" | |||
succprob(d::DiscreteUnivariateDistribution) | |||
succprob(d::UnivariateDistribution{ContiguousSupport{T}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
succprob(d::UnivariateDistribution{ContiguousSupport{T}}) | |
succprob(d::UnivariateDistribution{<:ContiguousSupport}) |
ccdf(d::DiscreteUnivariateDistribution, x::Integer) = 1.0 - cdf(d, x) | ||
ccdf(d::DiscreteUnivariateDistribution, x::Real) = ccdf(d, floor(Int,x)) | ||
ccdf(d::UnivariateDistribution, x) = 1.0 - cdf(d, x) | ||
ccdf(d::UnivariateDistribution{<:ContiguousSupport}, x::Integer) = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Method not needed?
logcdf(d::DiscreteUnivariateDistribution, x::Real) = logcdf(d, floor(Int,x)) | ||
logcdf(d::UnivariateDistribution, x) = log(cdf(d, x)) | ||
logcdf(d::UnivariateDistribution{<:ContiguousSupport}, x::Integer) = | ||
log(cdf(d, x)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
method not needed?
@@ -439,10 +463,13 @@ invlogccdf(d::UnivariateDistribution, lp::Real) = quantile(d, -expm1(lp)) | |||
|
|||
# gradlogpdf | |||
|
|||
gradlogpdf(d::ContinuousUnivariateDistribution, x::Real) = throw(MethodError(gradlogpdf, (d, x))) | |||
gradlogpdf(d::UnivariateDistribution{<:ContinuousSupport}, x::Real) = | |||
throw(MethodError(gradlogpdf, (d, x))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this method needed?
@@ -522,7 +551,7 @@ loglikelihood(d::UnivariateDistribution, X::AbstractArray) = sum(x -> logpdf(d, | |||
|
|||
macro _delegate_statsfuns(D, fpre, psyms...) | |||
dt = eval(D) | |||
T = dt <: DiscreteUnivariateDistribution ? :Int : :Real | |||
T = eltype(dt) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before T
was a symbol, does this work as well?
@@ -9,6 +9,34 @@ macro check_args(D, cond) | |||
end | |||
end | |||
|
|||
isuncountable(x) = missing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doc string?
@richardreeve I updated the branch to fix conflicts, can you check @mschauer's comments above and make the corresponding modifications? |
I'll try to get it done as soon as possible, but this is my busiest time of the year for teaching - the summer was when I had time to work on this unfortunately. |
end | ||
|
||
|
||
## a type to indicate zero vector |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@richardreeve I messed up my conflict merge, could you remove things related to ZeroVector in this file, it was deleted in:
#1020
This change to the interface removes
Discrete
andContinuous
from the code as aliases forContiguousSupport{Int}
andContinuousSupport{Float64}
- introduced in #945 - from a suggestion by @matbesancon, and to be done after or instead of #945, which it currently includes. All of theDiscrete[*]Distribution
andContinuous[*]Distribution
are removed from the code too, as there was concern expressed in #945 that I was introducing too many type aliases. This is done because most of the distributions that use these aren't actually constrained to haveInt
orFloat64
as their `eltype, and there were problems with breaking some external packages.This could be made breaking by removing
Discrete
andContinuous
and the*Distribution
aliases fromsrc/deprecates.jl
where they have been moved to, but this wouldn't affect theDistributions
package as it no longer uses them at all (except intest/types.jl
). Keeping them in might discourage people from fixing their code. Unfortunately I can't see how to deprecate them as they are used as const aliases for types, andtypeof(DeprecatedThing)
isn't a type...The PR would still be mildly breaking even then, though, asContinuousSupport
becomesContinuousSupport{T<:Number}
, and this can't be avoided unless someone has a synonym for continuous. I'm not sure how many people useContinuousSupport
outside the package though - I can't find it on github in any julia code. (UncountableSupport{T}
is an option, but at the momentContinuousSupport
requires support to be over a single continuous domain...)Edit: I'd forgotten that I introduced
ContinuousSupport
in #945!