-
Notifications
You must be signed in to change notification settings - Fork 422
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add DiscreteNonParametric distribution #634
Add DiscreteNonParametric distribution #634
Conversation
Thanks for contributing this. This one is extremely similar to |
Sounds good - they already share the sampler and it should be easy to consolidate most of the other functionality as well. Edit: Thinking about it more, if |
An option might be just to use a unit range for the value vector. It should be very light weight. In my book, this |
Agreed, I think that's the same approach as my edit. I can't think of any reasons combining them would need to be any less efficient, although I'll run some performance tests to verify that. Semantically, to me at least, a categorical distribution suggests that the support is non-numeric, i.e the integers in the support are just proxies for some other label, and their ordering/relative values are somewhat arbitrary. Given that connotation I would be hesitant to label a distribution over a potentially-quantitatively-meaningful real-valued support as struct Generic{T, P, S <: AbstractVector{T}} ... end
Categorical{P} = Generic{Int, P, Base.OneTo{Int}}
|
Just a bikeshedding note: what about Not that "atomic" doesn't have other meanings in software... |
Still more work to be done, but I've converted As far as bikeshedding the name, a few other options:
|
11a576e
to
7a56072
Compare
7a56072
to
a18623f
Compare
I think all of the missing methods should be implemented now, but correct me if I've missed anything. As noted in #661 (comment), the support of |
Just wanted to state that it would be usefull for me :-) Also, if I may, Generic might not be understandable enough, Atomic as proposed earlier might be better, or GenericDiscrete... |
@GordStephen could you rebase the PR on master before we can move on with this work? There are conflicts |
Sure, will do |
Thanks! |
9e45f27
to
236d652
Compare
Ok, this is rebased and updated for 0.7. |
I like |
Hi. Thanks for your work of course. I don't like anything with Perhaps if it was just an extension of Categorical, with some extra type parameter. Or maybe |
"fancy" is not always bad, at least it's specific and associated with discrete distributions. We have to keep in mind that the literature does not have clear universal terms for these concepts (discrete over a setthat's not in N), so it's fine to come up with new, specific terms for this |
In many words, this is a "general discrete distrubition with a finite number of atoms", right? I like the idea of coming up with a new specific term. But it shouldn't already mean something that's in conflict with other existing distributions, such as just |
@GordStephen while some thoughts are still given on the naming, could you add some documentation in |
Will do, although I'm traveling a fair bit this week and next so there might be some delay. |
bumping this one if you have time to add the doc pages & docstring |
I've added some docstrings and updated the Univariate docs page, but let me know if there should be updates anywhere else. I also renamed the distribution to |
|
Agreed, if you can make the required modifications
…On Mon, Oct 29, 2018, 23:13 Gord Stephen ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/samplers/categorical.jl
<#634 (comment)>
:
> @@ -4,7 +4,7 @@ struct CategoricalDirectSampler <: Sampleable{Univariate,Discrete}
prob::Vector{Float64}
function CategoricalDirectSampler(p::Vector{Float64})
Good point, I think more generally it would be natural to address #743
<#743> in this PR
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#634 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AHRRslaUha22BT8Yxihqf0qt8zVucmRpks5up32ngaJpZM4OSLui>
.
|
Ok, non- |
thanks a lot @GordStephen |
Doing some final checks, I realized the tests for this distribution didn't include |
Thanks, merged |
A bit late here but what I suggested in #634 (comment) was that we just used the name I think we should revisit the name since I believe |
The fact that |
To me the main distinction between To the point about whether this distribution is non-parametric, I suppose one could argue that it has parameters, just lots of them (a probability and value for each element of the support). That said, a distribution defined by a kernel density estimate has similar properties, and KDEs are considered to be defined non-parametrically. |
This implements a new generic, discrete, sparse distribution type for representing and sampling from an arbitrary PDF, as was discussed on Discourse a while ago. As of right now the type is called
Generic
, although there's probably a better name for for it...If there's interest in merging this, I can add documentation, more robust testing, and any outstanding functionality (
mfg
,cf
, etc).