-
Notifications
You must be signed in to change notification settings - Fork 421
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
RFC: First take on cleaning up EmpiricalUnivariateDistribution #661
Conversation
b7de190
to
b9fcdf5
Compare
Maybe parametrize the distribution with T<:Real instead of explicitly using Float64? |
Hm. It looks like we are assuming that I'd also really like to rename this distribution to |
I am in favor of new names, but ideally they would take into account continuous vs. discrete cases. I also agree that the implementation was quite broken to deserve a deprecation warning. I am particularly interested in the continuous case as discussed here: https://discourse.julialang.org/t/empirical-distribution-type-for-continuous-variables/5676 In one of my spatial stats solvers I need to transform continuous empirical distributions to Normal, do the calculations, and transform back to the initial empirical distribution. In that post, I wrote a simple transformation function that I am pasting here again: """
transform(x, dist)
Transform the empirical distribution of samples `x` into `dist`.
"""
function transform(x::Vector, dist::ContinuousUnivariateDistribution)
idx = sortperm(x)
N = length(x)
xcdf = [1:N-1; .99N] / N
quantile(dist, xcdf[idx])
end This function could be incorporated in Distributions.jl as well, it only depends on a Distribution type for which |
@andreasnoack there seem to be conflicts to solve. Afterwards what's the status for this PR? |
It stalled when I realized that it is implicitly assumed that |
Agreed, this joins other issues, like #771 |
Should we submit an issue documenting what the intent is and the requirements on the type hierarchy, then close this PR till the required changes are carried out? |
I think we keep this one open as it should be fine provided we get an appropriate distribution supertype for the distribution. I think #681 might cover, at least some of, what is needed. |
This one was pretty broken. I think I'll change the
mean
pdf
function to utilize that the vector is sorted.