-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor chi-squared distribution #77
base: master
Are you sure you want to change the base?
Conversation
Codecov Report
@@ Coverage Diff @@
## master #77 +/- ##
==========================================
+ Coverage 92.24% 92.67% +0.42%
==========================================
Files 44 44
Lines 7187 7299 +112
==========================================
+ Hits 6630 6764 +134
+ Misses 557 535 -22
Continue to review full report at Codecov.
|
src/distribution/chi_squared.rs
Outdated
if x > 0.0 { | ||
self.g.ln_pdf(x) | ||
} else { | ||
-f64::INFINITY |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
conventionally we've been using f64::NEG_INFINITY
, I'm not actually 100% sure -f64::INFINITY == f64::NEG_INFINITY
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I completely missed this one.
BTW, I didn't know that these constants are defined like this 😁 :
pub const INFINITY: f64 = 1.0f64 / 0.0f64
pub const NEG_INFINITY: f64 = -1.0f64 / 0.0f64
@@ -310,7 +310,11 @@ impl Continuous<f64, f64> for ChiSquared { | |||
/// | |||
/// where `k` is the degrees of freedom and `Γ` is the gamma function | |||
fn pdf(&self, x: f64) -> f64 { | |||
self.g.pdf(x) | |||
if x > 0.0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can be more specific, x == 0.0
is defined (at least according to Wikipedia) as long as freedom != 1.0
, so we can probably do something like if x > 0.0 || (self.freedom != 1.0 && x == 0.0)
. May want to consider prec::almost_eq
instead of ==
or !=
for floats but I'm just spitballing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I screwed up here. Should've done better research.
The thing is a little bit more complicated. Wikipedia silently assumes only integer degrees of freedom. If we take freedom
as a positive real, then we have 3 cases:
0 < freedom < 2
:pdf(0) = +∞
, hence0
is excluded from the supportfreedom == 2
:pdf(0) = 0.5
2 < freedom
:pdf(0) = 0
This is obvious from the pdf formula (see Wikipedia), specifically the x^(k/2 - 1)
term. All other terms are non-zero for any freedom
.
Anyway, I've tried out scipy
's implementation of chi-squared. Its behaviour is exactly as above (i.e. +∞
, 0.5
and 0
).
So the question is, what to do for the case when 0 < freedom < 2
: Should we let the pdf function return +∞
(as scipy
does)? Or should we handle it as a separate case?
My personal preference right now is to return +∞
, but document this behaviour in the documentation.
Edit: wording.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that's fine by me as long as it's clear in the documentation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok! Just to make sure: You prefer to return +∞
when 0 < freedom < 2
and pdf is evaluated at 0
.
Then I'll rewrite the unit tests and update the docs.
test_almost(2.5, 0.045412171451573920401, 1e-15, |x| x.pdf(5.5)); | ||
test_almost(2.5, 1.8574923023527248767e-24, 1e-36, |x| x.pdf(110.1)); | ||
test_case(2.5, 0.0, |x| x.pdf(f64::INFINITY)); | ||
// test_case(f64::INFINITY, 0.0, |x| x.pdf(0.0)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does pdf
return for freedom == f64::INFINITY
normally? I'd like to have this case defined but am not necessarily sure we need an explicit branch in the code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It returns a NaN
. I guess as a result of 0 * Inf
when computing the pdf.
TBH I don't like the idea of returning 0
instead of NaN
because it doesn't make sense for me... Shouldn't a pdf integrate to 1
over support? How can it be upheld when pdf()
returns 0
everywhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. I think maybe being explicit about the NaN
would be good, like adding a specific branch and note it in the docstring
Yup, I think being consistent with scipy is probably a good thing
…On Oct 30, 2017 11:52 AM, "Mikhail Pak" ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In src/distribution/chi_squared.rs
<#77 (comment)>:
> @@ -310,7 +310,11 @@ impl Continuous<f64, f64> for ChiSquared {
///
/// where `k` is the degrees of freedom and `Γ` is the gamma function
fn pdf(&self, x: f64) -> f64 {
- self.g.pdf(x)
+ if x > 0.0 {
Ok! Just to make sure: You prefer to return +∞ when 0 < freedom < 2 and
pdf is evaluated at 0.
Then I'll rewrite the unit tests and update the docs.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#77 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ACwkvpSBpFsyFbCHnZeBEr1mygWosyVlks5sxfDagaJpZM4P_MEl>
.
|
Mode cannot be negative
85d6884
to
2df72f2
Compare
Hey @mp4096 what's the status on this PR? Do you need any help bringing it across the finish line? |
Ok, this is a larger one:
Mode
Fix formula for the mode. Math.NET implementation has an error for
freedom < 2
and faulty unit tests as well.Pdf and log density
When investigating this unit test:
I realised the problem two-fold: First, the old pdf implementation returned
Inf
at0
instead of0
. Second, forfreedom == 1.0
the pdf is very steep around0
and cannot be integrated with the hard-coded step size. So I implemented a branching in the pdf and log density (also see Wikipedia) and changedfreedom
to1.5
.Actually, Math.NET has a little bit more branching in the chi-squared pdf, but the bulk of it is covered in
statrs
'sgamma.rs
implementation. I'm not sure if we want to cover the case offreedom = Inf
. @boxtown What's your opinion on this? If yay, I'll add this branching topdf()
andln_pdf()
and uncomment the unit tests. If nay, I'll just delete the commented unit tests.Misc
And of course added more unit tests, but this is kind of self-explanatory.