
Improve correlation tests. #287

Merged
merged 2 commits into from
Apr 3, 2024

Conversation


@plietar plietar commented Mar 26, 2024

The tests for the correlation parameters were using a tolerance of 1e2 everywhere. Since the tolerance is relative, assertions would effectively succeed for any value within a few orders of magnitude. Setting obviously wrong expected results in the assertions did not produce any errors.
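To see why a relative tolerance of 1e2 masks wrong expected values, here is a minimal sketch of the idea of a relative comparison (this is an illustration only, not testthat's actual `expect_equal` implementation):

```python
def within_relative_tolerance(actual, expected, tolerance):
    # Relative tolerance: the difference is scaled by the expected value
    # before comparing against the tolerance.
    return abs(actual - expected) / abs(expected) < tolerance

# With tolerance = 1e2, even a 50x error passes the check.
print(within_relative_tolerance(500.0, 10.0, 1e2))  # True
# With tolerance = 0.1, the same comparison fails as it should.
print(within_relative_tolerance(500.0, 10.0, 0.1))  # False
```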

This switches the tolerance to 0.1. This value was chosen through experimentation, as something that didn't cause false negatives even after many runs, while remaining reasonably close. When using 0.01 I did get 1 failure in 100 runs of the test suite.

Some of the assertions were using incorrect expected values, which flew under the radar because of the huge tolerance. I've also cleaned up the tests a bit and made them more consistent with one another.

Finally, I changed the `CorrelationParameters` constructor to accept only the values it needs (population size and intervention booleans), rather than the full simulation parameters. This makes the tests a bit more concise, and will also help with upcoming tests that work on restoring correlation state while adding interventions. The existing public wrapper `get_correlation_parameters` still has the same interface as before.
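The shape of that refactor can be sketched as follows. This is a hypothetical Python illustration of narrowing a constructor's inputs while keeping the public wrapper's interface; the names, fields, and signatures here are assumptions, not the package's actual R code:

```python
class CorrelationParameters:
    # The constructor now takes only what it needs: the population size
    # and which interventions are enabled, not the full parameter set.
    def __init__(self, population: int, interventions: list[str]):
        self.population = population
        self.interventions = interventions

def get_correlation_parameters(parameters: dict) -> CorrelationParameters:
    # The public wrapper keeps its old interface, extracting the required
    # fields from the full simulation parameters.
    return CorrelationParameters(
        parameters["population"],
        [name for name, on in parameters["interventions"].items() if on],
    )

params = {"population": 1000, "interventions": {"pev": True, "mda": False}}
cp = get_correlation_parameters(params)
print(cp.population, cp.interventions)  # 1000 ['pev']
```

Tests can now construct a `CorrelationParameters` directly from two small values instead of building a full parameter set first.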

@plietar plietar requested a review from giovannic March 26, 2024 15:58

@giovannic giovannic left a comment


Thanks for spotting this! Much cleaner tests too

Comment on lines 64 to 65
expect_equal(sum(vaccine_sample & mda_sample), pop * .04, tolerance=.1)
expect_equal(sum(vaccine_sample | mda_sample), pop * .36, tolerance=.1)
Member

Where do these population proportions come from? I would expect .1 and .3

Perhaps it would be clearer to replace `vaccine_coverage` and `mda_coverage` with `coverage <- .2`. And then test

Suggested change
expect_equal(sum(vaccine_sample & mda_sample), pop * .04, tolerance=.1)
expect_equal(sum(vaccine_sample | mda_sample), pop * .36, tolerance=.1)
expect_equal(sum(vaccine_sample & mda_sample), pop * coverage * .5, tolerance=.1)
expect_equal(sum(vaccine_sample | mda_sample), pop * coverage * 1.5, tolerance=.1)

Member Author

With a correlation of zero, these are independent events, so the probability of being in both is `vaccine_coverage * mda_coverage`, i.e. 0.04.

The probability of being in either/or is vaccine + mda - (vaccine * mda), or 0.2 + 0.2 - 0.04 = 0.36.

I agree having the explicit calculation is clearer, I'll change to that.
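The arithmetic above can be checked with a quick independent-sampling simulation (a sketch under the zero-correlation assumption, using the coverage values from this discussion, not the package's sampling code):

```python
import random

random.seed(0)
pop = 100_000
vaccine_coverage = 0.2
mda_coverage = 0.2

# With zero correlation, each person is sampled independently
# for each intervention.
vaccine = [random.random() < vaccine_coverage for _ in range(pop)]
mda = [random.random() < mda_coverage for _ in range(pop)]

both = sum(v and m for v, m in zip(vaccine, mda))
either = sum(v or m for v, m in zip(vaccine, mda))

# Expected: P(both) = 0.2 * 0.2 = 0.04
#           P(either) = 0.2 + 0.2 - 0.04 = 0.36
print(both / pop, either / pop)  # close to 0.04 and 0.36
```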

Member Author

Added the explicit calculations.

I actually decided to go in the opposite direction, keeping the separate variables for pev and mda but using different values for each of them. This feels like a more meaningful test. I've renamed `vaccine_xx` to `pev_xx` to match the name of the intervention passed as a string.

@plietar plietar requested a review from giovannic March 27, 2024 15:11
@plietar plietar merged commit 2a3d4cc into dev Apr 3, 2024
4 checks passed
@plietar plietar deleted the correlations-fix branch April 3, 2024 09:56
@giovannic giovannic mentioned this pull request Sep 11, 2024
Merged