Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Develop model_default_odin() for compatibility with vignettes #254

Open
adamkucharski opened this issue Oct 18, 2024 · 18 comments
Open

Develop model_default_odin() for compatibility with vignettes #254

adamkucharski opened this issue Oct 18, 2024 · 18 comments
Assignees

Comments

@adamkucharski
Copy link
Member

We have an initial example of {odin} implementation of the overlapping interventions vignette on the odin-default-model branch.

The next step will to add the relevant input checking and output formatting so model_default_odin() behaves the same as model_default() (i.e. could swap out in the vignette and user wouldn't notice if both compiled models provided).

@bahadzie
Copy link
Member

Input checking and output formatting for model_default_odin() are done. I've added test-model_default_odin.R a duplicate of test-model_default.R swapping out model_default() for model_default_odin() for all the test that don't generate a warning or error.

All the failing test are related to your slack comment about allowing vectorised inputs.

All commits are in the odin-default-model branch.

Below is the summary of devtools::test().

Passing tests

✅ Default odin model: basic expectations, scalar arguments
✅ Default odin model: statistical correctness, parameters
✅ Default odin model: contacts interventions and stats. correctness
✅ Default odin model: rate interventions
✅ Default odin model: errors and warnings, scalar arguments
✅ Default odin model: errors on vectorised input

Warning test

⚠️ Default model: vaccination and stats. correctness

DLSODA-  Warning..Internal T (=R1) and H (=R2) are
      such that in the machine, T + H = T on the next step  
     (H = step size). Solver will continue anyway.
In above message, R1 = 200, R2 = 4.47248e-15

Failing tests

🐛 Default model: time dependence

Failure (test-model_default_odin.R:295:3): Default model: time dependence
all(epidemic_size(data_baseline) > epidemic_size(data)) is not TRUE

`actual`:   FALSE
`expected`: TRUE

🐛 Default model: infection parameters as vectors

Failure (test-model_default_odin.R:423:3): Default model: infection parameters as vectors
Expected `model_default_odin(...)` to run without any conditions.
i Actually got a <simpleError> with text:
  Expected a scalar numeric for 'beta'

Error (test-model_default_odin.R:431:3): Default model: infection parameters as vectors
Error in `self$set_user(user = user, unused_user_action = unused_user_action)`: Expected a scalar numeric for 'beta'
Backtrace:
    ▆
 1. └─epidemics::model_default_odin(...) at test-model_default_odin.R:431:3
 2.   └─seirv_model$new(...) at epidemics/R/model_default.R:614:3
 3.     └─odin10ab6dad (local) initialize(...)
 4.       └─self$set_user(user = user, unused_user_action = unused_user_action) at filefb9c3b0b9ae8/R/odin.R:60:7

🐛 Default model: composable elements as lists

Failure (test-model_default_odin.R:483:3): Default model: composable elements as lists
Expected `model_default_odin(uk_population, intervention = npi_list)` to run without any conditions.
i Actually got a <simpleError> with text:
  Expected a scalar numeric for 'beta'

Error (test-model_default_odin.R:488:3): Default model: composable elements as lists
Error in `self$set_user(user = user, unused_user_action = unused_user_action)`: Expected a scalar numeric for 'beta'
Backtrace:
    ▆
 1. └─epidemics::model_default_odin(uk_population, intervention = npi_list) at test-model_default_odin.R:488:3
 2.   └─seirv_model$new(...) at epidemics/R/model_default.R:614:3
 3.     └─odin10ab6dad (local) initialize(...)
 4.       └─self$set_user(user = user, unused_user_action = unused_user_action) at filefb9c3b0b9ae8/R/odin.R:60:7

Failure (test-model_default_odin.R:536:3): Default model: multi-parameter, multi-composables
Expected `model_default_odin(...)` to run without any conditions.
i Actually got a <simpleError> with text:
  Expected a scalar numeric for 'beta'

🐛 Default model: multi-parameter, multi-composables

Failure (test-model_default_odin.R:536:3): Default model: multi-parameter, multi-composables
Expected `model_default_odin(...)` to run without any conditions.
i Actually got a <simpleError> with text:
  Expected a scalar numeric for 'beta'

Error (test-model_default_odin.R:545:3): Default model: multi-parameter, multi-composables
Error in `self$set_user(user = user, unused_user_action = unused_user_action)`: Expected a scalar numeric for 'beta'
Backtrace:
    ▆
 1. └─epidemics::model_default_odin(...) at test-model_default_odin.R:545:3
 2.   └─seirv_model$new(...) at epidemics/R/model_default.R:614:3
 3.     └─odin10ab6dad (local) initialize(...)
 4.       └─self$set_user(user = user, unused_user_action = unused_user_action) at filefb9c3b0b9ae8/R/odin.R:60:7

@bahadzie
Copy link
Member

epidemics_odin.Rmd is a clone of epidemics.Rmd that compares model_default() with model_default_odin(). The results are equal to within a tolerance of 0.4.

@adamkucharski
Copy link
Member Author

adamkucharski commented Oct 25, 2024

Nice work, thanks for putting together! Should be fine to close this issue now these have been added, once below point about potential mismatch clarified? Then can move follow up steps to future issues.

A few quick questions just to make sure I'm following the changes:

  • Does seirv_model in model_default.R compile using {odin} when {epidemics} is first installed? A useful next step for useability could be to precompile the model so it can be loaded with the package, like in the {seir} package (unless I've missed somewhere this is already happening).
  • Do you have an idea of the speed difference between the two models? Both should be much faster than pure R simulation, but useful to know what we're looking at. Update: I pushed a quick time comparison on epidemics_odin.R, and seems like odin is actually about 10x faster:
Original: 0.629
odin: 0.0612
  • Am I correct that the vectorisation in the original model_default() is handled with apply() and Map() functions in the R function, rather than in C++ backend? If so, this should be more straightforward to add this feature to model_default_odin() (can make separate issue for this).
  • We've now got the overlapping interventions vignette covered (and hence the 'getting started with interventions one too'), so it seems that as well as including uncertainty/scenario functionality, the modelling vaccination vignettte would be the next target for additional functionality if we want model_default_odin to be a full copy of model_default from a user point of view. Vaccination would be good one for @alxsrobert to input on, as he's already got several models coded here. Unless I'm missing something obvious we should be looking at first.

@adamkucharski
Copy link
Member Author

adamkucharski commented Oct 25, 2024

Also noticed that we seem to have a mismatch in the outputs now. E.g. below using the original model_default:
Screenshot 2024-10-25 at 13 23 10

And this when we replace model_default with model_default_odin:
Screenshot 2024-10-25 at 13 22 26

The two outputs match when using the odin model in the modelling_scenarios_odin.Rmd example (which has below output):

Screenshot 2024-10-25 at 13 24 30

Any thoughts on what might have changed? From a quick look, it may be to do with the contact matrix normalisation in the correct vignette odin model, which doesn't seem to appear in model_default_odin?

  contact_matrix_norm <- population$contact_matrix
  contact_matrix_norm <- (contact_matrix_norm / max(Re(eigen(contact_matrix_norm)$values))) /
  population$demography_vector

@bahadzie
Copy link
Member

Any thoughts on what might have changed? From a quick look, it may be to do with the contact matrix normalisation in the correct vignette odin model, which doesn't seem to appear in model_default_odin?

Contact matrix normalisation code is in .prepare_population() which is called from .check_prepare_args_default() which in turn is called from model_default() and model_default_odin() respectively.

I mirrored the first 122 lines of model_default() in model_default_odin() because

  • it deals with parameter checking and preparation
  • didn't want to miss edge cases
  • keeping it constant means differences are downstream (solvers/output formatting)

Another potential source of the difference can be in the way odin expects its parameters to be supplied.

@bahadzie
Copy link
Member

Does seirv_model in model_default.R compile using {odin} when {epidemics} is first installed? A useful next step for useability could be to precompile the model so it can be loaded with the package, like in the {seir} package (unless I've missed somewhere this is already happening).

I think from a user's perspective it'll compile on library() because in development it compiles on devtools::load_all() I don't think {seir} uses odin even though it is mention in README.md it doesn't appear in DESCRIPTION or NAMESPACE. They have a /src folder with cpp files which I suspect are generated using {dust} but {dust} is not currently on CRAN. The seirv model is simple enough to compile quickly on library() and after that it's cached.

Am I correct that the vectorisation in the original model_default() is handled with apply() and Map() functions in the R function, rather than in C++ backend? If so, this should be more straightforward to add this feature to model_default_odin() (can make separate issue for this).

Yes. My preference would be to rename this issue and keep all things model_default_odin together here.

@adamkucharski
Copy link
Member Author

adamkucharski commented Oct 27, 2024

Contact matrix normalisation code is in .prepare_population() which is called from .check_prepare_args_default() which in turn is called from model_default() and model_default_odin() respectively.

I mirrored the first 122 lines of model_default() in model_default_odin() because

* it deals with parameter checking and preparation

* didn't want to miss edge cases

* keeping it constant means differences are downstream (solvers/output formatting)

Another potential source of the difference can be in the way odin expects its parameters to be supplied.

I revisited the tolerance check in epidemics_odin.Rmd to understand where difference might be. This is the entry with maximum difference in the output values, which corresponds to a very large difference in the epidemic dynamics:

> output[4548,]
    time demography_group compartment    value
   <num>           <char>      <char>    <num>
1:   303              40+ susceptible 27816495
> output_odin[4548,]
    time demography_group compartment    value
   <num>           <char>      <char>    <num>
1:   303              40+ susceptible 16413209

Noticed there was a discrepancy in the default R0 for the two models (1.5 vs 1.3 - also relates to issue #253 on misdescribed default in another vignette). Have pushed a fix and now this returns all.equal(output, output_odin)=T. The multiple interventions vignette also now outputting correctly (see below). So maybe we need stricter checks on model output tolerances?

Screenshot 2024-10-26 at 17 26 20

@adamkucharski
Copy link
Member Author

adamkucharski commented Oct 27, 2024

I think from a user's perspective it'll compile on library() because in development it compiles on devtools::load_all() I don't think {seir} uses odin even though it is mention in README.md it doesn't appear in DESCRIPTION or NAMESPACE. They have a /src folder with cpp files which I suspect are generated using {dust} but {dust} is not currently on CRAN. The seirv model is simple enough to compile quickly on library() and after that it's cached.

Thanks. Yep, {seir} uses odin.dust to compile odin models to dust (e.g. SIR model). But agree that as only {odin} on CRAN, makes sense to focus on this. And having run with fresh install, compilation is nice and quick (actually, seems smoother than the original model_default C++ version).

Yes. My preference would be to rename this issue and keep all things model_default_odin together here.

Sounds good – shall we just rename title and create new to-do list in thread?

Yes please

@bahadzie
Copy link
Member

Any thoughts on how to fix the warning. I spent all day trying to debug it. No luck yet.

⚠️ Default model: vaccination and stats. correctness

DLSODA-  Warning..Internal T (=R1) and H (=R2) are
     such that in the machine, T + H = T on the next step  
    (H = step size). Solver will continue anyway.
In above message, R1 = 200, R2 = 4.47248e-15

The code below (adapted from the test file test-model_default_odin.R includes everything required to replicate it.

rm(list = ls())
devtools::load_all()

generate_population <- function() {
  # Prepare contact matrix and demography vector
  polymod <- socialmixr::polymod
  contact_data <- socialmixr::contact_matrix(
    polymod,
    countries = "United Kingdom",
    age.limits = c(0, 60),
    symmetric = TRUE
  )
  contact_matrix <- t(contact_data$matrix)
  demography_vector <- contact_data$demography$population

  # make initial conditions - order is important
  initial_conditions <- c(
    S = 1 - 1e-6, E = 0,
    I = 1e-6, R = 0, V = 0
  )
  initial_conditions <- rbind(
    initial_conditions,
    initial_conditions
  )

  # create a population
  population(
    name = "UK population",
    contact_matrix = contact_matrix,
    demography_vector = demography_vector,
    initial_conditions = initial_conditions
  )
}
uk_population <- generate_population()


data_baseline <- model_default(uk_population)
data_baseline_odin <- model_default_odin(uk_population)
all.equal(data_baseline, data_baseline_odin)
## [1] TRUE

all.equal(epidemic_size(data_baseline), epidemic_size(data_baseline_odin))
## [1] "Mean relative difference: 3.632693e-07"


single_vaccination <- vaccination(
  name = "double_vaccination",
  nu = matrix(1e-3, nrow = 2),
  time_begin = matrix(0, nrow = 2),
  time_end = matrix(100, nrow = 2)
)
data <- model_default(uk_population, vaccination = single_vaccination)
data_odin <- model_default_odin(uk_population, vaccination = single_vaccination)
all.equal(data, data_odin)
## [1] "Column 'value': Mean relative difference: 1.898998"

all.equal(epidemic_size(data), epidemic_size(data_odin))
## [1] "Mean relative difference: 0.9805965"



high_rate_vax <- vaccination(
  time_begin = matrix(200, nrow(uk_population$contact_matrix)),
  time_end = matrix(200 + 150, nrow(uk_population$contact_matrix)),
  nu = matrix(0.01, nrow = nrow(uk_population$contact_matrix))
)
data_high_rate_vax <- model_default(
  uk_population,
  vaccination = high_rate_vax, time_end = 600
)
data_high_rate_vax_odin <- model_default_odin(
  uk_population,
  vaccination = high_rate_vax, time_end = 600
)
## Warning messages:
## 1: In lsoda(y, times, func, parms, ...) :
##   an excessive amount of work (> maxsteps ) was done, but integration was not successful - increase maxsteps
## 2: In lsoda(y, times, func, parms, ...) :
##   Returning early. Results are accurate, as far as they go

c(nrow(data_high_rate_vax), nrow(data_high_rate_vax_odin))
## [1] 6010 2010

rows_to_compare <- seq_len(nrow(data_high_rate_vax_odin))

all.equal(
  data_high_rate_vax[rows_to_compare, ],
  data_high_rate_vax_odin[rows_to_compare, ]
)
## [1] "Column 'value': Mean relative difference: 1.666696e-05"

all.equal(
  epidemic_size(data_high_rate_vax[rows_to_compare, ]),
  epidemic_size(data_high_rate_vax_odin[rows_to_compare, ])
)
## [1] "Numeric: lengths (2, 0) differ"

I've also {microbenchmark}ed model_default(slow) v model_default_odin(fast).

microbenchmark::microbenchmark(
  model_default(uk_population),
  model_default_odin(uk_population),
  model_default(uk_population, vaccination = single_vaccination),
  model_default_odin(uk_population, vaccination = single_vaccination)
)
## Unit: milliseconds
##                                                                 expr      min
##                                         model_default(uk_population) 4.819167
##                                    model_default_odin(uk_population) 2.531334
##       model_default(uk_population, vaccination = single_vaccination) 4.622542
##  model_default_odin(uk_population, vaccination = single_vaccination) 2.493667
##        lq     mean   median       uq       max neval
##  4.944188 5.187382 5.014063 5.147542 12.064501   100
##  2.601313 3.541654 2.691938 2.832730 70.017876   100
##  4.787147 4.943179 4.842876 4.941834  8.213376   100
##  2.605500 2.850439 2.681106 2.799334  6.832918   100

@adamkucharski
Copy link
Member Author

Looks like that's happening because the underlying ODE solver has ended with a very small (~0) time step, likely as the result of a sudden change in dynamics that it's struggling to integrate – maybe from a large influx of vaccination? Strange that it doesn't seem to be an issue for the simple_vaccination example. Tagging @hillalex who may also have some thoughts.

@adamkucharski
Copy link
Member Author

I spotted the problem by checking the values of vax_nu that were getting passed to {odin} via model_default_odin. Turns out in .check_prepare_args_default, it scales the vaccination rate to give the number of people vaccinated (see below code), whereas in odin model, we're currently implementing as a per-capita vaccination rate (i.e. proportion of S group vaccinated per unit time).

# calculate vax_nu as the NUMBER of vaccinations per timestep
# rather than a fraction/rate. See issue #198 for more details.
vax_nu <- vax_nu * mod_args[["population"]][["demography_vector"]]

So {odin} returns an error because we're removing susceptible at an order of magnitude too high rate, turning the ODEs into a system that can't be accurately solved.

I've pushed an edit to the {odin} model that I think fixes this, i.e. swapping removal rate from vax_rate[i]*S[i] to min(vax_rate[i],S[i]).

@adamkucharski adamkucharski changed the title Add input valid to model_default_odin() Develop model_default_odin() for compatibility with vignettes Nov 2, 2024
@bahadzie
Copy link
Member

bahadzie commented Nov 4, 2024

I'm getting a test failure when I tried swapping out model_default() with model_default_odin(). I've pushed a commit to see if this is the same behavior for others. devtools::test() output below.

Failure (test-model_default_odin.R:260:3): Default model: vaccination and stats. correctness
Check on 'data[grepl("susceptible", data$compartment, fixed = TRUE), ]$value' failed: Element 661 is not >= 0
Backtrace:

  1. └─checkmate::expect_numeric(...) at test-model_default_odin.R:260:3
  2. └─checkmate::makeExpectation(x, res, info, label)

[ FAIL 1 | WARN 0 | SKIP 0 | PASS 627 ]

@bahadzie
Copy link
Member

bahadzie commented Nov 4, 2024

I've pushed an edit to the {odin} model that I think fixes this, i.e. swapping removal rate from vax_rate[i]*S[i] to min(vax_rate[i],S[i]).

Don't think I understand why min?

We could try passing what the odin model expects instead of what the C++ implementation expects?

@adamkucharski
Copy link
Member Author

adamkucharski commented Nov 4, 2024

Don't think I understand why min?

We could try passing what the odin model expects instead of what the C++ implementation expects?

The min is to ensure that we don't vaccinate more people than there are susceptibles remaining. Basically there are two main ways to implement vaccination: 1) vaccinate a relative % per day (i.e. vax_rate[i]*S[i]) or 2) vaccinate an absolute number per day (min(vax_rate[i],S[i]). Issue #198 points out that (2) is often more realistic, because we'll have a stock of vaccine we can distribute per day, rather than just aiming to reach a fixed % then stopping each day. So I'm happy to keep this new version for now - but can discuss possible extension options more with @alxsrobert

@adamkucharski
Copy link
Member Author

I'm getting a test failure when I tried swapping out model_default() with model_default_odin(). I've pushed a commit to see if this is the same behavior for others. devtools::test() output below.

Looks like the issue is a very small amount of overshoot in the solver because min() used (i.e. it removes a large number of vaccinated individuals from S before the solver catches up to fact S is now slightly smaller). So we get a value of -3.5e-7 at the tail end of the epidemic.

Using a scaling like in issue #198 may fix this, but could introduce some solver ineffeciencies.

@bahadzie
Copy link
Member

bahadzie commented Nov 6, 2024

In the latest commit model_default_odin() now works with vector parameters. devtools::test() now passes the test below.

beta <- rnorm(10, 1.3 / 7, sd = 0.01)
sigma <- rnorm(10, 0.5, sd = 0.01)
gamma <- rnorm(10, 1 / 7, sd = 0.01)

test_that("Default odin model: infection parameters as vectors", {

@adamkucharski
Copy link
Member Author

adamkucharski commented Nov 6, 2024

Thanks! I've run through the vignettes using a version of the package with the odin model (i.e. model_default = model_default_odin in model_default.R).
Here's a summary of checks on which ones ran and produced figures that are the same (based on visual check) as the original model vignettes:

Modelling interventions

  • modelling_interventions.Rmd
  • modelling_multiple_interventions.Rmd
  • modelling_rate_interventions.Rmd
    Two notes: 1) this wasn't running initially because the dimension of the intervention was transposed in model_default - once this transposition was removed, the vignette ran; 2) the vignette produced a different plot to the original Cpp model for the same reduction in transmission (i.e. intervention had greater impact in {odin} model). Looking more, this is because the current model_default doesn't specify which parameter is being targeted by the rate intervention - so it's passed to odin model to reduce contacts (like other contact interventions). Not sure why they don't give same answer as it should be multiplicative (beta*contacts in the model). But there may be a subtlety in the cpp implementation I'm missing.

Vaccination

  • modelling_vaccination.Rmd
    Note: This runs but vignette quite brief currently - should be addressed in tutorial, so could bring some of this plotting code for comparisons over to vignette. Also noting the numerical issues with modelling linear declines in S above

Time dependence

  • modelling_time_dependence.Rmd

Advanced modelling

  • modelling_param_uncertainty.Rmd
  • modelling_scenarios.Rmd Note: the code runs, but impact of mask interventions are not same as existing vignette (see notes about modelling_rate_interventions above)
  • finalsize_comparison.Rmd

Package vignettes

  • epidemics.Rmd
  • README.Rmd

Other observations

  • intervention() is an {epidemics} function but also used to define a variable inside the model (i.e. list of input interventions). Probably not an issue from a loaded package point of view, but caught me out while debugging.

@adamkucharski
Copy link
Member Author

I also noticed that the odin multiplicative interventions weren't consistent with the additive spec in the design principles (see issue #255) so have pushed a fix (which actually has the result of simplifying the odin code – although will still need a bit more thought for the modelling_rate_interventions component).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants