Skip to content

Commit

Permalink
Split school types (#385)
Browse files Browse the repository at this point in the history
  • Loading branch information
twallema authored Oct 20, 2023
1 parent dc22e9d commit 1dfc7fd
Show file tree
Hide file tree
Showing 48 changed files with 63,728 additions and 63,836 deletions.
2 changes: 1 addition & 1 deletion data/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -92,7 +92,7 @@ A folder containing all raw simulations used for the RESTORE reports, the report

#### Belgian Census 2011

+ `Pop_LPW_NL_25FEB15.xlsx` contains the working population per sex, place of residence and place of work. First, the raw spreadsheet `data/covid19_DTM/raw/census_2011/Pop_LPW_NL_25FEB15.xlsx` was modified in MS Excel and placed in the data/covid19_DTM/interim folder under the name `Pop_LPW_NL_25FEB15_delete_unknown.xlsx`. Data free for download at https://census2011.fgov.be/download/downloads_nl.html .
+ `Pop_LPW_NL_25FEB15.xlsx` contains the working population per sex, place of residence and place of work. First, the raw spreadsheet `data/covid19_DTM/raw/census_2011/Pop_LPW_NL_25FEB15.xlsx` was modified in MS Excel and placed in the data/covid19_DTM/interim folder under the name `Pop_LPW_NL_25FEB15_delete_unknown.xlsx`. Data free for download at https://census2011.fgov.be/download/downloads_nl.html.
+ `census_demo_nl_04nov14.xlsx` contains all demographic data from the 2011 Belgian census. From these data, the number of individuals in 10 year age-bins per Belgian arrondissement are calculated. The conversion notebook is `notebooks/0.1-twallema-extract-census-data.ipynb`. Data free for download at https://census2011.fgov.be/download/downloads_nl.html .
+ `census_arbeidsmarkt_nl_24oct14.xlsx` contains all work related data from the 2011 Belgian census. Data free for download at https://census2011.fgov.be/download/downloads_nl.html .

Expand Down

Large diffs are not rendered by default.

This file was deleted.

Large diffs are not rendered by default.

This file was deleted.

Large diffs are not rendered by default.

Binary file modified data/covid19_DTM/interim/sciensano/vacc_incidence_national.pkl
Binary file not shown.
Binary file modified data/covid19_DTM/interim/sciensano/vacc_incidence_prov.pkl
Binary file not shown.
126,264 changes: 63,132 additions & 63,132 deletions data/covid19_DTM/interim/sciensano/vacc_rescaling_values_prov.csv

Large diffs are not rendered by default.

Binary file modified data/covid19_DTM/interim/sciensano/vacc_rescaling_values_prov.pkl
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Original file line number Diff line number Diff line change
Expand Up @@ -11,9 +11,9 @@ Date,Lower CI,Upper CI,Seroprevalence
23 Jul 2020,0.035,0.069,0.052
5 Aug 2020,0.031,0.066,0.048
19 Aug 2020,0.042,0.079,0.06
2 Sept 2020,0.03,0.069,0.049
16 Sept 2020,0.028,0.062,0.044
30 Sept 2020,0.03,0.069,0.048
2 Sep 2020,0.03,0.069,0.049
16 Sep 2020,0.028,0.062,0.044
30 Sep 2020,0.03,0.069,0.048
14 Oct 2020,0.038,0.077,0.057
28 Oct 2020,0.068,0.12,0.092
11 Nov 2020,0.106,0.167,0.136
Expand All @@ -29,4 +29,4 @@ Date,Lower CI,Upper CI,Seroprevalence
25 May 2021,0.575,0.651,0.611
22 Jun 2021,0.759,0.815,0.787
20 Jul 2021,0.955,0.992,0.975
1 Aug 2021,,,
1 Aug 2021,,,
22 changes: 10 additions & 12 deletions environment.yml
Original file line number Diff line number Diff line change
@@ -1,19 +1,17 @@
name: COVID_MODEL
channels:
- defaults
- conda-forge
- defaults
dependencies:
- python=3.10
- jupyter
- python=3.11.4
- pandas
- numpy
- scipy
- numba
- xarray
- matplotlib
- numpy
- scipy
- xarray
- pandas
- emcee
- h5py
- corner
- geopandas
- zarr
- SAlib
- numba
- h5py
- tqdm
- click
29 changes: 15 additions & 14 deletions notebooks/calibration/calibrate_BASE-COVID19_SEIQRD_hybrid_vacc.py
Original file line number Diff line number Diff line change
@@ -1,10 +1,10 @@
"""
This script contains a calibration of national COVID-19 SEIQRD model to hospitalization data in Belgium.
Example function call
---------------------
Example
-------
python calibrate_BASE-COVID19_SEIQRD_hybrid_vacc.py -e 2021-08-23 -n_ag 10 -ID test
python calibrate_BASE-COVID19_SEIQRD_hybrid_vacc.py -e 2021-11-01 -n_ag 10 -ID test
"""

__author__ = "Tijs Alleman"
Expand All @@ -19,12 +19,12 @@
import ast
import click
import json
import datetime
import argparse
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import multiprocessing as mp
from datetime import date, datetime, timedelta
# COVID-19 code
from covid19_DTM.models.utils import initialize_COVID19_SEIQRD_hybrid_vacc
from covid19_DTM.data import sciensano
Expand Down Expand Up @@ -76,13 +76,15 @@
# Number of age groups used in the model
age_stratification_size=int(args.n_age_groups)
# Date at which script is started
run_date = str(datetime.date.today())
run_date = str(date.today())
# Keep track of runtime
initial_time = datetime.datetime.now()
initial_time = datetime.now()
# Start and end of calibration
start_calibration = pd.to_datetime(args.start_calibration)
start_calibration = datetime.strptime(args.start_calibration,"%Y-%m-%d")
if args.end_calibration:
end_calibration = pd.to_datetime(args.end_calibration)
end_calibration = datetime.strptime(args.end_calibration,"%Y-%m-%d")
# Leap size
tau = 0.50

##############################
## Define results locations ##
Expand Down Expand Up @@ -117,10 +119,9 @@
## Initialize the model ##
##########################

model, BASE_samples_dict, initN = initialize_COVID19_SEIQRD_hybrid_vacc(age_stratification_size=age_stratification_size, update_data=False, start_date=start_calibration.strftime("%Y-%m-%d"),
model, BASE_samples_dict, initN = initialize_COVID19_SEIQRD_hybrid_vacc(age_stratification_size=age_stratification_size, update_data=False, start_date=start_calibration,
stochastic=True, distinguish_day_type=True)

tau = 0.5
# Deterministic
model.parameters['beta'] = 0.027 # R0 = 3.31 --> https://pubmed.ncbi.nlm.nih.gov/32498136/
warmup = 0# 39 # Start 5 Feb. 2020: day of first detected COVID-19 infectee in Belgium
Expand Down Expand Up @@ -203,7 +204,7 @@
#out = pso.optimize(objective_function, bounds, kwargs={'simulation_kwargs':{'warmup': warmup}},
# swarmsize=multiplier_pso*processes, max_iter=n_pso, processes=processes, debug=True)[0]
# A good guess
theta = [0.50, 0.56, 5000, 1.40, 1.90, 0.225, 0.60]
theta = [0.50, 0.56, 5000, 1.45, 1.80, 0.225, 0.60]

# Nelder-mead
#step = len(bounds)*[0.05,]
Expand All @@ -220,10 +221,10 @@
# Assign estimate
model.parameters = assign_theta(model.parameters, pars, theta)
# Perform simulation
end_visualization = '2022-07-01'
out = model.sim([start_calibration, pd.Timestamp(end_visualization)], warmup=warmup)
end_visualization = datetime(2022, 7, 1)
out = model.sim([start_calibration, end_visualization], warmup=warmup)
# Visualize fit
ax = plot_PSO(out, data, states, start_calibration-pd.Timedelta(days=warmup), end_visualization)
ax = plot_PSO(out, data, states, start_calibration-timedelta(days=warmup), end_visualization)
plt.show()
plt.close()

Expand Down
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
"""
This script contains a calibration of the spatial COVID-19 SEIQRD model to hospitalization data in Belgium during the period 2020-03-15 until 2021-10-07.
This script contains a calibration of the spatial COVID-19 SEIQRD model to hospitalization data in Belgium during the period 2020-03-15 until 2021-11-01.
"""

__author__ = " Tijs W. Alleman, Michiel Rollier"
Expand All @@ -21,6 +21,7 @@
import numpy as np
import matplotlib.pyplot as plt
import multiprocessing as mp
from datetime import date, datetime, timedelta
# COVID-19 code
from covid19_DTM.models.utils import initialize_COVID19_SEIQRD_spatial_hybrid_vacc
from covid19_DTM.data import sciensano
Expand Down Expand Up @@ -88,15 +89,17 @@
# Number of age groups used in the model
age_stratification_size=int(args.n_age_groups)
# Date at which script is started
run_date = str(datetime.date.today())
run_date = str(date.today())
# Keep track of runtime
initial_time = datetime.datetime.now()
initial_time = datetime.now()
# Show progress
progress=True
# Start and end of calibration
start_calibration = pd.to_datetime(args.start_calibration)
start_calibration = datetime.strptime(args.start_calibration,"%Y-%m-%d")
if args.end_calibration:
end_calibration = pd.to_datetime(args.end_calibration)
end_calibration = datetime.strptime(args.end_calibration,"%Y-%m-%d")
# Leap size
tau = 0.50

##############################
## Define results locations ##
Expand Down Expand Up @@ -136,10 +139,8 @@
## Initialize the model ##
##########################

model, BASE_samples_dict, initN = initialize_COVID19_SEIQRD_spatial_hybrid_vacc(age_stratification_size=age_stratification_size, agg=agg,
start_date=start_calibration.strftime("%Y-%m-%d"), stochastic=True,
distinguish_day_type=True)
tau = 0.50
model, BASE_samples_dict, initN = initialize_COVID19_SEIQRD_spatial_hybrid_vacc(age_stratification_size=age_stratification_size, agg=agg,start_date=start_calibration,
stochastic=True, distinguish_day_type=True)

if agg == 'arr':
# Switch to the provinicial initN
Expand Down Expand Up @@ -190,18 +191,20 @@
processes = int(os.getenv('SLURM_CPUS_ON_NODE', mp.cpu_count()/2))
multiplier_pso = 3
# MCMC settings
multiplier_mcmc = 10
multiplier_mcmc = 5
max_n = n_mcmc
print_n = 5
print_n = 10
# Define dataset
data=[df_hosp.loc[(slice(start_calibration,end_calibration), slice(None))],
df_hosp.loc[(slice(start_calibration,end_calibration), slice(None))].groupby(by=['date']).sum(),
df_sero_herzog['abs','mean'],
df_sero_sciensano['abs','mean'][:23],
df_cases]
states = ["H_in", "R", "R", "M_in"]
weights = np.array([1, 1, 1, 1])
log_likelihood_fnc = [ll_negative_binomial, ll_negative_binomial, ll_negative_binomial, ll_negative_binomial] # For arr calibration --> use poisson
states = ["H_in", "H_in", "R", "R", "M_in"]
weights = np.array([1, 1, 1, 1, 1])
log_likelihood_fnc = [ll_negative_binomial, ll_negative_binomial, ll_negative_binomial, ll_negative_binomial, ll_negative_binomial] # For arr calibration --> use poisson
log_likelihood_fnc_args = [dispersion_hosp.values,
dispersion_weighted_hosp,
dispersion_weighted_hosp,
dispersion_weighted_hosp,
dispersion_cases]
Expand Down Expand Up @@ -246,7 +249,7 @@
# out = pso.optimize(objective_function, bounds, kwargs={'simulation_kwargs':{'warmup': 0}},
# swarmsize=multiplier_pso*processes, maxiter=n_pso, processes=processes, debug=True)[0]
# A good guess
theta = [0.5, 0.56, 5000, 0.50, 0.4, 0.7, 1.4, 1.9, 0.225, 0.6]
theta = [0.5, 0.56, 5000, 0.45, 0.40, 0.70, 1.4, 1.9, 0.225, 0.6]

#######################################
## Visualize fits on multiple levels ##
Expand All @@ -257,13 +260,13 @@
print(theta)
pars_PSO = assign_theta(model.parameters, pars, theta)
model.parameters = pars_PSO
end_visualization = '2022-01-01'
end_visualization = datetime(2022, 1, 1)
# Perform simulation with best-fit results
out = model.sim([start_calibration, pd.Timestamp(end_visualization)], tau=tau)
out = model.sim([start_calibration, end_visualization], tau=tau)
# Aggregate Brussels and Brabant
out = aggregate_Brussels_Brabant_Dataset(out)
# National fit
data_star=[data[0].groupby(by=['date']).sum(), df_sero_herzog['abs','mean'], df_sero_sciensano['abs','mean'][:23], data[-1].groupby(by=['date']).sum(),]
data_star=[data[0].groupby(by=['date']).sum(), data[0].groupby(by=['date']).sum(), df_sero_herzog['abs','mean'], df_sero_sciensano['abs','mean'][:23], data[-1].groupby(by=['date']).sum(),]
ax = plot_PSO(out, data_star, states, start_calibration, end_visualization)
plt.show()
plt.close()
Expand Down Expand Up @@ -294,7 +297,7 @@
pars_PSO = assign_theta(model.parameters, pars, theta)
model.parameters = pars_PSO
# Perform simulation
out = model.sim([start_calibration, pd.Timestamp(end_visualization)], tau=tau)
out = model.sim([start_calibration, end_visualization], tau=tau)
# Aggregate Brussels and Brabant
out = aggregate_Brussels_Brabant_Dataset(out)
# Visualize national fit
Expand Down
2 changes: 1 addition & 1 deletion notebooks/calibration/emcee_sampler_to_dictionary.py
Original file line number Diff line number Diff line change
Expand Up @@ -152,7 +152,7 @@
smooth=1,
label_kwargs=dict(fontsize=24),
title_kwargs=dict(fontsize=14),
quantiles=[0.05, 0.95],
title_quantiles=[0.05, 0.50, 0.95],
levels=(1 - np.exp(-0.5), 1 - np.exp(-2), 1 - np.exp(-9 / 2.)),
plot_density=True,
plot_datapoints=False,
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,7 @@
"""

__author__ = "Tijs Alleman"
__copyright__ = "Copyright (c) 2022 by T.W. Alleman, BIOMATH, Ghent University. All Rights Reserved."
__copyright__ = "Copyright (c) 2023 by T.W. Alleman, BIOMATH, Ghent University. All Rights Reserved."

############################
## Load required packages ##
Expand All @@ -37,6 +37,7 @@
import argparse
import numpy as np
import pandas as pd
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
from covid19_DTM.models.utils import initialize_COVID19_SEIQRD_hybrid_vacc
from covid19_DTM.data import sciensano
Expand Down Expand Up @@ -67,7 +68,7 @@
################################

# Start and end of simulation
end_sim = '2022-01-01'
end_sim = datetime(2022,1,1)
# Confidence level used to visualise model fit
conf_int = 0.05

Expand All @@ -91,10 +92,10 @@
samples_dict = load_samples_dict(samples_path+str(args.agg)+'_'+str(args.identifier) + '_SAMPLES_' + str(args.date) + '.json', age_stratification_size=age_stratification_size)
warmup = samples_dict['warmup']
# Start of calibration warmup and beta
start_calibration = samples_dict['start_calibration']
start_calibration = datetime.strptime(samples_dict['start_calibration'], '%Y-%m-%d')
start_sim = start_calibration
# Last datapoint used to calibrate warmup and beta
end_calibration = samples_dict['end_calibration']
end_calibration = datetime.strptime(samples_dict['end_calibration'], '%Y-%m-%d')
# Overdispersion data
dispersion = float(samples_dict['dispersion'])

Expand Down Expand Up @@ -160,7 +161,7 @@
ax2.plot(df_2plot['H_in','mean'], color='blue', linewidth=1.5)
ax2.fill_between(simtime, df_2plot['H_in','lower'], df_2plot['H_in','upper'],alpha=0.20, color = 'blue')
ax2.scatter(df_hosp[start_calibration:end_calibration].index,df_hosp['H_in'][start_calibration:end_calibration], color='red', alpha=0.2, linestyle='None', facecolors='red', s=10)
ax2.scatter(df_hosp[pd.to_datetime(end_calibration)+datetime.timedelta(days=1):end_sim].index,df_hosp['H_in'][pd.to_datetime(end_calibration)+datetime.timedelta(days=1):end_sim], color='black', alpha=0.2, linestyle='None', facecolors='black', s=10)
ax2.scatter(df_hosp[pd.to_datetime(end_calibration)+timedelta(days=1):end_sim].index,df_hosp['H_in'][pd.to_datetime(end_calibration)+timedelta(days=1):end_sim], color='black', alpha=0.2, linestyle='None', facecolors='black', s=10)
ax2 = _apply_tick_locator(ax2)
ax2.set_xlim(start_sim,end_sim)
ax2.set_ylabel('Incidence\nHospital (-)', fontsize=13)
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -39,10 +39,12 @@
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from datetime import datetime
from covid19_DTM.data import sciensano
from covid19_DTM.visualization.output import _apply_tick_locator
# Import the function to initialize the model
from covid19_DTM.models.utils import initialize_COVID19_SEIQRD_spatial_hybrid_vacc, output_to_visuals, add_negative_binomial

#############################
## Handle script arguments ##
#############################
Expand All @@ -66,8 +68,7 @@
################################

# Start and end of simulation
start_sim = '2020-09-01'
end_sim = '2022-01-01'
end_sim = datetime(2022,1,1)
# Confidence level used to visualise model fit
conf_int = 0.05

Expand All @@ -92,11 +93,12 @@
samples_dict = load_samples_dict(samples_path+str(args.agg)+'_'+str(args.identifier) + '_SAMPLES_' + str(args.date) + '.json', age_stratification_size=age_stratification_size)
warmup = float(samples_dict['warmup'])
dispersion = float(samples_dict['dispersion'])
tau = float(samples_dict['dispersion'])
tau = float(samples_dict['tau'])
# Start of calibration warmup and beta
start_calibration = samples_dict['start_calibration']
start_calibration = datetime.strptime(samples_dict['start_calibration'], '%Y-%m-%d')
start_sim = start_calibration
# Last datapoint used to calibrate warmup and beta
end_calibration = samples_dict['end_calibration']
end_calibration = datetime.strptime(samples_dict['end_calibration'], '%Y-%m-%d')

##################################################
## Load data not needed to initialize the model ##
Expand Down Expand Up @@ -138,7 +140,7 @@

print('\n1) Simulating spatial COVID-19 SEIRD '+str(args.n_samples)+' times')
start_sim = start_calibration
out = model.sim([start_sim,end_sim], warmup=warmup, N=args.n_samples, draw_function=draw_fnc, samples=samples_dict, processes=int(args.processes), tau=0.75)
out = model.sim([start_sim,end_sim], warmup=warmup, N=args.n_samples, draw_function=draw_fnc, samples=samples_dict, processes=int(args.processes), tau=tau)
# Aggregate Brussels and Brabant
out = aggregate_Brussels_Brabant_Dataset(out)
simtime = out['date'].values
Expand Down
Loading

0 comments on commit 1dfc7fd

Please sign in to comment.