[New] Extending support for County level COVID-19 Impact Assessment #851

AnujTiwari · 2021-06-25T04:24:12Z

Hi All, Please help me to utilize the CovsirPhy for US County Level COVID-19 Impact Assessment.
My aim is to index all 3142 US counties as per the impact of COVID-19.

I can request the local health department for clinical datasets. Please help me to understand the following points:

What are the minimum parameters required to perform the county-level COVID-19 impact assessment using CovsirPhy.
If I can collect county-level vaccination and variant details, is it possible to utilize them with the current model.
What is the best output parameter (like R0) we can use for comparative evaluation of COVID-19 impact and indexing counties.

I am very new to compartmental modeling, so please help me understand whether I am missing any important details required to extend the model to the county level.

Thanks

lisphilar · 2021-06-25T13:27:21Z

Thank you for creating the issue!

My aim is to index all 3142 US counties as per the impact of COVID-19. I can request the local health department for clinical datasets.

"COVID-19 Data Hub" (our main data source for JHUData etc.) provides state-level records for the US. You can confirm this as follows.

import covsirphy as cs
loader = cs.DataLoader()
jhu_data = loader.jhu()
jhu_data.layer(country="US").Province.unique()

If you have the county-level dataset as a CSV file in your local environment, you can create a JHUData instance via the CountryData class. Please refer to "Use a local CSV file which has the number of cases" and regard "county" as "province" when you are using the covsirphy library.

What are the minimum parameters required to perform the county-level COVID-19 impact assessment using CovsirPhy.

If I understand this question correctly, you are asking about the required columns of the county-level dataset. You need the following data.

observation dates
county names
the number of confirmed cases
the number of fatal cases
the number of recovered cases: if you do not have this data, please set it to 0. Full complement will be applied.
population values
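As a rough sketch of the checklist above, the following stdlib-only snippet checks that a county-level CSV has the required columns and fills a missing recovered column with zeros. The column names and the helper name prepare_rows are assumptions for illustration and are not part of CovsirPhy.

```python
import csv
import io

# Column names are illustrative assumptions; match them to your own file.
REQUIRED = ("date", "province", "confirmed", "fatal", "population")


def prepare_rows(csv_text):
    """Check required columns and set recovered to 0 where it is absent."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = [col for col in REQUIRED if col not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing required columns: {missing}")
    rows = []
    for row in reader:
        # Recovered cases are optional here: register zeros when unknown.
        if not row.get("recovered"):
            row["recovered"] = "0"
        rows.append(row)
    return rows


sample = """date,province,confirmed,fatal,population
1/24/2020,Cook,1,0,5150233
1/25/2020,Cook,1,0,5150233
"""
rows = prepare_rows(sample)
print(rows[0]["recovered"])  # 0
```

The prepared rows can then be written back to a CSV file and loaded into the library as a local dataset.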

If I can collect county-level vaccination and variant details, is it possible to utilize them with the current model.

As of the latest version, 2.21.0, this is not implemented. Some classes need to be updated first.

What is the best output parameter (like R0) we can use for comparative evaluation of COVID-19 impact and indexing counties.

Yes, we can use R0 as the index, but R0 is calculated from the parameter values of ODE models. Please refer to "Usage: SIR-derived models". I recommend using the parameter values directly as the index.

AnujTiwari · 2021-06-25T21:54:14Z

Thank you lisphilar for clearing this up for me. Regarding vaccination and variant details, are you working on the updates? If I can include at least the vaccination dataset, that will be a great success for me.

lisphilar · 2021-06-25T23:36:50Z

Could you provide the details of the dataset you have? I'd like to write a new class as the handler, testing it with the dataset, if available.

column names
some rows
file extension (CSV?)
if opened, URL

I plan to create a new class, LocalDataLoader (tentative name), to handle datasets saved in the local environment. Usage may be similar to the DataLoader class.

lisphilar · 2021-06-27T10:14:20Z

Note for updating the code:
When we create the LocalDataLoader class, we should add a date_format argument because dates are sometimes parsed incorrectly, as discussed in #856.
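To illustrate why an explicit date format matters, here is a minimal sketch using only the standard library. It is not the CovsirPhy implementation. The same string is read as two different dates depending on the format given.

```python
from datetime import datetime

raw = "1/2/2020"  # ambiguous: 2 January or 1 February?

# Month-first (US style)
month_first = datetime.strptime(raw, "%m/%d/%Y")
# Day-first (common elsewhere)
day_first = datetime.strptime(raw, "%d/%m/%Y")

print(month_first.date())  # 2020-01-02
print(day_first.date())    # 2020-02-01
```

Passing the format explicitly removes the guesswork that automatic parsers have to do for such strings.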

AnujTiwari · 2021-06-29T16:19:39Z

Could you provide the details of the dataset you have? I'd like to write a new class as the handler, testing it with the dataset, if available.

column names

some rows

file extension (CSV?)

if opened, URL

I plan to create a new class, LocalDataLoader (tentative name), to handle datasets saved in the local environment. Usage may be similar to the DataLoader class.

Hi lisphilar, I am working on collecting all the required datasets. I will update them here soon for further processing. Thanks.

AnujTiwari · 2021-07-04T17:18:28Z

Hi lisphilar, I am in the process of getting the dataset from local health departments. I apologize for the long delay. Meanwhile, please let me know how to export the intermediate maps (snl.trend().summary()) in high resolution .tif image format. Thank you

lisphilar · 2021-07-05T11:47:25Z

@AnujTiwari ,
Thank you for letting me know. With some pull requests linked to this issue, I'm preparing to add new features to DataLoader class (LocalDataLoader will not be created).

Meanwhile, please let me know how to export the intermediate maps (snl.trend().summary()) in high resolution .tif image format. Thank you

When we use CovsirPhy with Jupyter Notebook, you can export TIFF files as follows.

snl.interactive = False
snl.trend(filename="image_filename.tiff", dpi=300)

When we use it with Python scripts, snl.interactive = False is unnecessary.

snl.trend() (and all methods which create figures) calls matplotlib.pyplot.savefig() internally. Keyword arguments of matplotlib.pyplot.savefig(), including dpi=300, can be used as the keyword arguments of snl.trend(). If not, that may be a bug.

AnujTiwari · 2021-07-10T18:47:15Z

Hi lisphilar,
As per your suggestions, I have used the Cook County dataset (confirmed, fatal, population and recovered=0).
Everything went smoothly and I was able to generate results for Cook County with S-R trend analysis phases. But I have some questions:

My dataset starts from 24Jan2020, but the first phase of my S-R trend analysis starts from 15Mar2020?
When I am trying to replace S-R trend analysis phases with some new change points following instructions at usage:phases, I am getting an error: ValueError: @start must be the same as/over 15Mar2020, but 27Feb2020 was applied. Am I missing something here?

Again, thanks a lot for your help. I am still waiting for health data from the local health departments and will keep you updated on the developments. I am definitely getting vaccination information (% of population that received the 1st and 2nd dose) and am looking forward to incorporating vaccination in the reproduction number estimation.

Dataset:
date | province | confirmed | fatal | recovered
1/24/2020 | Cook | 1 | 0 | 0
1/25/2020 | Cook | 1 | 0 | 0
.
.
2/18/2021 | Cook | 467986 | 9775 | 0
2/19/2021 | Cook | 468690 | 9790 | 0
2/20/2021 | Cook | 469379 | 9800 | 0

Code:
country_data = cs.CountryData("cook county.csv", country="Cook")
country_data.set_variables(
date="date", confirmed="confirmed", recovered="recovered", fatal="fatal")

country_data.register_total()
country_data.cleaned()

jhu_data_cook = cs.JHUData.from_dataframe(country_data.cleaned())
population_data_cook = cs.PopulationData()
population_data_cook.update(5150233, country="Cook")

snl = cs.Scenario(country="Cook")
snl.register(jhu_data_cook, population_data_cook)
snl.records()

snl.trend(algo="Pelt-rbf").summary()
snl.estimate(cs.SIRF)

S-R Phases:
0th phase (15Mar2020 - 24Apr2020): finished 281 trials in 0 min 9 sec
1st phase (25Apr2020 - 18May2020): finished 320 trials in 0 min 10 sec
2nd phase (19May2020 - 08Jul2020): finished 385 trials in 0 min 13 sec
3rd phase (09Jul2020 - 17Aug2020): finished 437 trials in 0 min 15 sec
4th phase (18Aug2020 - 23Sep2020): finished 321 trials in 0 min 10 sec
5th phase (24Sep2020 - 20Oct2020): finished 180 trials in 0 min 5 sec
6th phase (21Oct2020 - 04Nov2020): finished 239 trials in 0 min 7 sec
7th phase (05Nov2020 - 14Nov2020): finished 270 trials in 0 min 8 sec
8th phase (15Nov2020 - 24Nov2020): finished 389 trials in 0 min 13 sec
9th phase (25Nov2020 - 05Dec2020): finished 322 trials in 0 min 10 sec
12th phase (05Jan2021 - 20Jan2021): finished 181 trials in 0 min 5 sec
10th phase (06Dec2020 - 17Dec2020): finished 564 trials in 0 min 22 sec
11th phase (18Dec2020 - 04Jan2021): finished 348 trials in 0 min 11 sec
13th phase (21Jan2021 - 20Feb2021): finished 831 trials in 0 min 28 sec

For New Change Points:
snl.clear(include_past=True).summary()
snl.add(start_date="28Apr2020")

Error:
ValueError: @start must be the same as/over 15Mar2020, but 27Feb2020 was applied.

lisphilar · 2021-07-11T05:53:48Z

My dataset starts from 24Jan2020, but the first phase of my S-R trend analysis starts from 15Mar2020?

Yes.
We cannot apply S-R trend analysis to records with R = 0. This is because we cannot calculate parameter "a" of log(S(R)) = -a*R + logN when R is always 0 e.g. (R, S) = (0, 1000), (0, 999), (0, 998)...(0, 980).

So, all records with R = 0 should be ignored in S-R trend analysis. However, the zeros in your dataset are not actual values: because you did not have actual values of R, zeros were registered as I proposed.

When R values appear not to be actual values, covsirphy.Scenario complements them internally via the covsirphy.JHUDataComplementHandler class. We can check the complemented R values with snl.records(variables="R"). This returns a dataframe with dates and complemented R values, excluding zeros. I guess the minimum date of the returned dataframe was 15Mar2020.
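The point about R = 0 can be checked with a small least-squares sketch of log(S) = -a*R + log(N). This is an illustration in plain Python and not the library's internal code. The function name sr_slope and the synthetic values a = 0.01 and N = 1000 are assumptions.

```python
import math


def sr_slope(recovered, susceptible):
    """Least-squares estimate of a in log(S) = -a * R + log(N).

    Returns None when R never changes, because the slope is undefined.
    """
    n = len(recovered)
    log_s = [math.log(s) for s in susceptible]
    mean_r = sum(recovered) / n
    mean_log_s = sum(log_s) / n
    var_r = sum((r - mean_r) ** 2 for r in recovered)
    if var_r == 0:
        return None
    cov = sum((r - mean_r) * (y - mean_log_s) for r, y in zip(recovered, log_s))
    return -cov / var_r


# The example from the comment above: R stays at 0 while S decreases.
print(sr_slope([0, 0, 0], [1000, 999, 998]))  # None

# Synthetic records generated with a = 0.01 and N = 1000 (illustrative values).
r_values = [0, 10, 20, 30]
s_values = [1000 * math.exp(-0.01 * r) for r in r_values]
print(round(sr_slope(r_values, s_values), 6))  # 0.01
```

With constant R the variance of R is zero, so no slope can be fitted, which is why such records are dropped before trend analysis.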

When I am trying to replace S-R trend analysis phases with some new change points following instructions at usage:phases, I am getting an error: ValueError: @start must be the same as/over 15Mar2020, but 27Feb2020 was applied. Am I missing something here?

Please replace snl.add(start_date="28Apr2020") with snl.add(end_date="28Apr2020") to specify the end date, not start date, of the new phase.

AnujTiwari · 2021-07-11T06:24:49Z

Yes, the fatal cases data actually starts from 17Mar2020. Thanks for clearing this up for me.
I also tried snl.add(end_date="28Apr2020"), but I am receiving the same error.

code:
country_data = cs.CountryData("cook county.csv", country="Cook")
country_data.set_variables(
date="date", confirmed="confirmed", recovered="recovered", fatal="fatal")

country_data.register_total()
country_data.cleaned()

jhu_data_cook = cs.JHUData.from_dataframe(country_data.cleaned())
population_data_cook = cs.PopulationData()
population_data_cook.update(5150233, country="Cook")

snl = cs.Scenario(country="Cook")
snl.register(jhu_data_cook, population_data_cook)
snl.records()

snl.trend(algo="Pelt-rbf").summary()
snl.estimate(cs.SIRF)

snl.clear(include_past=True).summary()
snl.add(end_date="28Apr2020")
ValueError: @start must be the same as/over 15Mar2020, but 27Feb2020 was applied.

lisphilar · 2021-07-11T06:41:45Z

Hmm... This seems to be an internal error. The first date might not have been changed correctly when zeros were removed.

What is the output of snl.first_date?
Does snl.timepoints(first_date="15Mar2020"); snl.add(end_date="28Apr2020") solve the problem?

AnujTiwari · 2021-07-11T06:48:35Z

Output of snl.first_date - 27Feb2020 - I am confused with this date as confirmed cases data is starting from 24th Jan 2020 and fatal cases data is starting from 17th Mar 2020..
snl.timepoints(first_date="15Mar2020"); snl.add(end_date="28Apr2020") is showing the error:
ValueError: @start must be the same as/over 01Apr2020, but 15Mar2020 was applied.

AnujTiwari · 2021-07-11T06:54:26Z

Cook County Data.zip
Please find the data for your reference.

lisphilar · 2021-07-11T07:26:55Z

Thank you for providing the file and I will try to find the cause.

lisphilar · 2021-07-20T14:32:36Z

Could you share the whole script via a gist or a GitHub repository?

lisphilar · 2021-07-20T15:25:06Z

I tried the code with records as of 18Jul2021.
https://gist.github.com/lisphilar/a4396dc91a25af30ccda0155be2c0214

None value of Rt:
The records in the phases do not fit the SIR-F model sufficiently: kappa was 0 in the two phases and sigma was 0 in the 1st phase. Because Rt is equal to "rho * (1 - theta) / (kappa + sigma)", it cannot be calculated when kappa + sigma is 0, so None was returned. None was converted to "-" with snl.summary() and snl.get().
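The formula quoted above can be sketched in a few lines to show where the None comes from. The function name reproduction_number and the parameter values are made up for illustration. This is not the library's own code.

```python
def reproduction_number(rho, theta, kappa, sigma):
    """Rt of the SIR-F model as quoted above; None if kappa + sigma is 0."""
    denominator = kappa + sigma
    if denominator == 0:
        return None
    return rho * (1 - theta) / denominator


# Parameter values below are made up for illustration only.
print(reproduction_number(rho=0.2, theta=0.02, kappa=0.0, sigma=0.0))  # None
print(round(reproduction_number(rho=0.2, theta=0.02, kappa=0.001, sigma=0.07), 2))  # 2.76
```

A very small but non-zero kappa + sigma gives a very large Rt, so values from poorly fitted phases should be read with care.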

Error with snl.estimate_accuracy(): I also confirmed the error when we run it without snl.trend(). When some parameter values are 0, this error may be raised. Could you create a new issue for this with the "Request fixing a bug" template?

AnujTiwari · 2021-07-20T15:41:19Z

When the records in a phase do not fit the SIR-F model sufficiently and R0 is None ('-'), what do you suggest for getting a meaningful R0 value for the county? Should I increase the time period and check 21 days instead of 14 days, or something else?

Also, I am a little concerned about very high R0 values such as 13.59 (for Barbour County) after adding a new record (the 19th day's record in the case of Barbour County). It seems that a small increase in confirmed or fatal cases (cases increased by 2 in Barbour County) causes a major shift in the R0 value.

lisphilar · 2021-07-20T15:55:07Z

Because the phase does not fit the model, I recommend shortening the phase, e.g. to 7 days.

However, I actually need the outputs of snl.estimate_accuracy() for confirmation and for deciding the number of days. This method compares the number of cases simulated with the estimated parameter values against the actual number of cases.
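One common way to score such a comparison is RMSLE (root mean squared logarithmic error), a metric name that also appears in an error message later in this thread. The sketch below is a generic stdlib implementation and not the code behind snl.estimate_accuracy(). The simulated values are made up.

```python
import math


def rmsle(actual, simulated):
    """Root mean squared logarithmic error between two equal-length series."""
    if len(actual) != len(simulated):
        raise ValueError("series must have the same length")
    squared = [
        (math.log1p(s) - math.log1p(a)) ** 2
        for a, s in zip(actual, simulated)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Confirmed cases from the Cook County rows in this thread vs. made-up simulated values.
actual = [467986, 468690, 469379]
simulated = [468100, 468500, 469900]
print(rmsle(actual, actual))         # 0.0
print(rmsle(actual, simulated) > 0)  # True
```

A lower score means the simulated curve tracks the records more closely, which can help when deciding how long a phase should be.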

Thank you for creating #895 and I will investigate it with high priority after work on 21Jul2021 JST.

AnujTiwari · 2021-07-20T15:59:17Z

Thank you for the suggestion. Yes, it would be great if it were possible to compute the minimum number of days for a phase (counted back from the last day) for which the records sufficiently fit the SIR-F model and give a meaningful R0 value. Thanks

AnujTiwari · 2021-07-20T18:28:33Z

There are many counties where the number of cases and deaths is comparatively low and the model throws an error. Could you please test the model with the Wilcox County dataset? As with many other counties, I am not able to get results for this county.

import covsirphy as cs
loader = cs.DataLoader(update_interval=None)
loader.read_csv("Wilcox1.csv", parse_dates=["date"], dayfirst=False)

print(loader.local)

loader.assign(country="US", province="Wilcox", population=10373)

print(loader.local)
loader.lock(date="date", country="country", province="province", confirmed="confirmed", fatal="fatal", population="population")
print(loader.locked)

jhu_data = loader.jhu()

snl = cs.Scenario(country="US", province="Wilcox")
snl.register(jhu_data)
snl.records()

snl.clear(include_past=True)
snl.add(end_date="12Jul2021")
snl.add(end_date=snl.today)
snl.summary()

snl.estimate(cs.SIRF)
snl.summary()

ValueError: When the targets have multiple columns, we cannot select RMSLE.

Wilcox1.zip

lisphilar · 2021-07-21T00:14:53Z

Thank you for the notice, but it is difficult to track all problems here. Could you create separate issues for the Barbour and Wilcox problems?

Here we will discuss adding new methods to DataLoader and documentation of them. I will release a new stable version after closing this issue.

AnujTiwari · 2021-07-21T00:16:52Z

Sure, I will create separate issues for these problems. Thanks

lisphilar · 2021-07-27T15:43:12Z

Dear @AnujTiwari ,
Could you review the updated documentation of the new DataLoader at the following link? Version 2.22.0 is planned for Jul2021.
https://lisphilar.github.io/covid19-sir/markdown/LOADING.html

AnujTiwari · 2021-07-27T15:44:51Z

Sure I am working on it... Thanks

lisphilar · 2021-07-27T15:50:40Z

Thank you.

FYI.
Some tables are broken there and they may be fixed with a new commit in about twenty minutes (time for deployment).
23b9317

AnujTiwari added the enhancement New feature or request label Jun 25, 2021

lisphilar changed the title ~~Extending support for County level COVID-19 Impact Assessment~~ [New] Extending support for County level COVID-19 Impact Assessment Jun 25, 2021

lisphilar added brainstorming Discussion to get creative ideas question Further information is requested labels Jun 25, 2021

This was referenced Jun 26, 2021

Issue851 refactor #854

Merged

new: _LoaderBase.collect() #855

Merged

[New] Specify strftime format with CountryData.cleaned(date_format) when we use local dataset (Fix: Using Own Dataset Not Work Anymore) #856

Closed

lisphilar mentioned this issue Jun 29, 2021

Issue851 refactor covid19dh handler #864

Merged

This was referenced Jul 1, 2021

Issue851: read CSV files with DataLoader class #865

Merged

Issue851 use local data with DataLoader.jhu(), .population(), .oxcgrt(), .pcr() #868

Merged

Issue851 lock including COVID-19 Data Hub data #870

Merged

This was referenced Jul 6, 2021

Issue851 lock OWID coutry level vaccine data #873

Merged

[Docs] how to export high resolution TIFF images #874

Closed

Issue851 parse province-level vaccination data #876

Merged

Issue851 lock PCR data from OWID #877

Merged

lisphilar mentioned this issue Jul 11, 2021

[Fix] ValueError when adding phases manually to fully-complemented data #878

Closed

lisphilar added a commit that referenced this issue Jul 27, 2021

docs: #851

3d6aefc

lisphilar added a commit that referenced this issue Jul 27, 2021

docs: #851

1d24122

lisphilar added a commit that referenced this issue Jul 27, 2021

docs: adjust tables, #851

66e6c5a

lisphilar added a commit that referenced this issue Jul 27, 2021

adjust column numbers of tables, #851

23b9317

lisphilar added a commit that referenced this issue Jul 28, 2021

docs: fix broken tables, #851

3af4386

lisphilar added a commit that referenced this issue Jul 28, 2021

docs: workflow, #851

528218f

lisphilar added a commit that referenced this issue Jul 28, 2021

docs: revise doc titles, #851

789f2f7

lisphilar added a commit that referenced this issue Jul 28, 2021

docs: revise doc order, #851

4cc8462

lisphilar added a commit that referenced this issue Jul 28, 2021

docs: fix colons of table, #851

6634d18

lisphilar added a commit that referenced this issue Jul 28, 2021

config: use sphinx-markdown-table in dev, #851

f76206a

lisphilar added a commit that referenced this issue Jul 28, 2021

docs: use sphinx-markdown-tables, #851

76f270b

lisphilar added a commit that referenced this issue Jul 29, 2021

docs: #851, add explanation of skipping vars

c6e507d

lisphilar added a commit that referenced this issue Jul 29, 2021

docs: replace dates with date, #851

fd94b86

lisphilar closed this as completed Aug 13, 2021
