[New] Specify strftime format with CountryData.cleaned(date_format) when we use local dataset (Fix: Using Own Dataset Not Work Anymore) #856

subi10 · 2021-06-27T04:55:29Z

Hi, I'm Subi from Malaysia. Thank you very much for this outstanding package. For the last month I have been using it to load a dataset for a province in Malaysia, and it worked like a charm. Right now I am trying a similar step, but the "scenario" instance returns an error; it looks like it did not read my dataset properly.

This is now

This is back then

Am I doing something wrong? This is how I do it.

lisphilar · 2021-06-27T06:09:46Z

Thank you for reaching out to us!
Could you check that country_data.cleaned() has all the data you had in the CSV file? .head() is not used in In[10], but only five rows are shown in Out[10].

Additionally, please try auto_complement=False (which skips automatic data complement) when creating the Scenario instance, i.e. please replace

my_scenario = cs.Scenario(jhu_data, population_data, "Malaysia", "Selangor")

with

my_scenario = cs.Scenario(jhu_data, population_data, "Malaysia", "Selangor", auto_complement=False)

If these do not work, could you share the CSV file and the version numbers of Python and CovsirPhy? (Kindly use the "Request fixing a bug" issue template next time.)

subi10 · 2021-06-27T06:42:17Z

Hi, thank you very much for your fast response. I am using the latest version, as I updated the package recently. I forget which earlier version gave me the desired output. I will try your suggestion and get back to you. Attached is the CSV containing the dataset stored locally. Again, thank you very much for the fast reply. Best regards, Subi

subi10 · 2021-06-27T06:54:32Z

Hi, I just tried the suggestion and I get the output below. country_data has all the required data. Best regards, Subi

lisphilar · 2021-06-27T07:03:41Z

Dear @subi10 ,
Thank you for trying, but I could not see the images and CSVs because attachments are removed when we reply to GitHub notification e-mails. Please return to GitHub Issues with your browser and attach them there :-)
#856

lisphilar · 2021-06-27T07:07:20Z

You can move to GitHub Issues by clicking the "view it on GitHub" link at the bottom of the notification e-mails.

subi10 · 2021-06-27T07:08:49Z

Hi,

Sorry for sending it via email. I did try the suggestion and I get this:

Attached is the file of the dataset I'm working with.
Selangor.xlsx

subi10 · 2021-06-27T07:10:27Z

Hi, thank you for your guidance. I have made the comments on GitHub and attached the said dataset. Hope that you received it. Best regards, Subi

lisphilar · 2021-06-27T07:41:17Z

Thank you for uploading!
Hmm... I tried it with CovsirPhy 2.21.0 (the latest stable version) on Google Colab, using a CSV file converted from the Excel file you attached. Actually, it worked.
https://gist.github.com/lisphilar/e7697ae512bdb7220c4bccbf6c2beeb7

I noticed that the first date of the records you showed in the first comment was 2020-01-05 and the column names of the CSV file were "Confirmed", "Recovered" and "Death". However, the Excel file I received has 2020-04-20 as the first record, and its column names were "confirmed", "recovered" and "fatal".

subi10 · 2021-06-27T07:56:25Z

Hi, thank you for trying. However, when I ran it on my PC it still gave the same result. Maybe it is due to an environment error? I will try another PC and get back to you. Best regards, Subi

lisphilar · 2021-06-27T08:04:34Z

I'm not sure, but do you have both a CSV file and an Excel file in the directory where the code was executed?
If so, please confirm that the files have the same first date, 2020-04-20, and that the column names are "confirmed", "recovered" and "fatal".

subi10 · 2021-06-27T08:05:43Z

Hi there, thank you very much again. However, I tried running it on a MacBook using Google Colab and the problem still persists. Best regards, Subi

subi10 · 2021-06-27T08:12:52Z

Hi,

These are the files in my directory. I did change the column names to lowercase, thinking that it could fix my issue.

subi10 · 2021-06-27T08:15:22Z

Yes,

The first date is indeed 20-4-2020

subi10 · 2021-06-27T08:18:13Z

I still get this; it gives me a value of -1.

lisphilar · 2021-06-27T08:20:19Z

Could you share Selangor_S-R.ipynb?

subi10 · 2021-06-27T08:24:48Z

Hi, thank you very much. Sure, please find attached the requested file. However, when I try to upload it on GitHub, the file cannot be loaded in the comment section. Best regards, Subi

subi10 · 2021-06-27T08:27:30Z

Hi,

I uploaded it here:

https://github.com/subi10/Selangor

subi10 · 2021-06-27T08:36:08Z

Also, I passed the Colab link you gave me to my colleague and asked them to run it; they got the same error.

lisphilar · 2021-06-27T08:39:14Z

Thank you for creating the repository! Sorry for the trouble.

I noticed that the last date was "2021-06-14" in the CSV but "2021-12-06" (a future date!) in Out[4] (country_data.cleaned().tail()). I will investigate it in the source code.

Could you add the following lines to the script?

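import covsirphy as cs
# Show the installed CovsirPhy version
print(cs.__version__)
# Show the last rows of the raw (uncleaned) records
country_data._raw.tail()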

If the output of country_data._raw.tail() is not the same as Out[4], something is wrong with data cleaning.
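
A related sanity check that does not depend on CovsirPhy is to look for parsed dates later than the last date known to be in the source file. The sketch below is only an illustration: the helper name find_dates_after and the sample rows are hypothetical, and only plain pandas is assumed.

```python
import pandas as pd


def find_dates_after(df, last_valid, date_col="Date"):
    """Return the rows whose date is later than the last date known to be in the source file."""
    return df.loc[df[date_col] > pd.Timestamp(last_valid)]


# Hypothetical parsed records: the second row stands for a mis-parsed date
parsed = pd.DataFrame(
    {
        "Date": pd.to_datetime(["2021-06-11", "2021-12-06"]),
        "confirmed": [100, 110],
    }
)

# The CSV is known to end on 2021-06-14, so anything later is suspicious
suspicious = find_dates_after(parsed, "2021-06-14")
print(suspicious)
```

Any row returned by such a check suggests that day and month were swapped while the dates were parsed.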

subi10 · 2021-06-27T08:49:31Z

This is head and tail in the country_data

subi10 · 2021-06-27T08:52:58Z

This is the version I am currently on

subi10 · 2021-06-27T08:55:29Z

Yes, it looks like the tail here has an issue with the last date.

lisphilar · 2021-06-27T08:56:55Z

Thank you for sharing. It appears that "12/6/2021" is converted to "2021-12-06" (= 06Dec2021) on your PC.
Apart from CovsirPhy, please share the output of the following code.

import pandas as pd
pd.to_datetime("12/6/2021")

My PC (in Japan) returns Timestamp('2021-12-06 00:00:00').

subi10 · 2021-06-27T08:57:19Z

Same goes here, my start date is April 20th 2020

subi10 · 2021-06-27T08:59:04Z

Oh, this is the timestamp

lisphilar · 2021-06-27T09:02:13Z

This is expected to be Timestamp('2021-06-12 00:00:00'), because the CSV file writes dates in day/month/year order.
To fix this issue, we may need to set the date format explicitly.

import pandas as pd
pd.to_datetime("12/6/2021", format="%d/%m/%Y")
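
To make the ambiguity concrete, here is a minimal sketch using only pandas. It parses the same string three ways: with default settings, with an explicit day-first format, and with the dayfirst flag. The variable names are illustrative only.

```python
import pandas as pd

ambiguous = "12/6/2021"

# Default parsing reads the string as month/day/year
month_first = pd.to_datetime(ambiguous)

# An explicit format forces day/month/year
day_first = pd.to_datetime(ambiguous, format="%d/%m/%Y")

# dayfirst=True is an alternative when the exact format is not fixed
also_day_first = pd.to_datetime(ambiguous, dayfirst=True)

print(month_first.date(), day_first.date(), also_day_first.date())
```

An explicit format is the stricter choice: it raises an error for strings that do not match, instead of guessing.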

subi10 · 2021-06-27T09:05:42Z

I tried running it

lisphilar · 2021-06-27T09:08:08Z

The reason Google Colab succeeded is not clear... but, to test it, could you try the following?

import pandas as pd
# Remove cleaned data with wrong time format
country_data._cleaned_df = pd.DataFrame()
# Update raw dataframe with appropriate time format
country_data._raw["Date"] = pd.to_datetime(country_data._raw["Date"], format="%d/%m/%Y")
# Data cleaning
country_data.cleaned()

subi10 · 2021-06-27T09:09:51Z

Yes, now it worked. Thank you so much!! Best regards, Subi

subi10 · 2021-06-27T09:11:39Z

Great stuff!! Thank you so much. !!

lisphilar · 2021-06-27T09:14:35Z

Thank you for your cooperation!!!
I will add a ~~time_format~~ date_format argument to CountryData.cleaned() later.

subi10 · 2021-06-27T09:15:27Z

Thank you very much to you as well. You are a genius and great person.

lisphilar · 2021-06-27T10:10:15Z

With #857, CountryData.cleaned(date_format=None) (default) was implemented in development version 2.21.0-delta. This will be included in the next stable version 2.22.0 (planned for Jul2021). Because we only use "date", the argument name is date_format, not time_format.

For the time being, please use the code (country_data._raw["Date"] = pd.to_datetime(country_data._raw["Date"], format="%d/%m/%Y")) with the latest stable version.
Or, use country_data.cleaned(date_format="%d/%m/%Y") with the development version.

New documentation will be deployed in a few hours.
https://lisphilar.github.io/covid19-sir/markdown/INSTALLATION.html#use-a-local-csv-file-which-has-the-number-of-cases

I will close this issue, thank you.

FYI:
With issue #851, LocalDataLoader may be created to read local datasets more easily. Date format should be considered there.

subi10 · 2021-06-27T10:29:05Z

Hi, thank you very much for your effort! Best regards, Subhi

lisphilar · 2021-07-31T13:41:22Z

Dear @subi10,
Hello again.
Stable version 2.22.0 was released and the DataLoader class was improved.
https://lisphilar.github.io/covid19-sir/markdown/LOADING.html

With 2.22.0, we can use DataLoader to read local CSV files without CountryData.

import covsirphy as cs
loader = cs.DataLoader(update_interval=None)
loader.read_csv("Selangor.csv", parse_dates=["Date"], dayfirst=True)
# loader.local
loader.assign(country="Malaysia", state="Selangor", population=6_530_000)
loader.lock(
    date="Date", country="country", province="state",
    confirmed="confirmed", fatal="fatal", recovered="recovered", population="population")
# loader.locked
jhu_data = loader.jhu()
snl = cs.Scenario(country="Malaysia", province="Selangor")
snl.register(jhu_data)
snl.records()
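
For comparison, the day-first parsing can be tried with pandas alone. This is a minimal sketch with made-up rows in the same column layout; it assumes that DataLoader.read_csv forwards parse_dates and dayfirst to pandas.read_csv, which the thread does not confirm.

```python
import io

import pandas as pd

# Made-up rows in the same layout as the local file (day/month/year dates)
csv_text = (
    "Date,confirmed,fatal,recovered\n"
    "20/4/2020,10,0,1\n"
    "12/6/2021,500,5,400\n"
)

# dayfirst=True tells pandas to read 12/6/2021 as 12 June, not 6 December
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], dayfirst=True)
print(df)
```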

subi10 · 2021-08-28T15:28:41Z

Dear Hirokazu Takaya, I just got access to my mail and saw this! Thank you very much for your effort to solve the issue, I really appreciate it. I have been trying to find out how to use the package to model vaccination and reinfection. Could you show me the quickest walkthrough of how I can do that? The Malaysian government releases a dataset on vaccination; how may I incorporate it into the model? I saw that there are two models that could fit my need, but is there any way to use SIRD with vaccination, and could you show the simplest walkthrough (an example of Python code)? I tried before with no luck. Thank you so much for everything. Best wishes, Subhi

covid19-public/epidemic at main · MoH-Malaysia/covid19-public: Official data on the COVID-19 epidemic in Malaysia. Powered by CPRC, CPRC Hospital System, MKAK, and MySejahtera...

lisphilar changed the title ~~Using Own Dataset Not Work Anymore~~ [FIx] Using Own Dataset Not Work Anymore Jun 27, 2021

lisphilar added this to the Release v2.22.0 milestone Jun 27, 2021

lisphilar added the bug Something isn't working label Jun 27, 2021

lisphilar changed the title ~~[FIx] Using Own Dataset Not Work Anymore~~ [New] Specify strftime format with CountryData.cleaned(date_format) when we use local dataset (Fix: Using Own Dataset Not Work Anymore) Jun 27, 2021

lisphilar added enhancement New feature or request bug Something isn't working documentation Improvements or additions to documentation and removed bug Something isn't working labels Jun 27, 2021

lisphilar mentioned this issue Jun 27, 2021

[New] Specify strftime format with CountryData.cleaned(date_format) #857

Merged

lisphilar closed this as completed Jun 27, 2021

lisphilar mentioned this issue Jun 27, 2021

[New] Extending support for County level COVID-19 Impact Assessment #851

Closed
