✨ Jinja whitespaces and newlines #3657

Merged
merged 3 commits into from
Dec 2, 2024

Marigold
Collaborator

@Marigold Marigold commented Nov 29, 2024

A couple of improvements to alleviate some pain from #3654.

  • Add instructions on how to preview metadata with dimensions in a notebook, to avoid long build-wait-check cycles
  • Strip whitespace from Jinja blocks
  • Improve ETL docs

Note that this is still not perfect and doesn't solve the \n. problem from the issue (we still need to use <%- elif to get rid of it). Ideally, we should never have to use - and the Jinja templates should be as intuitive as possible. To do this properly, we should:

  1. Replace dynamic-yaml with something simpler and unify saving & loading of metadata files (while keeping it fast; there were tons of performance optimizations)
  2. Move Jinja functionality to owid-catalog and add a method VariableMeta.render_jinja(dim_dict={"..."})
  3. Add validation for double whitespace, newlines, \n., etc.
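
As a rough illustration of point 3, a validation pass over already-rendered metadata text could look something like the sketch below. Everything here is an assumption for illustration only: the function name find_whitespace_issues, the check names, and the regex patterns are hypothetical and are not part of ETL or owid-catalog.

```python
import re

# Hypothetical checks for rendered metadata strings. The names and patterns
# are illustrative assumptions, not an existing ETL or owid-catalog API.
CHECKS = {
    "double whitespace": re.compile(r"[ \t]{2,}"),
    "repeated blank lines": re.compile(r"\n{3,}"),
    "newline before period": re.compile(r"\n\s*\."),
    "space before punctuation": re.compile(r"[ \t]+[.,;:]"),
    "leading/trailing whitespace": re.compile(r"^\s+|\s+$"),
}


def find_whitespace_issues(rendered: str) -> list[str]:
    """Return the names of all checks that the rendered text fails."""
    return [name for name, pattern in CHECKS.items() if pattern.search(rendered)]


# A string with a double space and a period stranded on its own line.
print(find_whitespace_issues("Reserve antibiotics  should be reserved\n. Last resort."))
```

A check like this would run on the rendered output for each dimension combination rather than on the raw template, since the stray spaces and newlines only appear after the Jinja blocks have been evaluated.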

@owidbot
Contributor

owidbot commented Nov 29, 2024

Quick links (staging server):

Site Dev Site Preview Admin Wizard Docs

Login: ssh owid@staging-site-jinja-previews

chart-diff: ✅ No charts for review.
data-diff: ❌ Found differences
~ Dataset garden/antibiotics/2024-11-12/antimicrobial_usage
-   - update_period_days: 365
    ?                      ^^
+   + update_period_days: 308
    ?                      ^^
- - Table class_aggregated
-   - Column ddd_anti_malarials
-   - Column ddd_antibacterials_and_antituberculosis
-   - Column ddd_antifungals
-   - Column ddd_antituberculosis
-   - Column ddd_antivirals
-   - Column did_anti_malarials
-   - Column did_antibacterials_and_antituberculosis
-   - Column did_antifungals
-   - Column did_antituberculosis
-   - Column did_antivirals
  = Table aware
    ~ Column ddd (changed metadata)
-       -   Total (#dod:defined-daily-doses) of AWaRe category: << awarelabel >> antibiotics used in a given year. <% if aware == "A" %> Access antibiotics have activity against a wide range of common pathogens and show lower resistance potential than antibiotics in the other groups. <% elif aware == "W" %> Watch antibiotic have higher resistance potential and include most of the highest priority agents among the Critically Important Antimicrobials for Human Medicine and/or antibiotics that are at relatively high risk of bacterial resistance. <% elif aware == "R" %> Reserve antibiotics  should be reserved for treatment of confirmed or suspected infections due to multi-drug-resistant organisms. Reserve group antibiotics should be treated as “last resort” options. <% elif aware == "O" %> The use of the Not classified/Not recommended antibiotics is not evidence-based, nor recommended in high-quality international guidelines.  WHO does not recommend the use of these antibiotics in clinical practice. <% endif %>
        ?   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+       +   Volume of AWaRe category: << awarelabel >> antibiotics used in a given year. <% if aware == "A" %> Access antibiotics have activity against a wide range of common pathogens and show lower resistance potential than antibiotics in the other groups. <% elif aware == "W" %> Watch antibiotic have higher resistance potential and include most of the highest priority agents among the Critically Important Antimicrobials for Human Medicine and/or antibiotics that are at relatively high risk of bacterial resistance. <% elif aware == "R" %> Reserve antibiotics  should be reserved for treatment of confirmed or suspected infections due to multi-drug-resistant organisms. Reserve group antibiotics should be treated as “last resort” options. <% elif aware == "O" %> The use of the Not classified/Not recommended antibiotics is not evidence-based, nor recommended in high-quality international guidelines.  WHO does not recommend the use of these antibiotics in clinical practice. <% endif %>
        ?   ^^^^^^^^^
-       - display:
-       -   numDecimalPlaces: 0
    ~ Column did (changed metadata)
-       -   Total (#dod:defined-daily-doses) of AWaRe category: <<awarelabel>> used per 1000 inhabitants per day. <% if aware == "A" %> Access antibiotics have activity against a wide range of common pathogens and show lower resistance potential than antibiotics in the other groups. <% elif aware == "W" %> Watch antibiotic have higher resistance potential and include most of the highest priority agents among the Critically Important Antimicrobials for Human Medicine and/or antibiotics that are at relatively high risk of bacterial resistance. <% elif aware == "R" %> Reserve antibiotics  should be reserved for treatment of confirmed or suspected infections due to multi-drug-resistant organisms. Reserve group antibiotics should be treated as “last resort” options. <% elif aware == "O" %> The use of the Not classified/Not recommended antibiotics is not evidence-based, nor recommended in high-quality international guidelines.  WHO does not recommend the use of these antibiotics in clinical practice. <% endif %>
        ?   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+       +   Volume of AWaRe category: <<awarelabel>> used per 1000 inhabitants per day. <% if aware == "A" %> Access antibiotics have activity against a wide range of common pathogens and show lower resistance potential than antibiotics in the other groups. <% elif aware == "W" %> Watch antibiotic have higher resistance potential and include most of the highest priority agents among the Critically Important Antimicrobials for Human Medicine and/or antibiotics that are at relatively high risk of bacterial resistance. <% elif aware == "R" %> Reserve antibiotics  should be reserved for treatment of confirmed or suspected infections due to multi-drug-resistant organisms. Reserve group antibiotics should be treated as “last resort” options. <% elif aware == "O" %> The use of the Not classified/Not recommended antibiotics is not evidence-based, nor recommended in high-quality international guidelines.  WHO does not recommend the use of these antibiotics in clinical practice. <% endif %>
        ?   ^^^^^^^^^
-       - display:
-       -   numDecimalPlaces: 1
  = Table class
    ~ Column ddd (changed metadata)
-       -   Defined daily doses of <% if routeofadministration == "O" %> orally administered <% elif routeofadministration == "P" %> parentearally administered <% elif routeofadministration == "R" %> rectally administered4 <% elif routeofadministration == "I" %> inhaled <% endif %>  << antimicrobialclass.lower()>> - << atc4name.lower() >>  used
        ?                                                                                                                                                                                                                                                                                                                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
+       +   Defined daily doses of <% if routeofadministration == "O" %> orally administered <% elif routeofadministration == "P" %> parentearally administered <% elif routeofadministration == "R" %> rectally administered4 <% elif routeofadministration == "I" %> inhaled <% endif %>  << antimicrobialclass>> - << atc4name.lower() >>  used
        ?                                                                                                                                                                                                                                                                                                        ++++++++++++++++        ^^^
-       - description_short: Total (#dod:defined-daily-doses) of antimicrobials used in a given year.
+       + description_short: Volume of antimicrobials used in a given year.
-       - display:
-       -   numDecimalPlaces: 0
    ~ Column did (changed metadata)
-       - description_short: Total (#dod:defined-daily-doses) of antimicrobials used per 1000 inhabitants per day.
+       + description_short: Volume of antimicrobials used per 1000 inhabitants per day.
-       - display:
-       -   numDecimalPlaces: 1
= Dataset garden/artificial_intelligence/2024-11-03/epoch_regressions
  = Table epoch_regressions
    ~ Dim days_since_1949
+       + New values: 12 / 874 (1.37%)
             system  days_since_1949
          1.5x/year              547
          1.5x/year            22080
          2.0x/year            22280
          2.4x/year            27670
          4.1x/year            27670
-       - Removed values: 12 / 874 (1.37%)
                               system  days_since_1949
          1.5x/year between 1950–2010              547
          1.5x/year between 1950–2010            22080
          2.0x/year between 2010–2025            22280
          2.4x/year between 2010–2025            27670
          4.1x/year between 2010–2025            27670
    ~ Dim system
+       + New values: 12 / 874 (1.37%)
           days_since_1949    system
                       547 1.5x/year
                     22080 1.5x/year
                     22280 2.0x/year
                     27670 2.4x/year
                     27670 4.1x/year
-       - Removed values: 12 / 874 (1.37%)
           days_since_1949                      system
                       547 1.5x/year between 1950–2010
                     22080 1.5x/year between 1950–2010
                     22280 2.0x/year between 2010–2025
                     27670 2.4x/year between 2010–2025
                     27670 4.1x/year between 2010–2025
    ~ Column domain (new data)
+       + New values: 12 / 874 (1.37%)
           days_since_1949    system domain
                       547 1.5x/year    NaN
                     22080 1.5x/year    NaN
                     22280 2.0x/year    NaN
                     27670 2.4x/year    NaN
                     27670 4.1x/year    NaN
-       - Removed values: 12 / 874 (1.37%)
           days_since_1949                      system domain
                       547 1.5x/year between 1950–2010    NaN
                     22080 1.5x/year between 1950–2010    NaN
                     22280 2.0x/year between 2010–2025    NaN
                     27670 2.4x/year between 2010–2025    NaN
                     27670 4.1x/year between 2010–2025    NaN
    ~ Column organization_categorization (new data)
+       + New values: 12 / 874 (1.37%)
           days_since_1949    system organization_categorization
                       547 1.5x/year                         NaN
                     22080 1.5x/year                         NaN
                     22280 2.0x/year                         NaN
                     27670 2.4x/year                         NaN
                     27670 4.1x/year                         NaN
-       - Removed values: 12 / 874 (1.37%)
           days_since_1949                      system organization_categorization
                       547 1.5x/year between 1950–2010                         NaN
                     22080 1.5x/year between 1950–2010                         NaN
                     22280 2.0x/year between 2010–2025                         NaN
                     27670 2.4x/year between 2010–2025                         NaN
                     27670 4.1x/year between 2010–2025                         NaN
    ~ Column parameters (new data, changed data)
+       + New values: 12 / 874 (1.37%)
           days_since_1949    system   parameters
                       547 1.5x/year         <NA>
                     22080 1.5x/year         <NA>
                     22280 2.0x/year  577927.0625
                     27670 2.4x/year         <NA>
                     27670 4.1x/year         <NA>
-       - Removed values: 12 / 874 (1.37%)
           days_since_1949                      system   parameters
                       547 1.5x/year between 1950–2010         <NA>
                     22080 1.5x/year between 1950–2010         <NA>
                     22280 2.0x/year between 2010–2025  577927.0625
                     27670 2.4x/year between 2010–2025         <NA>
                     27670 4.1x/year between 2010–2025         <NA>
    ~ Column publication_date (new data)
+       + New values: 12 / 874 (1.37%)
           days_since_1949    system publication_date
                       547 1.5x/year              NaT
                     22080 1.5x/year              NaT
                     22280 2.0x/year              NaT
                     27670 2.4x/year              NaT
                     27670 4.1x/year              NaT
-       - Removed values: 12 / 874 (1.37%)
           days_since_1949                      system publication_date
                       547 1.5x/year between 1950–2010              NaT
                     22080 1.5x/year between 1950–2010              NaT
                     22280 2.0x/year between 2010–2025              NaT
                     27670 2.4x/year between 2010–2025              NaT
                     27670 4.1x/year between 2010–2025              NaT
    ~ Column training_computation_petaflop (new data, changed data)
+       + New values: 12 / 874 (1.37%)
           days_since_1949    system  training_computation_petaflop
                       547 1.5x/year                            0.0
                     22080 1.5x/year                       0.246361
                     22280 2.0x/year                           <NA>
                     27670 2.4x/year                           <NA>
                     27670 4.1x/year                    466953248.0
-       - Removed values: 12 / 874 (1.37%)
           days_since_1949                      system  training_computation_petaflop
                       547 1.5x/year between 1950–2010                            0.0
                     22080 1.5x/year between 1950–2010                       0.246361
                     22280 2.0x/year between 2010–2025                           <NA>
                     27670 2.4x/year between 2010–2025                           <NA>
                     27670 4.1x/year between 2010–2025                    466952832.0
    ~ Column training_dataset_size__datapoints (new data, changed data)
+       + New values: 12 / 874 (1.37%)
           days_since_1949    system  training_dataset_size__datapoints
                       547 1.5x/year                               <NA>
                     22080 1.5x/year                               <NA>
                     22280 2.0x/year                               <NA>
                     27670 2.4x/year                      60797116416.0
                     27670 4.1x/year                               <NA>
-       - Removed values: 12 / 874 (1.37%)
           days_since_1949                      system  training_dataset_size__datapoints
                       547 1.5x/year between 1950–2010                               <NA>
                     22080 1.5x/year between 1950–2010                               <NA>
                     22280 2.0x/year between 2010–2025                               <NA>
                     27670 2.4x/year between 2010–2025                      60797116416.0
                     27670 4.1x/year between 2010–2025                               <NA>
= Dataset garden/demography/2023-06-27/world_population_comparison
  = Table world_population_comparison
= Dataset garden/missing_data/2024-03-26/children_out_of_school
  = Table children_out_of_school
= Dataset garden/wb/2024-06-10/gender_statistics
  = Table gender_statistics
    ~ Column sg_law_eqrm_wk (changed data)
        ~ Changed values: 24 / 15124 (0.16%)
           country  year  sg_law_eqrm_wk -  sg_law_eqrm_wk +
          Slovakia  1980               1.0               0.0
          Slovakia  1981               1.0               0.0
          Slovakia  1984               1.0               0.0
          Slovakia  1990               1.0               0.0
          Slovakia  1992               1.0               0.0
= Dataset garden/wb/2024-06-10/gender_statistics_country_counts
  = Table gender_statistics
    ~ Column sg_law_eqrm_wk_no_count (changed data)
        ~ Changed values: 48 / 378 (12.70%)
          country  year  sg_law_eqrm_wk_no_count -  sg_law_eqrm_wk_no_count +
           Europe  1974                         38                         39
            World  1972                        186                        187
            World  1975                        181                        182
            World  1976                        180                        181
            World  1978                        178                        179
    ~ Column sg_law_eqrm_wk_no_pop (changed data)
        ~ Changed values: 48 / 378 (12.70%)
          country  year  sg_law_eqrm_wk_no_pop -  sg_law_eqrm_wk_no_pop +
           Europe  1974                550689102                555379750
            World  1972               3728735983               3733335785
            World  1975               3877380047               3882119222
            World  1976               3923263849               3928052362
            World  1978               4011155699               4016041432
    ~ Column sg_law_eqrm_wk_yes_count (changed data)
        ~ Changed values: 48 / 378 (12.70%)
          country  year  sg_law_eqrm_wk_yes_count -  sg_law_eqrm_wk_yes_count +
           Europe  1974                           4                           3
            World  1972                           3                           2
            World  1975                           8                           7
            World  1976                           9                           8
            World  1978                          11                          10
    ~ Column sg_law_eqrm_wk_yes_pop (changed data)
        ~ Changed values: 48 / 378 (12.70%)
          country  year  sg_law_eqrm_wk_yes_pop -  sg_law_eqrm_wk_yes_pop +
           Europe  1974                 123385892                 118695244
            World  1972                  70537112                  65937310
            World  1975                 145207798                 140468623
            World  1976                 172060729                 167272216
            World  1978                 230445466                 225559733
= Dataset garden/who/2024-09-09/flu_test
  = Table flu_test
    ~ Dim country
+       + New values: 7 / 72382 (0.01%)
                date country
          2024-08-26  Brazil
          2024-09-02  Brazil
          2024-09-09  Brazil
          2024-09-16  Brazil
          2024-11-11  Mexico
-       - Removed values: 36 / 72382 (0.05%)
                date  country
          2024-11-18 Ethiopia
          2024-11-11     Iraq
          2024-11-18  Jamaica
          2024-10-21   Uganda
          2024-10-28   Uganda
    ~ Dim date
+       + New values: 7 / 72382 (0.01%)
          country       date
           Brazil 2024-08-26
           Brazil 2024-09-02
           Brazil 2024-09-09
           Brazil 2024-09-16
           Mexico 2024-11-11
-       - Removed values: 36 / 72382 (0.05%)
           country       date
          Ethiopia 2024-11-18
              Iraq 2024-11-11
           Jamaica 2024-11-18
            Uganda 2024-10-21
            Uganda 2024-10-28
    ~ Column denomcombined (new data, changed data)
+       + New values: 7 / 72382 (0.01%)
          country       date  denomcombined
           Brazil 2024-08-26           6810
           Brazil 2024-09-02           7056
           Brazil 2024-09-09           7321
           Brazil 2024-09-16           6452
           Mexico 2024-11-11            495
-       - Removed values: 36 / 72382 (0.05%)
           country       date  denomcombined
          Ethiopia 2024-11-18            152
              Iraq 2024-11-11             35
           Jamaica 2024-11-18             19
            Uganda 2024-10-21             49
            Uganda 2024-10-28             68
        ~ Changed values: 87 / 72382 (0.12%)
            country       date  denomcombined -  denomcombined +
          Argentina 2024-09-30              166              164
             Brazil 2024-11-04             6224             5906
          Indonesia 2024-08-12               48               47
           Paraguay 2024-05-13              200              201
           Paraguay 2024-08-19              328              327
    ~ Column pcnt_poscombined (new data, changed data)
+       + New values: 7 / 72382 (0.01%)
          country       date  pcnt_poscombined
           Brazil 2024-08-26          5.800294
           Brazil 2024-09-02          5.782313
           Brazil 2024-09-09          6.979921
           Brazil 2024-09-16          7.377557
           Mexico 2024-11-11          8.686869
-       - Removed values: 36 / 72382 (0.05%)
           country       date  pcnt_poscombined
          Ethiopia 2024-11-18         18.421053
              Iraq 2024-11-11         17.142857
           Jamaica 2024-11-18         10.526316
            Uganda 2024-10-21         14.285714
            Uganda 2024-10-28          8.823529
        ~ Changed values: 94 / 72382 (0.13%)
            country       date  pcnt_poscombined -  pcnt_poscombined +
          Argentina 2024-09-30           22.289156            21.95122
             Brazil 2024-11-04            7.583548            7.162208
              Chile 2024-04-29           45.319149           45.689655
              Egypt 2024-09-30            4.118993            4.147465
            Jamaica 2024-09-30            1.041667            1.428571


Legend: +New  ~Modified  -Removed  =Identical  Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet

Automatically updated datasets matching weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included

Edited: 2024-12-02 10:59:05 UTC
Execution time: 30.24 seconds

Member

@lucasrodes lucasrodes left a comment

This LGTM!

I think this is an improvement over the current status quo. We can keep thinking about how to make Jinja friendlier in the long run. Maybe you can create an issue for the future work you suggested, or add it to an existing one.

Thanks for improving the docs, too!

Contributor

@spoonerf spoonerf left a comment

Amazing, thank you for this!

@Marigold Marigold merged commit 9477e5c into master Dec 2, 2024
7 of 8 checks passed
@Marigold Marigold deleted the jinja-previews branch December 2, 2024 10:59

4 participants