It's often the case that we introduce unwanted linebreaks or whitespace in the metadata. These are sometimes rendered in the FASTT or the metadata of an indicator.
Current workflow
At the moment, to make sure that I do not introduce unwanted characters, I do the following:
Work on metadata, data steps.
After importing to the database, I go to a sample indicator's data page preview.
For instance, I might see the following chart, which has an unwanted linebreak before the period of the subtitle sentence:
Then, I'll open the browser inspector, and in the Network tab, look for the URL with the indicator metadata shown. I'll open it and explore to see if there are any other unwanted characters.
{
"id": 971388,
"name": "Deaths from leukaemia among females aged 0-4 year olds\n",
"unit": "deaths",
"createdAt": "2024-08-07T11:04:57.000Z",
"updatedAt": "2024-11-28T10:02:03.000Z",
"coverage": "",
"timespan": "2000-2021",
"datasetId": 6662,
"shortUnit": "",
"columnOrder": 0,
"shortName": "death_count__age_group_years0_4__sex_female__cause_leukaemia",
"catalogPath": "grapher/who/2024-07-30/ghe/ghe#death_count__age_group_years0_4__sex_female__cause_leukaemia",
"dimensions": {...},
"descriptionShort": "Estimated number of deaths from leukaemia among females aged 0-4 year olds\n.\n",
...
}
We observe that not only is there a line break (\n) before the period in descriptionShort, but also:
Linebreaks: One at the end of name and descriptionShort.
Whitespace: a double space in descriptionShort, before "aged 0-4..."
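The manual inspection above could be partly automated. The following is only a minimal sketch, not an existing ETL tool: `CHECKS` and `find_whitespace_issues` are hypothetical names, and the sample is a trimmed copy of the JSON shown earlier. It flags string fields with trailing whitespace, embedded line breaks, repeated spaces, or whitespace before punctuation.

```python
import re

# Hypothetical checks, named for illustration only.
CHECKS = {
    "leading or trailing whitespace": re.compile(r"^\s|\s$"),
    "line break inside text": re.compile(r"\S\s*\n\s*\S"),
    "repeated spaces": re.compile(r" {2,}"),
    "whitespace before punctuation": re.compile(r"\s[.,;:!?]"),
}


def find_whitespace_issues(metadata: dict) -> dict:
    """Return {field: [issue names]} for string fields that look suspicious."""
    issues = {}
    for field, value in metadata.items():
        if not isinstance(value, str):
            continue  # Skip ids, nested objects, etc.
        found = [name for name, pattern in CHECKS.items() if pattern.search(value)]
        if found:
            issues[field] = found
    return issues


# Trimmed copy of the indicator metadata shown above.
sample = {
    "id": 971388,
    "name": "Deaths from leukaemia among females aged 0-4 year olds\n",
    "unit": "deaths",
    "descriptionShort": "Estimated number of deaths from leukaemia among females aged 0-4 year olds\n.\n",
}

for field, found in find_whitespace_issues(sample).items():
    print(f"{field}: {', '.join(found)}")
```

These patterns are deliberately strict, so they may also flag fields where line breaks are intentional (for example, multi-paragraph descriptions); the list of checked fields would likely need tuning.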
I could also check the metadata JSON file in the data/ folder, but it contains all the Jinja syntax and is hard to read:
"death_count": {
"title": "<% if age_group == \"ALLAges\" %>\nTotal deaths from << cause.lower() >> among <% if sex == \"Both sexes\" %>both sexes<% elif sex == \"Male\" %>males<% elif sex == \"Female\" %>females<% endif %>\n<% elif age_group == \"Age-standardized\" %>\nAge-standardized deaths from << cause.lower() >> among <% if sex == \"Both sexes\" %>both sexes<% elif sex == \"Male\" %>males<% elif sex == \"Female\" %>females<% endif %>\n<% else %>\nDeaths from << cause.lower() >> among <% if sex == \"Both sexes\" %>both sexes<% elif sex == \"Male\" %>males<% elif sex == \"Female\" %>females<% endif %> aged <% if age_group == \"ALLAges\" %>\nall ages\n<% elif age_group == \"age-standardized\" %>\nan age-standardized population\n<% elif age_group == \"YEARS0-14\" %>\n0-14 year olds\n<% elif age_group == \"YEARS0-4\" %>\n0-4 year olds\n<% elif age_group == \"YEARS5-14\" %>\n5-14 year olds\n<% elif age_group == \"YEARS15-19\" %>\n15-19 year olds\n<% elif age_group == \"YEARS15-49\" %>\n15-49 year olds\n<% elif age_group == \"YEARS20-24\" %>\n20-24 year olds\n<% elif age_group == \"YEARS25-34\" %>\n25-34 year olds\n<% elif age_group == \"YEARS35-44\" %>\n35-44 year olds\n<% elif age_group == \"YEARS45-54\" %>\n45-54 year olds\n<% elif age_group == \"YEARS50-69\" %>\n50-69 year olds\n<% elif age_group == \"YEARS55-64\" %>\n55-64 year olds\n<% elif age_group == \"YEARS65-74\" %>\n65-74 year olds\n<% elif age_group == \"YEARS70+\" %>\n70+ year olds\n<% elif age_group == \"YEARS75-84\" %>\n75-84 year olds\n<% elif age_group == \"YEARS85PLUS\" %>\n85+ year olds\n<% endif %>\n<% endif %>",
"description_short": "<% if age_group == \"ALLAges\" %>\nEstimated number of deaths from << cause.lower() >> in <% if sex == \"Both sexes\" %>both sexes<% elif sex == \"Male\" %>males<% elif sex == \"Female\" %>females<% endif %>.\n<% elif age_group == \"Age-standardized\" %>\nEstimated number of age-standardized deaths from << cause.lower() >> in <% if sex == \"Both sexes\" %>both sexes<% elif sex == \"Male\" %>males<% elif sex == \"Female\" %>females<% endif %>.\n<% else %>\nEstimated number of deaths from << cause.lower() >> among <% if sex == \"Both sexes\" %>both sexes<% elif sex == \"Male\" %>males<% elif sex == \"Female\" %>females<% endif %> aged <% if age_group == \"ALLAges\" %>\nall ages\n<% elif age_group == \"age-standardized\" %>\nan age-standardized population\n<% elif age_group == \"YEARS0-14\" %>\n0-14 year olds\n<% elif age_group == \"YEARS0-4\" %>\n0-4 year olds\n<% elif age_group == \"YEARS5-14\" %>\n5-14 year olds\n<% elif age_group == \"YEARS15-19\" %>\n15-19 year olds\n<% elif age_group == \"YEARS15-49\" %>\n15-49 year olds\n<% elif age_group == \"YEARS20-24\" %>\n20-24 year olds\n<% elif age_group == \"YEARS25-34\" %>\n25-34 year olds\n<% elif age_group == \"YEARS35-44\" %>\n35-44 year olds\n<% elif age_group == \"YEARS45-54\" %>\n45-54 year olds\n<% elif age_group == \"YEARS50-69\" %>\n50-69 year olds\n<% elif age_group == \"YEARS55-64\" %>\n55-64 year olds\n<% elif age_group == \"YEARS65-74\" %>\n65-74 year olds\n<% elif age_group == \"YEARS70+\" %>\n70+ year olds\n<% elif age_group == \"YEARS75-84\" %>\n75-84 year olds\n<% elif age_group == \"YEARS85PLUS\" %>\n85+ year olds\n<% endif %>.\n<% endif %>",
Comments
My current workaround is a bit complex, and we can't expect everyone to do this. It can also be very time-consuming if it takes a while to bake the dataset and import it to the database.
These unwanted characters originate from our "misuse" of Jinja. So, ideally, we wouldn't insert them into the ETL metadata YAML files in the first place.
But this is a bit tricky because Jinja can be confusing at times. We may need something that helps us here.
A temporary workaround could be to format the text (remove unwanted whitespace) when rendering the JSON metadata files.
At the moment, I think we are doing some formatting only at the very last moment (when the indicator is shown on the site). Still, some unwanted characters make it through (e.g. the linebreak before the period).
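To make the temporary workaround concrete, here is a rough sketch of what such a formatting step might look like. It is an illustration under assumptions, not how the ETL currently renders metadata: `clean_text` and `clean_metadata` are hypothetical helpers, and the default field list is only a guess at which fields are single-line text.

```python
import re


def clean_text(text: str) -> str:
    """Collapse whitespace runs and drop whitespace before punctuation."""
    # Collapse any run of spaces, tabs and line breaks into a single space.
    text = re.sub(r"\s+", " ", text)
    # Remove the space left before sentence punctuation ("olds ." -> "olds.").
    text = re.sub(r" ([.,;:!?])", r"\1", text)
    return text.strip()


def clean_metadata(metadata: dict, fields=("name", "descriptionShort")) -> dict:
    """Return a copy of `metadata` with the given single-line fields cleaned."""
    cleaned = dict(metadata)
    for field in fields:
        if isinstance(cleaned.get(field), str):
            cleaned[field] = clean_text(cleaned[field])
    return cleaned


raw = {
    "name": "Deaths from leukaemia among females aged 0-4 year olds\n",
    "descriptionShort": "Estimated number of deaths from leukaemia among females aged 0-4 year olds\n.\n",
}
print(clean_metadata(raw))
```

Because this collapses every line break, it should only be applied to fields that are meant to be a single line; multi-paragraph fields would lose their paragraph breaks.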
We discussed it in triage. @Marigold already shipped some improvements here, so we might close this and keep the related issues for other kinds of follow-up.