Commit

📊 Preventive chemotherapy for neglected tropical diseases (#2588)
* lymphatic filariasis snap -> garden

* adding schist and soil helminthiases

* getting the ntds to garden

* experimental grapher

* dag

* ntd changes

* fixing sth

* adding schistosomiasis grapher step

* sorting out lymphatic filariasis

* changing no data values

* cleaning producer

* separating out the national coverage indicators for STH

* unborking the metadata

* adding regional schistosomiasis

* adding regions to LF

* adding regional data for STH

* fixing sth aggs

* metadata

* metadata fix

* fixing aggregates in LF

* fixing error

* adding national data to grapher for STH

* format

* updating snapshot names

* removing aggregates where it ignores the drug type dimension

* adding estimated number treated

* adding estimated numbers treated at national level for STH and LF

* adding Lucas's suggestion
spoonerf authored May 21, 2024
1 parent 7bdd47e commit 64ef26e
Showing 22 changed files with 1,231 additions and 0 deletions.
30 changes: 30 additions & 0 deletions dag/health.yml
Original file line number Diff line number Diff line change
@@ -563,10 +563,40 @@ steps:
  data://grapher/who/2024-04-26/avian_influenza_ah5n1:
    - data://garden/who/2024-04-26/avian_influenza_ah5n1


  # WHO Preventive Chemotherapy - Neglected Tropical Diseases
  # Lymphatic filariasis
  data://meadow/neglected_tropical_diseases/2024-05-02/lymphatic_filariasis:
    - snapshot://neglected_tropical_diseases/2024-05-02/lymphatic_filariasis.xlsx
  data://garden/neglected_tropical_diseases/2024-05-02/lymphatic_filariasis:
    - data://meadow/neglected_tropical_diseases/2024-05-02/lymphatic_filariasis
    - data://garden/regions/2023-01-01/regions
  data://grapher/neglected_tropical_diseases/2024-05-02/lymphatic_filariasis:
    - data://garden/neglected_tropical_diseases/2024-05-02/lymphatic_filariasis

  # Schistosomiasis
  data://meadow/neglected_tropical_diseases/2024-05-02/schistosomiasis:
    - snapshot://neglected_tropical_diseases/2024-05-02/schistosomiasis.xlsx
  data://garden/neglected_tropical_diseases/2024-05-02/schistosomiasis:
    - data://meadow/neglected_tropical_diseases/2024-05-02/schistosomiasis
    - data://garden/regions/2023-01-01/regions
  data://grapher/neglected_tropical_diseases/2024-05-02/schistosomiasis:
    - data://garden/neglected_tropical_diseases/2024-05-02/schistosomiasis

  # Soil-transmitted helminthiases
  data://meadow/neglected_tropical_diseases/2024-05-02/soil_transmitted_helminthiases:
    - snapshot://neglected_tropical_diseases/2024-05-02/soil_transmitted_helminthiases.xlsx
  data://garden/neglected_tropical_diseases/2024-05-02/soil_transmitted_helminthiases:
    - data://meadow/neglected_tropical_diseases/2024-05-02/soil_transmitted_helminthiases
    - data://garden/regions/2023-01-01/regions
  data://grapher/neglected_tropical_diseases/2024-05-02/soil_transmitted_helminthiases:
    - data://garden/neglected_tropical_diseases/2024-05-02/soil_transmitted_helminthiases

  # Neglected Tropical Diseases Funding
  data://meadow/neglected_tropical_diseases/2024-05-18/funding:
    - snapshot://neglected_tropical_diseases/2024-05-18/funding.xlsx
  data://garden/neglected_tropical_diseases/2024-05-18/funding:
    - data://meadow/neglected_tropical_diseases/2024-05-18/funding
  data://grapher/neglected_tropical_diseases/2024-05-18/funding:
    - data://garden/neglected_tropical_diseases/2024-05-18/funding
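
Reviewer note: each DAG entry maps a step to the steps it depends on, so every dataset here forms a snapshot -> meadow -> garden -> grapher chain. The sketch below reproduces the schistosomiasis entries as a plain Python dict and orders them with the standard library's `graphlib`. It only illustrates the dependency structure. How the ETL itself resolves the DAG is not shown in this commit, and the variable names are assumptions.

```python
from graphlib import TopologicalSorter

# Step URIs copied from the schistosomiasis entries in dag/health.yml.
base = "neglected_tropical_diseases/2024-05-02/schistosomiasis"
snapshot = f"snapshot://{base}.xlsx"
meadow = f"data://meadow/{base}"
garden = f"data://garden/{base}"
grapher = f"data://grapher/{base}"
regions = "data://garden/regions/2023-01-01/regions"

# Each key is a step; each value is the set of steps it depends on.
dag = {
    meadow: {snapshot},
    garden: {meadow, regions},
    grapher: {garden},
}

# static_order() yields every step after all of its dependencies.
order = list(TopologicalSorter(dag).static_order())
for step in order:
    print(step)
```

Any valid order runs the snapshot before meadow, and both meadow and regions before garden.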

@@ -0,0 +1,84 @@
{
"American Samoa": "American Samoa",
"Angola": "Angola",
"Bangladesh": "Bangladesh",
"Benin": "Benin",
"Brazil": "Brazil",
"Brunei Darussalam": "Brunei",
"Burkina Faso": "Burkina Faso",
"Burundi": "Burundi",
"Cabo Verde": "Cape Verde",
"Cambodia": "Cambodia",
"Cameroon": "Cameroon",
"Central African Republic": "Central African Republic",
"Chad": "Chad",
"Comoros": "Comoros",
"Congo": "Congo",
"Cook Islands": "Cook Islands",
"Costa Rica": "Costa Rica",
"C\u00f4te d'Ivoire": "Cote d'Ivoire",
"Democratic Republic of the Congo": "Democratic Republic of Congo",
"Dominican Republic": "Dominican Republic",
"Egypt": "Egypt",
"Equatorial Guinea": "Equatorial Guinea",
"Eritrea": "Eritrea",
"Ethiopia": "Ethiopia",
"Fiji": "Fiji",
"French Polynesia": "French Polynesia",
"Gabon": "Gabon",
"Gambia": "Gambia",
"Ghana": "Ghana",
"Guinea": "Guinea",
"Guinea-Bissau": "Guinea-Bissau",
"Guyana": "Guyana",
"Haiti": "Haiti",
"India": "India",
"Indonesia": "Indonesia",
"Kenya": "Kenya",
"Kiribati": "Kiribati",
"Lao People's Democratic Republic": "Laos",
"Liberia": "Liberia",
"Madagascar": "Madagascar",
"Malawi": "Malawi",
"Malaysia": "Malaysia",
"Maldives": "Maldives",
"Mali": "Mali",
"Marshall Islands": "Marshall Islands",
"Mauritius": "Mauritius",
"Micronesia (Federated States of)": "Micronesia (country)",
"Mozambique": "Mozambique",
"Myanmar": "Myanmar",
"Nepal": "Nepal",
"New Caledonia": "New Caledonia",
"Niger": "Niger",
"Nigeria": "Nigeria",
"Niue": "Niue",
"Palau": "Palau",
"Papua New Guinea": "Papua New Guinea",
"Philippines": "Philippines",
"Rwanda": "Rwanda",
"Samoa": "Samoa",
"Sao Tome and Principe": "Sao Tome and Principe",
"Senegal": "Senegal",
"Seychelles": "Seychelles",
"Sierra Leone": "Sierra Leone",
"Solomon Islands": "Solomon Islands",
"South Sudan": "South Sudan",
"Sri Lanka": "Sri Lanka",
"Sudan": "Sudan",
"Suriname": "Suriname",
"Thailand": "Thailand",
"Timor-Leste": "East Timor",
"Togo": "Togo",
"Tonga": "Tonga",
"Trinidad and Tobago": "Trinidad and Tobago",
"Tuvalu": "Tuvalu",
"Uganda": "Uganda",
"United Republic of Tanzania": "Tanzania",
"Vanuatu": "Vanuatu",
"Viet Nam": "Vietnam",
"Wallis and Futuna": "Wallis and Futuna",
"Yemen": "Yemen",
"Zambia": "Zambia",
"Zimbabwe": "Zimbabwe"
}
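
Reviewer note: this file maps WHO country names to the harmonized names used downstream. As a rough illustration of what such a mapping does, the sketch below applies three entries from the file with plain pandas and flags names that have no mapping. It is not the implementation of `geo.harmonize_countries`, whose behavior is not shown in this commit. "Atlantis" is a made-up name used only to trigger the unmapped case.

```python
import pandas as pd

# Three entries copied from the mapping file above.
mapping = {
    "Cabo Verde": "Cape Verde",
    "Viet Nam": "Vietnam",
    "Timor-Leste": "East Timor",
}

# "Atlantis" is a hypothetical name that is absent from the mapping.
df = pd.DataFrame({"country": ["Cabo Verde", "Viet Nam", "Atlantis"]})

harmonized = df["country"].map(mapping)

# Names without a mapping come back as NaN; collect them for review.
unmapped = sorted(df.loc[harmonized.isna(), "country"].unique())

# Keep the original name where no mapping exists.
df["country"] = harmonized.fillna(df["country"])

print(df["country"].tolist())
print("Unmapped:", unmapped)
```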
@@ -0,0 +1,73 @@
# NOTE: To learn more about the fields, hover over their names.
definitions:
  common:
    presentation:
      topic_tags:
        - Global Health
      # - Neglected Tropical Diseases # Need to add once the tag is upgraded to topic tag and we have a slug for the page
    processing_level: minor
# Learn more about the available fields:
# http://docs.owid.io/projects/etl/architecture/metadata/reference/
dataset:
  update_period_days: 365

  title: Preventive Chemotherapy (PC) Data Portal
tables:
  lymphatic_filariasis:
    variables:
      current_status_of_mda:
        title: Current status of MDA
        unit: ""
      number_of_ius_covered:
        title: Number of implementation units covered
        unit: ""
        display:
          numDecimalPlaces: 0
      geographical_coverage__pct:
        title: Geographical coverage (%)
        description_short: "Geographical coverage of preventive chemotherapy for [lymphatic filariasis](#dod:lymphatic-filariasis)."
        unit: "%"
        display:
          numDecimalPlaces: 1
      total_population_of_ius:
        title: Total population of implementation units
        description_short: "Total population of implementation units. Implementation units are defined as geographic areas where health interventions are specifically designed, executed, and monitored to control or eliminate neglected tropical diseases effectively."
        unit: "people"
        display:
          numDecimalPlaces: 0
      reported_number_of_people_treated:
        title: Reported number of people treated
        description_short: "Reported number of people treated for [lymphatic filariasis](#dod:lymphatic-filariasis)."
        unit: "people"
        display:
          numDecimalPlaces: 0
      programme__drug__coverage__pct:
        title: Programme coverage
        description_short: "Programme coverage for preventive chemotherapy for [lymphatic filariasis](#dod:lymphatic-filariasis). The share of people who require preventive chemotherapy for [lymphatic filariasis](#dod:lymphatic-filariasis) who actually receive it."
        unit: "%"
        short_unit: "%"
        display:
          numDecimalPlaces: 3
  lymphatic_filariasis_national:
    variables:
      national_coverage__pct:
        title: National coverage
        description_short: "Drug coverage out of estimated population who require it."
        unit: "%"
        short_unit: "%"
        display:
          numDecimalPlaces: 1
      population_requiring_pc_for_lf:
        title: Population requiring preventive chemotherapy for lymphatic filariasis
        description_short: "Population requiring preventive chemotherapy for [lymphatic filariasis](#dod:lymphatic-filariasis)."
        unit: "people"
        display:
          numDecimalPlaces: 0
      estimated_number_of_people_treated:
        title: Estimated number of people treated
        description_short: "Estimated number of people treated for [lymphatic filariasis](#dod:lymphatic-filariasis)."
        description_processing: To calculate the estimated number of people treated, we multiply the population requiring preventive chemotherapy by the national coverage.
        unit: "people"
        display:
          numDecimalPlaces: 0
        processing_level: major
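
Reviewer note: the `description_processing` text for `estimated_number_of_people_treated` describes a simple product. Because national coverage is stored as a percentage, the product is divided by 100. A minimal worked example with made-up numbers (the country names and values are hypothetical) might look like this:

```python
import pandas as pd

# Hypothetical national rows; the values are invented for illustration.
national = pd.DataFrame(
    {
        "country": ["A", "B", "C"],
        "year": [2020, 2020, 2020],
        "national_coverage__pct": [50.0, 80.0, None],
        "population_requiring_pc_for_lf": [1000, 200, 500],
    }
)

# Coverage is a percentage, so divide by 100 to get a number of people.
national["estimated_number_of_people_treated"] = (
    national["national_coverage__pct"] * national["population_requiring_pc_for_lf"] / 100
)

print(national)
```

Country C has no coverage value, so its estimate stays missing rather than becoming zero.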
@@ -0,0 +1,84 @@
"""Load a meadow dataset and create a garden dataset."""

from typing import List

import numpy as np
from owid.catalog import Dataset, Table
from owid.catalog import processing as pr

from etl.data_helpers import geo
from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)
REGIONS = ["North America", "South America", "Europe", "Africa", "Asia", "Oceania", "World"]


def run(dest_dir: str) -> None:
    #
    # Load inputs.
    #
    # Load meadow dataset.
    ds_meadow = paths.load_dataset("lymphatic_filariasis")
    # Load regions dataset.
    ds_regions = paths.load_dataset("regions")

    # Read table from meadow dataset.
    tb = ds_meadow["lymphatic_filariasis"].reset_index()
    #
    # Harmonize countries
    tb = geo.harmonize_countries(df=tb, countries_file=paths.country_mapping_path)
    # Process data.
    # There is a separate row for each combination of drugs used, but `national_coverage__pct` is repeated across
    # those rows, so we extract the national-level columns into a separate table.

    # In many cases there are two identical values of `national_coverage__pct` for a country-year; this de-duplicates them.
    tb_nat = (
        tb[["country", "year", "national_coverage__pct", "population_requiring_pc_for_lf"]].copy().drop_duplicates()
    )
    tb_nat["estimated_number_of_people_treated"] = (
        tb_nat["national_coverage__pct"] * tb_nat["population_requiring_pc_for_lf"] / 100
    )
    tb_nat = add_regions_to_selected_vars(
        tb_nat,
        cols=["country", "year", "population_requiring_pc_for_lf", "estimated_number_of_people_treated"],
        ds_regions=ds_regions,
    )
    # A few country-year combinations still have two different values. We are not sure which is correct,
    # so we keep only the first row for each combination (the `drop_duplicates` default).
    tb_nat = tb_nat.drop_duplicates(subset=["country", "year"])
    tb_nat.metadata.short_name = "lymphatic_filariasis_national"
    # Drop the national-level columns (now in tb_nat) and unused source columns from tb
    tb = tb.drop(
        columns=["national_coverage__pct", "population_requiring_pc_for_lf", "region", "country_code", "mapping_status"]
    )
    # Replace "No data" with NaN
    tb = tb.replace("No data", np.nan)
    # Format the tables
    tb = tb.format(["country", "year", "type_of_mda"])
    tb_nat = tb_nat.format(["country", "year"])
    #
    # Save outputs.
    #
    # Create a new garden dataset with the same metadata as the meadow dataset.
    ds_garden = create_dataset(
        dest_dir, tables=[tb, tb_nat], check_variables_metadata=True, default_metadata=ds_meadow.metadata
    )

    # Save changes in the new garden dataset.
    ds_garden.save()


def add_regions_to_selected_vars(tb: Table, cols: List[str], ds_regions: Dataset) -> Table:
    """
    Add regional aggregates for the selected columns and append them to the original table.
    """

    tb_agg = geo.add_regions_to_table(
        tb[cols],
        regions=REGIONS,
        ds_regions=ds_regions,
        min_num_values_per_year=1,
    )
    # Keep only the aggregate rows, so country rows are not duplicated when concatenating.
    tb_agg = tb_agg[tb_agg["country"].isin(REGIONS)]
    tb = pr.concat([tb, tb_agg], axis=0, ignore_index=True)

    return tb
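
Reviewer note: `add_regions_to_selected_vars` computes regional totals for a few columns, keeps only the aggregate rows, and appends them to the country rows. The sketch below mimics that pattern with plain pandas. It does not use `geo.add_regions_to_table`, and the `REGION_OF` lookup and all values are hypothetical stand-ins for the regions dataset.

```python
import pandas as pd

# Hypothetical country -> region lookup; the real step uses the regions dataset.
REGION_OF = {"Kenya": "Africa", "Ghana": "Africa", "India": "Asia"}

# Made-up country-level values.
df = pd.DataFrame(
    {
        "country": ["Kenya", "Ghana", "India"],
        "year": [2020, 2020, 2020],
        "estimated_number_of_people_treated": [100.0, 50.0, 300.0],
    }
)

# Sum the selected column per region and year.
regional = (
    df.assign(country=df["country"].map(REGION_OF))
    .groupby(["country", "year"], as_index=False)["estimated_number_of_people_treated"]
    .sum()
)

# A "World" row is the sum over all countries for each year.
world = (
    df.groupby("year", as_index=False)["estimated_number_of_people_treated"]
    .sum()
    .assign(country="World")
)

# Append only the aggregate rows to the original country rows.
combined = pd.concat([df, regional, world], ignore_index=True)
print(combined)
```

Percentage columns such as `national_coverage__pct` are left out of the aggregation in the step above, since summing percentages across countries would not be meaningful.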
@@ -0,0 +1,60 @@
{
"Angola": "Angola",
"Benin": "Benin",
"Botswana": "Botswana",
"Brazil": "Brazil",
"Burkina Faso": "Burkina Faso",
"Burundi": "Burundi",
"Cambodia": "Cambodia",
"Cameroon": "Cameroon",
"Central African Republic": "Central African Republic",
"Chad": "Chad",
"China": "China",
"Congo": "Congo",
"C\u00f4te d'Ivoire": "Cote d'Ivoire",
"Democratic Republic of the Congo": "Democratic Republic of Congo",
"Dominican Republic": "Dominican Republic",
"Egypt": "Egypt",
"Equatorial Guinea": "Equatorial Guinea",
"Eritrea": "Eritrea",
"Eswatini": "Eswatini",
"Ethiopia": "Ethiopia",
"Gabon": "Gabon",
"Gambia": "Gambia",
"Ghana": "Ghana",
"Guinea": "Guinea",
"Guinea-Bissau": "Guinea-Bissau",
"Indonesia": "Indonesia",
"Iraq": "Iraq",
"Kenya": "Kenya",
"Lao People's Democratic Republic": "Laos",
"Liberia": "Liberia",
"Libya": "Libya",
"Madagascar": "Madagascar",
"Malawi": "Malawi",
"Mali": "Mali",
"Mauritania": "Mauritania",
"Mozambique": "Mozambique",
"Namibia": "Namibia",
"Niger": "Niger",
"Nigeria": "Nigeria",
"Oman": "Oman",
"Philippines": "Philippines",
"Rwanda": "Rwanda",
"Sao Tome and Principe": "Sao Tome and Principe",
"Saudi Arabia": "Saudi Arabia",
"Senegal": "Senegal",
"Sierra Leone": "Sierra Leone",
"Somalia": "Somalia",
"South Africa": "South Africa",
"South Sudan": "South Sudan",
"Sudan": "Sudan",
"Suriname": "Suriname",
"Togo": "Togo",
"Uganda": "Uganda",
"United Republic of Tanzania": "Tanzania",
"Venezuela (Bolivarian Republic of)": "Venezuela",
"Yemen": "Yemen",
"Zambia": "Zambia",
"Zimbabwe": "Zimbabwe"
}