📊 antibiotics: microbe total deaths from pathogens (#3691)
* 📊 microbe total deaths from pathogens

* adding total deaths from pathogens

* adding attributable deaths and non-attributable deaths
spoonerf authored Dec 5, 2024
1 parent b20b45b commit 2bbc175
Showing 16 changed files with 462 additions and 9 deletions.
15 changes: 15 additions & 0 deletions dag/health.yml
@@ -969,3 +969,18 @@ steps:
     - data://meadow/antibiotics/2024-12-03/glass_enrolment
   data://grapher/antibiotics/2024-12-03/glass_enrolment:
     - data://garden/antibiotics/2024-12-03/glass_enrolment
+  # MICROBE - total deaths by pathogen
+  data-private://meadow/antibiotics/2024-12-04/microbe_total_pathogens:
+    - snapshot-private://antibiotics/2024-12-04/microbe_total_pathogens.csv
+  data-private://garden/antibiotics/2024-12-04/microbe_total_pathogens:
+    - data-private://meadow/antibiotics/2024-12-04/microbe_total_pathogens
+  data-private://grapher/antibiotics/2024-12-04/microbe_total_pathogens:
+    - data-private://garden/antibiotics/2024-12-04/microbe_total_pathogens
+  # MICROBE - total deaths by pathogen and amr resistance
+  data-private://meadow/antibiotics/2024-12-04/microbe_total_pathogens_amr:
+    - snapshot-private://antibiotics/2024-12-04/microbe_total_pathogens_amr.csv
+  data-private://garden/antibiotics/2024-12-04/microbe_total_pathogens_amr:
+    - data-private://meadow/antibiotics/2024-12-04/microbe_total_pathogens_amr
+    - data-private://garden/antibiotics/2024-12-04/microbe_total_pathogens
+  data-private://grapher/antibiotics/2024-12-04/microbe_total_pathogens_amr:
+    - data-private://garden/antibiotics/2024-12-04/microbe_total_pathogens_amr
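The DAG entries added here chain each dataset from snapshot to meadow to garden to grapher, and the AMR garden step also depends on the garden step for total deaths. As a rough illustration of the build order this implies (not how the ETL itself schedules steps), the sketch below feeds the same dependencies to Python's standard `graphlib`. The `step` and `execution_order` helpers and the `dag` dictionary are hypothetical names introduced for this example.

```python
from graphlib import TopologicalSorter

BASE = "antibiotics/2024-12-04/microbe_total_pathogens"


def step(channel: str, suffix: str = "") -> str:
    """Build a step URI in the same shape as the dag/health.yml entries."""
    prefix = "snapshot-private://" if channel == "snapshot" else f"data-private://{channel}/"
    return f"{prefix}{BASE}{suffix}"


# Each key is a step; its value lists the steps it depends on.
dag = {
    step("meadow"): [step("snapshot", ".csv")],
    step("garden"): [step("meadow")],
    step("grapher"): [step("garden")],
    step("meadow", "_amr"): [step("snapshot", "_amr.csv")],
    # The AMR garden step needs both its own meadow step and the totals garden step.
    step("garden", "_amr"): [step("meadow", "_amr"), step("garden")],
    step("grapher", "_amr"): [step("garden", "_amr")],
}


def execution_order(graph: dict) -> list:
    """Return one valid order in which every step comes after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for uri in execution_order(dag):
        print(uri)
```

Any order produced this way builds the garden totals step before the garden AMR step, which is what the extra dependency line in the DAG is there to guarantee.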
@@ -15,31 +15,31 @@ tables:
   total_pathogen_bloodstream:
     variables:
       value:
-        title: Total deaths from << pathogen >> infections
+        title: Total deaths from << pathogen >> bloodstream infections
         unit: deaths
-        description_short: Estimated number of deaths << pathogen >> infections. << pathogen >> is a {definitions.pathogen_type}.
+        description_short: Estimated number of deaths << pathogen >> bloodstream infections. << pathogen >> is a {definitions.pathogen_type}.
         presentation:
-          title_public: Total deaths from << pathogen >> infections
+          title_public: Total deaths from << pathogen >> bloodstream infections
         display:
           roundingMode: significantFigures
           numSignificantFigures: 3
           name: << pathogen >>
       upper:
-        title: Upper bound of total deaths from << pathogen >> infections
+        title: Upper bound of total deaths from << pathogen >> bloodstream infections
         unit: deaths
-        description_short: Estimated number of deaths << pathogen >> infections. << pathogen >> is a {definitions.pathogen_type}.
+        description_short: Estimated number of deaths << pathogen >> bloodstream infections. << pathogen >> is a {definitions.pathogen_type}.
         presentation:
-          title_public: Upper bound of total deaths from << pathogen >> infections
+          title_public: Upper bound of total deaths from << pathogen >> bloodstream infections
         display:
           roundingMode: significantFigures
           numSignificantFigures: 3
           name: << pathogen >>
       lower:
-        title: Lower bound of total deaths from << pathogen >> infections
+        title: Lower bound of total deaths from << pathogen >> bloodstream infections
         unit: deaths
-        description_short: Estimated number of deaths << pathogen >> infections. << pathogen >> is a {definitions.pathogen_type}.
+        description_short: Estimated number of deaths << pathogen >> bloodstream infections. << pathogen >> is a {definitions.pathogen_type}.
         presentation:
-          title_public: Lower bound of total deaths from << pathogen >> infections
+          title_public: Lower bound of total deaths from << pathogen >> bloodstream infections
         display:
           roundingMode: significantFigures
           numSignificantFigures: 3
@@ -0,0 +1,3 @@
{
  "Global": "World"
}
@@ -0,0 +1,46 @@
# NOTE: To learn more about the fields, hover over their names.
definitions:
  common:
    presentation:
      topic_tags:
        - Antibiotics
  pathogen_type: <% if pathogen_type == "Fungi" %>fungus<% elif pathogen_type == "Viruses" %>virus<% else %><< pathogen_type.lower() >><% endif %>

# Learn more about the available fields:
# http://docs.owid.io/projects/etl/architecture/metadata/reference/
dataset:
  update_period_days: 365

tables:
  microbe_total_pathogens:
    variables:
      value:
        title: Total deaths from << pathogen >> infections
        unit: deaths
        description_short: Estimated number of deaths from << pathogen >> infections. << pathogen >> is a {definitions.pathogen_type}.
        presentation:
          title_public: Total deaths from << pathogen >> infections
        display:
          roundingMode: significantFigures
          numSignificantFigures: 3
          name: << pathogen >>
      upper:
        title: Upper bound of total deaths from << pathogen >> infections
        unit: deaths
        description_short: Upper bound of the estimated number of deaths from << pathogen >> infections. << pathogen >> is a {definitions.pathogen_type}.
        presentation:
          title_public: Upper bound of total deaths from << pathogen >> infections
        display:
          roundingMode: significantFigures
          numSignificantFigures: 3
          name: << pathogen >>
      lower:
        title: Lower bound of total deaths from << pathogen >> infections
        unit: deaths
        description_short: Lower bound of the estimated number of deaths from << pathogen >> infections. << pathogen >> is a {definitions.pathogen_type}.
        presentation:
          title_public: Lower bound of total deaths from << pathogen >> infections
        display:
          roundingMode: significantFigures
          numSignificantFigures: 3
          name: << pathogen >>
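The `pathogen_type` definition in this metadata file is a small template that turns the plural category stored in the data into a singular noun for the description. The sketch below mirrors that branching in plain Python so the behaviour is easy to check. It is not the templating engine the ETL uses; `pathogen_type_label` and `type_sentence` are hypothetical helper names, and the sample inputs are illustrative values rather than values confirmed from the dataset.

```python
def pathogen_type_label(pathogen_type: str) -> str:
    """Mirror the template: special-case two plurals, otherwise lower-case the value."""
    if pathogen_type == "Fungi":
        return "fungus"
    if pathogen_type == "Viruses":
        return "virus"
    return pathogen_type.lower()


def type_sentence(pathogen: str, pathogen_type: str) -> str:
    """Build the closing sentence of description_short for one pathogen."""
    return f"{pathogen} is a {pathogen_type_label(pathogen_type)}."


if __name__ == "__main__":
    # Illustrative inputs only.
    print(type_sentence("Example pathogen", "Fungi"))
    print(type_sentence("Example pathogen", "Bacteria"))
```

Note that the fallback branch only lower-cases its input, so a plural category such as "Bacteria" would be rendered as "a bacteria" unless it gets its own branch.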
@@ -0,0 +1,35 @@
"""Load a meadow dataset and create a garden dataset."""

from etl.data_helpers import geo
from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)


def run(dest_dir: str) -> None:
    #
    # Load inputs.
    #
    # Load meadow dataset.
    ds_meadow = paths.load_dataset("microbe_total_pathogens")

    # Read table from meadow dataset.
    tb = ds_meadow.read("microbe_total_pathogens")

    #
    # Process data.
    #
    tb = geo.harmonize_countries(df=tb, countries_file=paths.country_mapping_path)
    tb = tb.format(["country", "year", "pathogen", "pathogen_type"])

    #
    # Save outputs.
    #
    # Create a new garden dataset with the same metadata as the meadow dataset.
    ds_garden = create_dataset(
        dest_dir, tables=[tb], check_variables_metadata=True, default_metadata=ds_meadow.metadata
    )

    # Save changes in the new garden dataset.
    ds_garden.save()
@@ -0,0 +1,3 @@
{
  "Global": "World"
}
@@ -0,0 +1,42 @@
# NOTE: To learn more about the fields, hover over their names.
definitions:
  common:
    presentation:
      topic_tags:
        - Antibiotics

# Learn more about the available fields:
# http://docs.owid.io/projects/etl/architecture/metadata/reference/
dataset:
  update_period_days: 365

tables:
  microbe_total_pathogens_amr:
    variables:
      amr_attributable_deaths:
        title: Total deaths from infections attributed to AMR, by pathogen
        unit: deaths
        description_short: Estimated number of deaths from infections that are attributed to antimicrobial resistance.
        presentation:
          title_public: Total deaths from infections attributed to AMR, by pathogen
        display:
          roundingMode: significantFigures
          numSignificantFigures: 3
      non_amr_attributable_deaths:
        title: Total global deaths from infections not attributed to AMR, by pathogen
        unit: deaths
        description_short: Estimated number of deaths from infections that are not attributed to antimicrobial resistance.
        presentation:
          title_public: Total global deaths from infections not attributed to AMR, by pathogen
        display:
          roundingMode: significantFigures
          numSignificantFigures: 3
      total_deaths:
        title: Total global deaths from infections
        unit: deaths
        description_short: Estimated number of deaths from infections.
        presentation:
          title_public: Total global deaths from infections
        display:
          roundingMode: significantFigures
          numSignificantFigures: 3
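Every variable in these metadata files sets `roundingMode: significantFigures` with `numSignificantFigures: 3`. The chart code that applies this setting is not part of this commit, so the following is only a sketch of what rounding to three significant figures means numerically, with `round_sig` as a hypothetical helper name.

```python
from math import floor, log10


def round_sig(x: float, sig: int = 3) -> float:
    """Round x to `sig` significant figures (zero is returned unchanged)."""
    if x == 0:
        return x
    # Position of the leading digit decides how many decimals to keep.
    decimals = sig - 1 - floor(log10(abs(x)))
    return round(x, decimals)


if __name__ == "__main__":
    for value in (1234567, 98765.4, 0.012345):
        print(value, "->", round_sig(value))
```

For death counts in the hundreds of thousands or millions, this keeps the displayed value from suggesting more precision than a modelled estimate carries.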
@@ -0,0 +1,55 @@
"""Load a meadow dataset and create a garden dataset."""

from etl.data_helpers import geo
from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)


def run(dest_dir: str) -> None:
    #
    # Load inputs.
    #
    # Load the meadow dataset (AMR-attributable deaths) and the garden dataset of total deaths by pathogen.
    ds_meadow = paths.load_dataset("microbe_total_pathogens_amr")
    ds_total = paths.load_dataset("microbe_total_pathogens")

    # Read both tables, keeping only the central estimates.
    tb = (
        ds_meadow.read("microbe_total_pathogens_amr")
        .drop(columns=["upper", "lower"])
        .rename(columns={"value": "amr_attributable_deaths"})
    )
    tb_total = (
        ds_total.read("microbe_total_pathogens")
        .drop(columns=["upper", "lower"])
        .rename(columns={"value": "total_deaths"})
    )

    #
    # Process data.
    #
    tb = geo.harmonize_countries(
        df=tb,
        countries_file=paths.country_mapping_path,
    )

    # Right join: keep every pathogen present in the totals table.
    tb = tb.merge(tb_total, on=["country", "year", "pathogen", "pathogen_type"], how="right")

    # Pathogens without an AMR estimate are assigned zero attributable deaths.
    tb["amr_attributable_deaths"] = tb["amr_attributable_deaths"].fillna(0)
    tb["non_amr_attributable_deaths"] = tb["total_deaths"] - tb["amr_attributable_deaths"]
    # Use the pathogen as the entity: drop the location and pathogen type, then rename pathogen to country.
    tb = tb.drop(columns=["country", "pathogen_type"]).rename(columns={"pathogen": "country"})

    tb = tb.format(["country", "year"])

    #
    # Save outputs.
    #
    # Create a new garden dataset with the same metadata as the meadow dataset.
    ds_garden = create_dataset(
        dest_dir, tables=[tb], check_variables_metadata=True, default_metadata=ds_meadow.metadata
    )

    # Save changes in the new garden dataset.
    ds_garden.save()
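The garden step above derives non-attributable deaths as total deaths minus AMR-attributable deaths, treating pathogens that are missing from the AMR table as having zero attributable deaths. The sketch below reproduces that arithmetic with plain dictionaries instead of the ETL's table objects. `split_amr_deaths` is a hypothetical helper, and all numbers and pathogen names are made up for illustration.

```python
def split_amr_deaths(total: dict, attributable: dict) -> dict:
    """Combine totals with AMR-attributable deaths, keyed by (year, pathogen).

    Every key in `total` is kept (like the right join in the garden step);
    keys missing from `attributable` count as 0 (like fillna(0)).
    """
    rows = {}
    for key, total_deaths in total.items():
        amr = attributable.get(key, 0)
        rows[key] = {
            "amr_attributable_deaths": amr,
            "non_amr_attributable_deaths": total_deaths - amr,
            "total_deaths": total_deaths,
        }
    return rows


if __name__ == "__main__":
    # Made-up figures, for illustration only.
    totals = {(2019, "Pathogen A"): 1000, (2019, "Pathogen B"): 400}
    amr = {(2019, "Pathogen A"): 250}
    for key, row in split_amr_deaths(totals, amr).items():
        print(key, row)
```

By construction the two components always sum to the total, which is convenient for a stacked chart with one bar per pathogen.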
@@ -0,0 +1,28 @@
"""Load a garden dataset and create a grapher dataset."""

from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)


def run(dest_dir: str) -> None:
    #
    # Load inputs.
    #
    # Load garden dataset.
    ds_garden = paths.load_dataset("microbe_total_pathogens")

    # Read table from garden dataset.
    tb = ds_garden.read("microbe_total_pathogens", reset_index=False)

    #
    # Save outputs.
    #
    # Create a new grapher dataset with the same metadata as the garden dataset.
    ds_grapher = create_dataset(
        dest_dir, tables=[tb], check_variables_metadata=True, default_metadata=ds_garden.metadata
    )

    # Save changes in the new grapher dataset.
    ds_grapher.save()
@@ -0,0 +1,28 @@
"""Load a garden dataset and create a grapher dataset."""

from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)


def run(dest_dir: str) -> None:
    #
    # Load inputs.
    #
    # Load garden dataset.
    ds_garden = paths.load_dataset("microbe_total_pathogens_amr")

    # Read table from garden dataset.
    tb = ds_garden.read("microbe_total_pathogens_amr", reset_index=False)

    #
    # Save outputs.
    #
    # Create a new grapher dataset with the same metadata as the garden dataset.
    ds_grapher = create_dataset(
        dest_dir, tables=[tb], check_variables_metadata=True, default_metadata=ds_garden.metadata
    )

    # Save changes in the new grapher dataset.
    ds_grapher.save()
@@ -0,0 +1,39 @@
"""Load a snapshot and create a meadow dataset."""

from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)


def run(dest_dir: str) -> None:
    #
    # Load inputs.
    #
    # Retrieve snapshot.
    snap = paths.load_snapshot("microbe_total_pathogens.csv")

    # Load data from snapshot.
    tb = snap.read()
    # Sanity checks: each of these columns holds a single value, so it can be dropped safely.
    assert all(tb["Age"] == "All Ages")
    assert all(tb["Sex"] == "Both sexes")
    assert all(tb["Measure"] == "Deaths")
    assert all(tb["Metric"] == "Number")
    assert all(tb["Counterfactual"] == "Total")
    assert all(tb["Infectious syndrome"] == "All infectious syndromes")

    #
    # Process data.
    #
    tb = tb.drop(columns=["Age", "Sex", "Measure", "Metric", "Infectious syndrome", "Counterfactual"])
    tb = tb.rename(columns={"Location": "country", "Year": "year", "Pathogen": "pathogen"})
    # Ensure all columns are snake-case, set an appropriate index, and sort conveniently.
    tb = tb.format(["country", "year", "pathogen"])

    #
    # Save outputs.
    #
    # Create a new meadow dataset with the same metadata as the snapshot.
    ds_meadow = create_dataset(dest_dir, tables=[tb], check_variables_metadata=True, default_metadata=snap.metadata)

    # Save changes in the new meadow dataset.
    ds_meadow.save()
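The meadow step asserts that several columns contain a single expected value before dropping them. Without that check, dropping a column that actually varied could leave duplicate (country, year, pathogen) rows. The sketch below shows the same guard using only the standard library. The column names and expected values come from the step above, while `check_and_drop_constants` and the sample CSV rows are hypothetical.

```python
import csv
import io

# Expected constant values, taken from the assertions in the meadow step.
EXPECTED = {
    "Age": "All Ages",
    "Sex": "Both sexes",
    "Measure": "Deaths",
    "Metric": "Number",
    "Counterfactual": "Total",
    "Infectious syndrome": "All infectious syndromes",
}


def check_and_drop_constants(rows: list, expected: dict) -> list:
    """Verify each expected column is constant, then return rows without those columns."""
    cleaned = []
    for i, row in enumerate(rows):
        for column, value in expected.items():
            if row[column] != value:
                raise ValueError(f"row {i}: {column!r} is {row[column]!r}, expected {value!r}")
        cleaned.append({k: v for k, v in row.items() if k not in expected})
    return cleaned


if __name__ == "__main__":
    # Made-up sample data in the same shape as the snapshot columns.
    sample = io.StringIO(
        "Location,Year,Pathogen,Age,Sex,Measure,Metric,Counterfactual,Infectious syndrome,Value\n"
        "Global,2019,Pathogen A,All Ages,Both sexes,Deaths,Number,Total,All infectious syndromes,1000\n"
    )
    print(check_and_drop_constants(list(csv.DictReader(sample)), EXPECTED))
```

Raising with the offending row and column makes a failed check easier to trace than a bare `assert`, at the cost of a few more lines.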
@@ -0,0 +1,39 @@
"""Load a snapshot and create a meadow dataset."""

from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)


def run(dest_dir: str) -> None:
    #
    # Load inputs.
    #
    # Retrieve snapshot.
    snap = paths.load_snapshot("microbe_total_pathogens_amr.csv")

    # Load data from snapshot.
    tb = snap.read()
    # Sanity checks: each of these columns holds a single value, so it can be dropped safely.
    assert all(tb["Age"] == "All Ages")
    assert all(tb["Sex"] == "Both sexes")
    assert all(tb["Measure"] == "Deaths")
    assert all(tb["Metric"] == "Number")
    assert all(tb["Counterfactual"] == "Attributable")
    assert all(tb["Infectious syndrome"] == "All infectious syndromes")

    #
    # Process data.
    #
    tb = tb.drop(columns=["Age", "Sex", "Measure", "Metric", "Infectious syndrome", "Counterfactual"])
    tb = tb.rename(columns={"Location": "country", "Year": "year", "Pathogen": "pathogen"})
    # Ensure all columns are snake-case, set an appropriate index, and sort conveniently.
    tb = tb.format(["country", "year", "pathogen"])

    #
    # Save outputs.
    #
    # Create a new meadow dataset with the same metadata as the snapshot.
    ds_meadow = create_dataset(dest_dir, tables=[tb], check_variables_metadata=True, default_metadata=snap.metadata)

    # Save changes in the new meadow dataset.
    ds_meadow.save()