📊 update life expectancy dependencies (#3675)
* 📊 update life expectancy dependencies
* wip: un_wpp_lt
* 🐛 ci/cd broken hmd grapher
* fix dimension error in jinja
* typo
* wip
* wip
* wip
* wip
* wip
* change version
* re-structure
* minor fix
* wip
* exclude countries
* wip
* wip
* wip
* wip
* wip lt
* man outsurvival probability to woman
* archive
* gini_le
* success message in anomalist
* bug in index access
* wip
* wip
* wip
* upgrade dependency
* archive datasets
1 parent f9d9a42, commit f860f1d
Showing 31 changed files with 2,406 additions and 84 deletions.
51 changes: 51 additions & 0 deletions
etl/steps/data/garden/demography/2024-12-03/broken_limits_le.meta.yml
@@ -0,0 +1,51 @@
# NOTE: To learn more about the fields, hover over their names.
definitions:
  common:
    presentation:
      topic_tags:
        - Life Expectancy


# Learn more about the available fields:
# http://docs.owid.io/projects/etl/architecture/metadata/reference/dataset/
dataset:
  title: "Life Expectancy: Broken limits"
  update_period_days: 365


# Learn more about the available fields:
# http://docs.owid.io/projects/etl/architecture/metadata/reference/tables/
tables:
  broken_limits_le:
    variables:
      life_expectancy:
        title: &le_name Maximum life expectancy
        unit: years
        description_short: |-
          <%- if (sex == 'female') -%>
          Maximum life expectancy recorded in a given year (among females).
          <%- elif (sex == 'male') -%>
          Maximum life expectancy recorded in a given year (among males).
          <%- elif (sex == 'all') -%>
          Maximum life expectancy recorded in a given year.
          <%- endif -%>
        description_key:
          - Period life expectancy is a metric that summarizes death rates across all age groups in one particular year. For a given year, it represents the average lifespan for a hypothetical group of people, if they experienced the same age-specific death rates throughout their lives as the age-specific death rates seen in that particular year.
          - Records are only shown for countries in the Human Mortality Database. Prior to 1950, we use HMD (2023) data. From 1950 onwards, we use UN WPP (2022) data.
        display:
          name: *le_name
        presentation:
          title_public: *le_name
          title_variant: ""
          attribution_short: HMD; UN WPP
          topic_tags:
            - Life Expectancy
          grapher_config:
            hasMapTab: true

      country_with_max_le:
        title: Country with yearly maximum life expectancy
        unit: ""
        description_short: |-
          Name of the country with the yearly maximum life expectancy registered<%- if (sex == 'female') %> among females<% elif (sex == 'male') %> among males<% endif -%>.
        description_processing: This indicator is meant to be used as an auxiliary indicator.
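The `<% ... %>` tags in `description_short` look like Jinja block tags with custom delimiters, so the text varies with the `sex` dimension. The sketch below shows how the `country_with_max_le` description might render for each value. It assumes the `jinja2` package is installed and that only the block delimiters are customised; the environment setup and the `render_description` helper are illustrative, not the ETL's own rendering code.

```python
from jinja2 import Environment

# Assumption: block tags use "<%" / "%>" as in the metadata file.
# Variable and comment delimiters are left at Jinja's defaults here.
env = Environment(block_start_string="<%", block_end_string="%>")

# Template text copied from the `country_with_max_le` description_short.
TEMPLATE = (
    "Name of the country with the yearly maximum life expectancy registered"
    "<%- if (sex == 'female') %> among females"
    "<% elif (sex == 'male') %> among males"
    "<% endif -%>."
)


def render_description(sex: str) -> str:
    """Render the description for one value of the `sex` dimension."""
    return env.from_string(TEMPLATE).render(sex=sex)


rendered = {sex: render_description(sex) for sex in ("female", "male", "all")}
for sex, text in rendered.items():
    print(f"{sex}: {text}")
```

For `sex == 'all'` neither branch matches, so the sentence ends right after "registered".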
76 changes: 76 additions & 0 deletions
etl/steps/data/garden/demography/2024-12-03/broken_limits_le.py
@@ -0,0 +1,76 @@
"""Load a meadow dataset and create a garden dataset.

We only consider countries that are present in HMD. Additionally, for each of these countries we only keep entries from the year it first appears in the HMD dataset onwards (even for the post-1950 period, where we use UN WPP data).
"""

from owid.catalog import Table

from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)
# Year to start tracking. Note that in the first years, few countries have data. Hence, we start in a later year, where more countries have data.
YEAR_FIRST = 1840


def run(dest_dir: str) -> None:
    #
    # Load inputs.
    #
    # Load meadow dataset.
    ds_meadow = paths.load_dataset("life_tables")
    ds_hmd = paths.load_dataset("hmd")

    # Read table from meadow dataset.
    tb = ds_meadow.read("life_tables", reset_index=False)
    tb_hmd = ds_hmd.read("life_tables")

    #
    # Process data.
    #
    # Filter relevant dimensions
    tb = tb.loc[(slice(None), slice(None), slice(None), "0", "period"), ["life_expectancy"]].reset_index()

    # Keep relevant columns and rows
    tb = tb.drop(columns=["type", "age"]).dropna()

    # Rename column
    tb = tb.rename(columns={"location": "country"})

    # Get country-sex and first year of LE reported in HMD
    tb_hmd = get_first_year_of_country_in_hmd(tb_hmd)

    # Only preserve countries coming from HMD
    tb = tb.merge(tb_hmd, on=["country", "sex"], suffixes=("", "_min"))
    tb = tb[tb["year"] >= tb["year_min"]].drop(columns=["year_min"])

    # Get max for each year
    tb = tb.loc[tb.groupby(["year", "sex"], observed=True)["life_expectancy"].idxmax()]

    # Organise columns
    tb["country_with_max_le"] = tb["country"]
    tb["country"] = tb["country"] + " " + tb["year"].astype("string")

    # First year
    tb = tb[tb["year"] >= YEAR_FIRST]

    # Set index
    tb = tb.format(["country", "year", "sex"], short_name="broken_limits_le")

    #
    # Save outputs.
    #
    # Create a new garden dataset with the same metadata as the meadow dataset.
    ds_garden = create_dataset(
        dest_dir, tables=[tb], check_variables_metadata=True, default_metadata=ds_meadow.metadata
    )

    # Save changes in the new garden dataset.
    ds_garden.save()


def get_first_year_of_country_in_hmd(tb_hmd: Table) -> Table:
    tb_hmd = tb_hmd.loc[(tb_hmd["type"] == "period") & (tb_hmd["age"] == "0")]
    tb_hmd = tb_hmd.loc[:, ["country", "year", "sex", "life_expectancy"]].dropna()
    tb_hmd = tb_hmd.groupby(["country", "sex"], observed=True, as_index=False)["year"].min()
    return tb_hmd
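The core of the step is the merge on `["country", "sex"]`, the `year >= year_min` filter, and the `groupby(...).idxmax()` lookup. The sketch below reproduces that logic on a toy table, using plain pandas instead of `owid.catalog.Table`. The data values, the `hmd_first_year` frame, and the `yearly_record_holders` helper are all made up for illustration.

```python
import pandas as pd

# Hypothetical toy data; the life expectancy values are invented.
life_tables = pd.DataFrame(
    {
        "country": ["Sweden", "Sweden", "Norway", "Norway", "Japan", "Japan"],
        "year": [1900, 1901, 1900, 1901, 1900, 1901],
        "sex": ["all"] * 6,
        "life_expectancy": [52.0, 53.0, 53.5, 54.0, 60.0, 61.0],
    }
)

# Hypothetical first year in which each country-sex pair appears in HMD.
hmd_first_year = pd.DataFrame(
    {"country": ["Sweden", "Norway"], "sex": ["all", "all"], "year": [1900, 1901]}
)


def yearly_record_holders(tb: pd.DataFrame, first_year: pd.DataFrame) -> pd.DataFrame:
    """Return, per (year, sex), the row with the highest life expectancy."""
    # Inner merge: countries missing from `first_year` (Japan here) are dropped.
    merged = tb.merge(first_year, on=["country", "sex"], suffixes=("", "_min"))
    # Keep only entries from the first covered year onwards.
    merged = merged[merged["year"] >= merged["year_min"]].drop(columns=["year_min"])
    # idxmax gives the index label of the maximum within each group.
    idx = merged.groupby(["year", "sex"], observed=True)["life_expectancy"].idxmax()
    return merged.loc[idx].reset_index(drop=True)


records = yearly_record_holders(life_tables, hmd_first_year)
print(records)
```

In this toy data Norway has the higher value in 1900, but that row is discarded because Norway's first covered year is 1901, so Sweden holds the 1900 record. Japan never appears because the inner merge removes countries that are absent from the first-year table.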