📊 life expectancy #3681

Closed
39 commits
4325061
📊 update life expectancy dependencies
lucasrodes Dec 2, 2024
97c57fa
wip: un_wpp_lt
lucasrodes Dec 2, 2024
caa66ae
🐛 ci/cd broken hmd grapher
lucasrodes Dec 2, 2024
8b1e4bd
fix dimension error in jinja
lucasrodes Dec 2, 2024
e9fdd2b
Merge branch 'bug-cicd-hmd-grapher' into data-life-expectancy-depende…
lucasrodes Dec 2, 2024
0eb2711
typo
lucasrodes Dec 2, 2024
09caad8
Merge branch 'bug-cicd-hmd-grapher' into data-life-expectancy-depende…
lucasrodes Dec 2, 2024
d8452fd
wip
lucasrodes Dec 2, 2024
2ee7a8e
wip
lucasrodes Dec 2, 2024
67edeed
Merge branch 'bug-cicd-hmd-grapher' into data-life-expectancy-depende…
lucasrodes Dec 2, 2024
4aa1b77
wip
lucasrodes Dec 2, 2024
291e83d
wip
lucasrodes Dec 2, 2024
f7b467e
Merge branch 'bug-cicd-hmd-grapher' into data-life-expectancy-depende…
lucasrodes Dec 2, 2024
1a81580
wip
lucasrodes Dec 2, 2024
d7bfab1
change version
lucasrodes Dec 2, 2024
c60ca9f
Merge branch 'bug-cicd-hmd-grapher' into data-life-expectancy-depende…
lucasrodes Dec 3, 2024
21d22ca
re-structure
lucasrodes Dec 3, 2024
29dd9a0
minor fix
lucasrodes Dec 3, 2024
b2f770f
fix
lucasrodes Dec 3, 2024
046961e
wip
lucasrodes Dec 3, 2024
2e27d40
exclude countries
lucasrodes Dec 3, 2024
1022224
wip
lucasrodes Dec 3, 2024
6389add
wip
lucasrodes Dec 3, 2024
6963099
wip
lucasrodes Dec 3, 2024
2075e09
wip
lucasrodes Dec 3, 2024
f25f084
wip lt
lucasrodes Dec 3, 2024
2d635f4
man outsurvival probability to woman
lucasrodes Dec 3, 2024
8a5a0fd
archive
lucasrodes Dec 3, 2024
1d41a0b
gini_le
lucasrodes Dec 3, 2024
b41011b
success message in anomalist
lucasrodes Dec 3, 2024
4cf858a
bug in index access
lucasrodes Dec 3, 2024
813aa1c
wip
lucasrodes Dec 3, 2024
80f111e
wip
lucasrodes Dec 3, 2024
2a2ce54
wip
lucasrodes Dec 3, 2024
7647f37
upgrade dependency
lucasrodes Dec 3, 2024
6df6d57
archive datasets
lucasrodes Dec 3, 2024
723da26
Merge branch 'master' into data-life-expectancy-dependencies
lucasrodes Dec 3, 2024
a55e25b
📊 life expectancy
lucasrodes Dec 3, 2024
11568f7
Merge branch 'data-life-expectancy-dependencies' into data-life-expec…
lucasrodes Dec 3, 2024
3 changes: 2 additions & 1 deletion apps/wizard/app_pages/anomalist/app.py
@@ -819,6 +819,7 @@ def _score_table(df: pd.DataFrame) -> pd.DataFrame:
# Show controls only if needed
if len(items) > items_per_page:
pagination.show_controls(mode="bar")

else:
st.success("Ha! We did not find any anomalies in the selected datasets! What were the odds of that?")
# Reset state
set_states({"anomalist_datasets_submitted": False})
48 changes: 48 additions & 0 deletions dag/archive/demography.yml
@@ -70,3 +70,51 @@ steps:
- data://garden/hmd/2023-09-19/hmd
data://grapher/demography/2023-09-27/survivor_percentiles:
- data://garden/demography/2023-09-27/survivor_percentiles

# Phi-gender life expectancy inequality
data://garden/demography/2023-10-03/phi_gender_le:
- data://garden/demography/2023-10-03/life_tables
data://grapher/demography/2023-10-03/phi_gender_le:
- data://garden/demography/2023-10-03/phi_gender_le

# Broken limits of Life Expectancy
data://garden/demography/2023-10-20/broken_limits_le:
- data://garden/demography/2023-10-03/life_tables
- data://garden/hmd/2023-09-19/hmd
data://grapher/demography/2023-10-20/broken_limits_le:
- data://garden/demography/2023-10-20/broken_limits_le

# Gini Life Expectancy Inequality
data://garden/demography/2023-10-04/gini_le:
- data://garden/demography/2023-10-03/life_tables
data://grapher/demography/2023-10-04/gini_le:
- data://garden/demography/2023-10-04/gini_le

# HMD
data://meadow/hmd/2023-09-19/hmd:
- snapshot://hmd/2023-09-18/hmd.zip
data://garden/hmd/2023-09-19/hmd:
- data://meadow/hmd/2023-09-19/hmd
data://grapher/hmd/2023-09-19/hmd:
- data://garden/hmd/2023-09-19/hmd
# UN WPP Life Tables
data://meadow/un/2023-10-02/un_wpp_lt:
- snapshot://un/2023-10-02/un_wpp_lt_all.zip
- snapshot://un/2023-10-02/un_wpp_lt_f.zip
- snapshot://un/2023-10-02/un_wpp_lt_m.zip
data://garden/un/2023-10-02/un_wpp_lt:
- data://meadow/un/2023-10-02/un_wpp_lt
# UN WPP + HMD Life Tables
data://garden/demography/2023-10-03/life_tables:
- data://garden/hmd/2023-09-19/hmd
- data://garden/un/2023-10-02/un_wpp_lt
data://grapher/demography/2023-10-04/life_tables:
- data://garden/demography/2023-10-03/life_tables
# OMM: Life Expectancy
data://garden/demography/2023-10-09/life_expectancy:
- data://garden/demography/2023-10-03/life_tables
- data://garden/demography/2023-10-10/zijdeman_et_al_2015
- data://garden/demography/2023-10-10/riley_2005
- data://garden/un/2022-07-11/un_wpp
data://grapher/demography/2023-10-10/life_expectancy:
- data://garden/demography/2023-10-09/life_expectancy
3 changes: 1 addition & 2 deletions dag/covid.yml
@@ -27,7 +27,7 @@ steps:
- data://garden/regions/2023-01-01/regions
# Demography
- data://garden/demography/2024-07-15/population
- data://garden/demography/2023-10-09/life_expectancy
- data://garden/demography/2024-12-03/life_expectancy
- data://garden/un/2024-07-12/un_wpp
# Econ
- data://garden/wb/2024-10-07/world_bank_pip
Expand Down Expand Up @@ -325,4 +325,3 @@ steps:
- grapher://grapher/covid/latest/covax
- grapher://grapher/covid/latest/infections_model
- grapher://grapher/covid/latest/xm_who
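
The change above repoints one dependency from the `2023-10-09` to the `2024-12-03` version of `life_expectancy`. A minimal sketch of how such stale references could be spotted is shown below. It assumes step URIs follow the `scheme://channel/namespace/version/name` shape seen in these DAG files and that versions are ISO dates (so string comparison orders them correctly). The helper names `parse_step` and `find_stale_dependencies` are hypothetical and not part of the ETL codebase.

```python
def parse_step(uri: str) -> tuple[str, str]:
    """Split a step URI into a version-free key and its version string."""
    scheme, path = uri.split("://", 1)
    channel, namespace, version, name = path.split("/")
    return f"{scheme}://{channel}/{namespace}/{name}", version


def find_stale_dependencies(defined_steps: list[str], dependencies: list[str]) -> list[str]:
    """Return dependencies for which a newer version of the same step is defined."""
    latest: dict[str, str] = {}
    for uri in defined_steps:
        key, version = parse_step(uri)
        # ISO dates sort chronologically when compared as strings.
        latest[key] = max(latest.get(key, version), version)

    stale = []
    for uri in dependencies:
        key, version = parse_step(uri)
        if key in latest and version < latest[key]:
            stale.append(uri)
    return stale


defined = [
    "data://garden/demography/2023-10-09/life_expectancy",
    "data://garden/demography/2024-12-03/life_expectancy",
    "data://garden/un/2024-07-12/un_wpp",
]
used = [
    "data://garden/demography/2023-10-09/life_expectancy",
    "data://garden/un/2024-07-12/un_wpp",
]
stale = find_stale_dependencies(defined, used)
print(stale)
```

A check like this would not handle non-date versions such as `latest`; those would need to be excluded or treated separately.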

137 changes: 64 additions & 73 deletions dag/demography.yml
@@ -111,17 +111,32 @@ steps:
data://grapher/demography/2023-07-03/world_population_comparison:
- data://garden/demography/2023-06-27/world_population_comparison

# Maddison working paper (2022)
data://meadow/ggdc/2024-01-19/maddison_federico_paper:
- snapshot://ggdc/2024-01-19/maddison_federico_paper.xlsx
data://garden/ggdc/2024-01-19/maddison_federico_paper:
- data://meadow/ggdc/2024-01-19/maddison_federico_paper

# UN WPP experiments
data://garden/un/2024-03-14/un_wpp_most:
- data://garden/un/2022-07-11/un_wpp
data://grapher/un/2024-03-14/un_wpp_most:
- data://garden/un/2024-03-14/un_wpp_most
########################################################################
# Life expectancy #
########################################################################

# HMD
data://meadow/hmd/2023-09-19/hmd:
- snapshot://hmd/2023-09-18/hmd.zip
data://garden/hmd/2023-09-19/hmd:
- data://meadow/hmd/2023-09-19/hmd
data://grapher/hmd/2023-09-19/hmd:
- data://garden/hmd/2023-09-19/hmd
# Zijdeman et al
data://meadow/demography/2023-10-10/zijdeman_et_al_2015:
- snapshot://demography/2023-10-10/zijdeman_et_al_2015.xlsx
data://garden/demography/2023-10-10/zijdeman_et_al_2015:
- data://meadow/demography/2023-10-10/zijdeman_et_al_2015

# Riley
data://meadow/demography/2023-10-10/riley_2005:
- snapshot://demography/2023-10-10/riley_2005.pdf
data://garden/demography/2023-10-10/riley_2005:
- data://meadow/demography/2023-10-10/riley_2005

# Human Mortality Database
data://meadow/hmd/2024-12-01/hmd:
@@ -131,53 +146,54 @@ steps:
data://grapher/hmd/2024-12-01/hmd:
- data://garden/hmd/2024-12-01/hmd

# Gini Life Expectancy Inequality
data://garden/demography/2023-10-04/gini_le:
- data://garden/demography/2023-10-03/life_tables
data://grapher/demography/2023-10-04/gini_le:
- data://garden/demography/2023-10-04/gini_le

# Phi-gender life expectancy inequality
data://garden/demography/2023-10-03/phi_gender_le:
- data://garden/demography/2023-10-03/life_tables
data://grapher/demography/2023-10-03/phi_gender_le:
- data://garden/demography/2023-10-03/phi_gender_le

# UN WPP Life Tables
data://meadow/un/2023-10-02/un_wpp_lt:
- snapshot://un/2023-10-02/un_wpp_lt_all.zip
- snapshot://un/2023-10-02/un_wpp_lt_f.zip
- snapshot://un/2023-10-02/un_wpp_lt_m.zip
data://garden/un/2023-10-02/un_wpp_lt:
- data://meadow/un/2023-10-02/un_wpp_lt
data://meadow/un/2024-12-02/un_wpp_lt:
- snapshot://un/2024-12-02/un_wpp_lt_m.csv
- snapshot://un/2024-12-02/un_wpp_lt_all.csv
- snapshot://un/2024-12-02/un_wpp_lt_f.csv
data://garden/un/2024-12-02/un_wpp_lt:
- data://meadow/un/2024-12-02/un_wpp_lt

# UN WPP + HMD Life Tables
data://garden/demography/2023-10-03/life_tables:
- data://garden/hmd/2023-09-19/hmd
- data://garden/un/2023-10-02/un_wpp_lt
data://grapher/demography/2023-10-04/life_tables:
- data://garden/demography/2023-10-03/life_tables

# Zijdeman et al
data://meadow/demography/2023-10-10/zijdeman_et_al_2015:
- snapshot://demography/2023-10-10/zijdeman_et_al_2015.xlsx
data://garden/demography/2023-10-10/zijdeman_et_al_2015:
- data://meadow/demography/2023-10-10/zijdeman_et_al_2015
# Survivorship ages (HMD-derived)
data://garden/demography/2024-12-02/survivor_percentiles:
- data://garden/hmd/2024-12-01/hmd
data://grapher/demography/2024-12-02/survivor_percentiles:
- data://garden/demography/2024-12-02/survivor_percentiles

# Riley
data://meadow/demography/2023-10-10/riley_2005:
- snapshot://demography/2023-10-10/riley_2005.pdf
data://garden/demography/2023-10-10/riley_2005:
- data://meadow/demography/2023-10-10/riley_2005
# UN WPP + HMD Life Tables
data://garden/demography/2024-12-03/life_tables:
- data://garden/hmd/2024-12-01/hmd
- data://garden/un/2024-12-02/un_wpp_lt
data://grapher/demography/2024-12-03/life_tables:
- data://garden/demography/2024-12-03/life_tables

# OMM: Life Expectancy
data://garden/demography/2023-10-09/life_expectancy:
- data://garden/demography/2023-10-03/life_tables
- data://garden/demography/2023-10-10/zijdeman_et_al_2015
data://garden/demography/2024-12-03/life_expectancy:
- data://garden/demography/2023-10-10/riley_2005
- data://garden/un/2022-07-11/un_wpp
data://grapher/demography/2023-10-10/life_expectancy:
- data://garden/demography/2023-10-09/life_expectancy
- data://garden/demography/2023-10-10/zijdeman_et_al_2015
- data://garden/demography/2024-12-03/life_tables
- data://garden/un/2024-07-12/un_wpp
data://grapher/demography/2024-12-03/life_expectancy:
- data://garden/demography/2024-12-03/life_expectancy

# Broken limits of Life Expectancy
data://garden/demography/2024-12-03/broken_limits_le:
- data://garden/hmd/2024-12-01/hmd
- data://garden/demography/2024-12-03/life_tables
data://grapher/demography/2024-12-03/broken_limits_le:
- data://garden/demography/2024-12-03/broken_limits_le

# Phi-gender life expectancy inequality
data://garden/demography/2024-12-03/phi_gender_le:
- data://garden/demography/2024-12-03/life_tables
data://grapher/demography/2024-12-03/phi_gender_le:
- data://garden/demography/2024-12-03/phi_gender_le

# Gini Life Expectancy Inequality
data://garden/demography/2024-12-03/gini_le:
- data://garden/demography/2024-12-03/life_tables
data://grapher/demography/2024-12-03/gini_le:
- data://garden/demography/2024-12-03/gini_le

# Life Expectancy OECD
data://meadow/oecd/2023-10-11/life_expectancy_birth:
@@ -187,13 +203,6 @@ steps:
data://grapher/oecd/2023-10-11/life_expectancy_birth:
- data://garden/oecd/2023-10-11/life_expectancy_birth

# Broken limits of Life Expectancy
data://garden/demography/2023-10-20/broken_limits_le:
- data://garden/demography/2023-10-03/life_tables
- data://garden/hmd/2023-09-19/hmd
data://grapher/demography/2023-10-20/broken_limits_le:
- data://garden/demography/2023-10-20/broken_limits_le

# Contribution to sex gap in Life Expectancy
data://meadow/demography/2023-11-08/le_sex_gap_age_contribution:
- snapshot://demography/2023-11-08/le_sex_gap_age_contribution.zip
@@ -210,18 +219,6 @@ steps:
data://grapher/demography/2023-11-08/modal_age_death:
- data://garden/demography/2023-11-08/modal_age_death

# Maddison working paper (2022)
data://meadow/ggdc/2024-01-19/maddison_federico_paper:
- snapshot://ggdc/2024-01-19/maddison_federico_paper.xlsx
data://garden/ggdc/2024-01-19/maddison_federico_paper:
- data://meadow/ggdc/2024-01-19/maddison_federico_paper

# UN WPP experiments
data://garden/un/2024-03-14/un_wpp_most:
- data://garden/un/2022-07-11/un_wpp
data://grapher/un/2024-03-14/un_wpp_most:
- data://garden/un/2024-03-14/un_wpp_most

########################################################################
# Fertility #
########################################################################
@@ -247,9 +244,3 @@ steps:
- data://meadow/demography/2024-11-26/multiple_births
data://grapher/demography/2024-11-26/multiple_births:
- data://garden/demography/2024-11-26/multiple_births

# Survivorship ages (HMD-derived)
data://garden/demography/2024-12-02/survivor_percentiles:
- data://garden/hmd/2024-12-01/hmd
data://grapher/demography/2024-12-02/survivor_percentiles:
- data://garden/demography/2024-12-02/survivor_percentiles
Original file line number Diff line number Diff line change
@@ -0,0 +1,51 @@
# NOTE: To learn more about the fields, hover over their names.
definitions:
  common:
    presentation:
      topic_tags:
        - Life Expectancy


# Learn more about the available fields:
# http://docs.owid.io/projects/etl/architecture/metadata/reference/dataset/
dataset:
  title: "Life Expectancy: Broken limits"
  update_period_days: 365


# Learn more about the available fields:
# http://docs.owid.io/projects/etl/architecture/metadata/reference/tables/
tables:
  broken_limits_le:
    variables:
      life_expectancy:
        title: &le_name Maximum life expectancy
        unit: years
        description_short: |-
          <%- if (sex == 'female') -%>
          Maximum life expectancy recorded in a given year (among females).
          <%- elif (sex == 'male') -%>
          Maximum life expectancy recorded in a given year (among males).
          <%- elif (sex == 'all') -%>
          Maximum life expectancy recorded in a given year.
          <%- endif -%>
        description_key:
          - Period life expectancy is a metric that summarizes death rates across all age groups in one particular year. For a given year, it represents the average lifespan for a hypothetical group of people, if they experienced the same age-specific death rates throughout their lives as the age-specific death rates seen in that particular year.
          - Records are only shown for countries in the Human Mortality Database. Prior to 1950, we use HMD (2023) data. From 1950 onwards, we use UN WPP (2022) data.
        display:
          name: *le_name
        presentation:
          title_public: *le_name
          title_variant: ""
          attribution_short: HMD; UN WPP
          topic_tags:
            - Life Expectancy
          grapher_config:
            hasMapTab: true

      country_with_max_le:
        title: Country with yearly maximum life expectancy
        unit: ""
        description_short: |-
          Name of the country with the yearly maximum life expectancy registered<%- if (sex == 'female') %> among females<% elif (sex == 'male') %> among males<% endif -%>.
        description_processing: This indicator is meant to be used as an auxiliary indicator.
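
The `description_short` fields above are templates that branch on the `sex` dimension. The following sketch shows how the `country_with_max_le` template could expand for each value. It assumes the `<% ... %>` markers behave like Jinja block delimiters, and it configures `jinja2` that way by hand. How the ETL actually sets up its template environment is not shown in this diff, so treat the `Environment` arguments as an assumption.

```python
from jinja2 import Environment

# Assumption: <% ... %> are Jinja block delimiters; "-" trims adjacent whitespace.
env = Environment(block_start_string="<%", block_end_string="%>")

# Template text copied from the country_with_max_le description_short above.
TEMPLATE = (
    "Name of the country with the yearly maximum life expectancy registered"
    "<%- if (sex == 'female') %> among females"
    "<% elif (sex == 'male') %> among males<% endif -%>."
)


def render(sex: str) -> str:
    """Render the description for one value of the sex dimension."""
    return env.from_string(TEMPLATE).render(sex=sex)


for sex in ("female", "male", "all"):
    print(f"{sex}: {render(sex)}")
```

When `sex` is `all`, neither branch matches, so the sentence ends right after "registered".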
76 changes: 76 additions & 0 deletions etl/steps/data/garden/demography/2024-12-03/broken_limits_le.py
@@ -0,0 +1,76 @@
"""Load a meadow dataset and create a garden dataset.

We only consider data from countries that are present in HMD. Additionally, for these countries we only consider entries from the year they first appear in the HMD dataset onwards (even if for that period we use UN WPP data, i.e. post-1950).
"""

from owid.catalog import Table

from etl.helpers import PathFinder, create_dataset

# Get paths and naming conventions for current step.
paths = PathFinder(__file__)
# Year to start tracking. Note that in the first years, few countries have data. Hence, we start in a later year, where more countries have data.
YEAR_FIRST = 1840


def run(dest_dir: str) -> None:
    #
    # Load inputs.
    #
    # Load input datasets (combined life tables and HMD).
    ds_meadow = paths.load_dataset("life_tables")
    ds_hmd = paths.load_dataset("hmd")

    # Read tables from the input datasets.
    tb = ds_meadow.read("life_tables", reset_index=False)
    tb_hmd = ds_hmd.read("life_tables")

    #
    # Process data.
    #
    # Filter relevant dimensions: period life expectancy at age 0
    tb = tb.loc[(slice(None), slice(None), slice(None), "0", "period"), ["life_expectancy"]].reset_index()

    # Keep relevant columns and rows
    tb = tb.drop(columns=["type", "age"]).dropna()

    # Rename column
    tb = tb.rename(columns={"location": "country"})

    # Get country-sex and first year of LE reported in HMD
    tb_hmd = get_first_year_of_country_in_hmd(tb_hmd)

    # Only preserve countries coming from HMD, from their first HMD year onwards
    tb = tb.merge(tb_hmd, on=["country", "sex"], suffixes=("", "_min"))
    tb = tb[tb["year"] >= tb["year_min"]].drop(columns=["year_min"])

    # Get max for each year
    tb = tb.loc[tb.groupby(["year", "sex"], observed=True)["life_expectancy"].idxmax()]

    # Organise columns
    tb["country_with_max_le"] = tb["country"]
    tb["country"] = tb["country"] + " " + tb["year"].astype("string")

    # First year
    tb = tb[tb["year"] >= YEAR_FIRST]

    # Set index
    tb = tb.format(["country", "year", "sex"], short_name="broken_limits_le")

    #
    # Save outputs.
    #
    # Create a new garden dataset, using the metadata of the input life tables dataset as default.
    ds_garden = create_dataset(
        dest_dir, tables=[tb], check_variables_metadata=True, default_metadata=ds_meadow.metadata
    )

    # Save changes in the new garden dataset.
    ds_garden.save()


def get_first_year_of_country_in_hmd(tb_hmd: Table) -> Table:
    """Get, for each country and sex, the first year with period life expectancy at birth in HMD."""
    tb_hmd = tb_hmd.loc[(tb_hmd["type"] == "period") & (tb_hmd["age"] == "0")]
    tb_hmd = tb_hmd.loc[:, ["country", "year", "sex", "life_expectancy"]].dropna()
    tb_hmd = tb_hmd.groupby(["country", "sex"], observed=True, as_index=False)["year"].min()
    return tb_hmd
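
The core of the step above is a merge, a year filter and a `groupby(...).idxmax()` lookup. The sketch below reproduces that sequence on toy data with plain `pandas` rather than `owid.catalog.Table`. All values are made up for illustration, and the frame names `le`, `first_year` and `record` are hypothetical.

```python
import pandas as pd

# Toy stand-in for the combined life-table data (values are made up).
le = pd.DataFrame(
    {
        "country": ["Sweden", "Sweden", "Norway", "Norway", "Japan", "Japan"],
        "year": [1900, 1901, 1900, 1901, 1900, 1901],
        "sex": ["all"] * 6,
        "life_expectancy": [52.0, 53.0, 54.0, 55.0, 60.0, 61.0],
    }
)

# Toy stand-in for the first year each country-sex pair appears in HMD.
# Japan is deliberately absent, and Norway only enters in 1901.
first_year = pd.DataFrame(
    {"country": ["Sweden", "Norway"], "sex": ["all", "all"], "year": [1900, 1901]}
)

# The inner merge drops countries missing from HMD; the filter drops pre-HMD years.
merged = le.merge(first_year, on=["country", "sex"], suffixes=("", "_min"))
merged = merged[merged["year"] >= merged["year_min"]].drop(columns=["year_min"])

# idxmax returns the row label of the maximum within each (year, sex) group.
record = merged.loc[merged.groupby(["year", "sex"])["life_expectancy"].idxmax()]
record = record.reset_index(drop=True)
print(record)
```

In this toy data Norway's 1900 value is higher than Sweden's, but it is excluded because Norway's first HMD year is 1901, so Sweden holds the 1900 record. Japan never appears because it has no row in `first_year`.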