Association, Bias & Causation.md

| created | modified | tags | type | status |
| --- | --- | --- | --- | --- |
| 2024-06-20 15:50 | 2024-06-21 21:08 | #causality #statistics #probability #bias | note | completed |

The observed difference in mean outcome between the treated and untreated groups can be elegantly partitioned into the true causal effect (the average treatment effect on the treated) and selection bias:

$$
\begin{array}{lcl}
\underbrace{E\Big[Y \,\Big|\, T=1\Big] - E\Big[Y \,\Big|\, T=0\Big]}_{\substack{\text{Difference between} \\ \text{treatment group means}}}
&=&
\underbrace{E\Big[Y(1)-Y(0) \,\Big|\, T=1\Big]}_{\substack{\text{Average Treatment Effect} \\ \text{on the Treated (ATT)}}}
+
\underbrace{\Bigg(E\Big[Y(0) \,\Big|\, T=1\Big] - E\Big[Y(0) \,\Big|\, T=0\Big]\Bigg)}_{\text{Selection Bias}} \\
& & \\
Y_i &=& \text{outcome of interest (on individual } i) \\
T_i &=& \begin{cases} 1 \quad \text{if individual } i \text{ received treatment} \\ 0 \quad \text{if individual } i \text{ did not receive treatment} \end{cases} \\
Y_i(1) &=& \text{outcome which would have been observed for individual } i \text{ if they had received the treatment} \\
Y_i(0) &=& \text{outcome which would have been observed for individual } i \text{ if they had NOT received the treatment}
\end{array}
$$
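The identity can be sketched in three steps. It assumes only that the observed outcome equals the matching potential outcome ($Y = Y(1)$ when $T=1$, $Y = Y(0)$ when $T=0$); the rest is adding and subtracting the unobservable counterfactual term $E[Y(0) \mid T=1]$:

$$
\begin{aligned}
E[Y \mid T=1] - E[Y \mid T=0]
&= E[Y(1) \mid T=1] - E[Y(0) \mid T=0] \\
&= E[Y(1) \mid T=1] - E[Y(0) \mid T=1] + E[Y(0) \mid T=1] - E[Y(0) \mid T=0] \\
&= \underbrace{E[Y(1) - Y(0) \mid T=1]}_{\text{ATT}} + \underbrace{E[Y(0) \mid T=1] - E[Y(0) \mid T=0]}_{\text{Selection Bias}}
\end{aligned}
$$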

Here is a simulation in Python that confirms the decomposition numerically:

```python
import random
import statistics

N_INDIVIDUALS: int = 100_000
random.seed(69)

# Y(0): each individual's probability of dying if left untreated
untreated_prob_of_dying: list[float] = [
    random.uniform(0, 1) for _ in range(N_INDIVIDUALS)
]
# Y(1): each individual's probability of dying if treated
treated_prob_of_dying: list[float] = [
    # treatment halves the probability of death
    0.5 * p
    for p in untreated_prob_of_dying
]
# T: treatment assignment
assigned_treatment_group: list[str] = [
    # biased by probability of dying: sicker individuals are more likely to be treated
    random.choices(["treated", "untreated"], weights=(p, 1 - p))[0]
    for p in untreated_prob_of_dying
]
# Y: the outcome actually observed, given the assigned group
prob_of_dying: list[float] = [
    (
        untreated_prob_of_dying[idx]
        if treat_grp == "untreated"
        else treated_prob_of_dying[idx]
    )
    for idx, treat_grp in enumerate(assigned_treatment_group)
]
# E[Y|T=1]
mean_prob_of_dying_treated_group: float = statistics.mean(
    [
        prob_of_dying[idx]
        for idx, treat_grp in enumerate(assigned_treatment_group)
        if treat_grp == "treated"
    ]
)
# E[Y|T=0]
mean_prob_of_dying_untreated_group: float = statistics.mean(
    [
        prob_of_dying[idx]
        for idx, treat_grp in enumerate(assigned_treatment_group)
        if treat_grp == "untreated"
    ]
)
# ATT: E[Y(1)-Y(0)|T=1]
att: float = statistics.mean(
    [
        (treated_prob_of_dying[idx] - untreated_prob_of_dying[idx])
        for idx, treat_grp in enumerate(assigned_treatment_group)
        if treat_grp == "treated"
    ]
)
# Selection bias: E[Y(0)|T=1] - E[Y(0)|T=0]
selection_bias: float = statistics.mean(
    [
        untreated_prob_of_dying[idx]
        for idx, treat_grp in enumerate(assigned_treatment_group)
        if treat_grp == "treated"
    ]
) - statistics.mean(
    [
        untreated_prob_of_dying[idx]
        for idx, treat_grp in enumerate(assigned_treatment_group)
        if treat_grp == "untreated"
    ]
)

print(
    f"""
                      E[Y|T=1] - E[Y|T=0] = {(mean_prob_of_dying_treated_group - mean_prob_of_dying_untreated_group):.5f}
                     ATT + selection_bias = {(att + selection_bias):.5f}

                    ATT: E[Y(1)-Y(0)|T=1] = {att:.5f}
Selection Bias: E[Y(0)|T=1] - E[Y(0)|T=0] = {selection_bias:.5f}
"""
)
```

```
                      E[Y|T=1] - E[Y|T=0] = 0.00176
                     ATT + selection_bias = 0.00176

                    ATT: E[Y(1)-Y(0)|T=1] = -0.33341
Selection Bias: E[Y(0)|T=1] - E[Y(0)|T=0] = 0.33516
```
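As a sanity check on the simulated numbers, the population values can be worked out exactly under the simulation's own setup: $Y(0) = p$ with $p \sim \text{Uniform}(0, 1)$, $Y(1) = p/2$, and $P(T=1 \mid p) = p$. The sketch below is an illustration rather than part of the original simulation, and the helper `uniform_moment` is a hypothetical name introduced here. It suggests the ATT is $-1/3$ and the selection bias is $+1/3$, so the two cancel and the naive comparison of group means shows roughly no effect even though the treatment halves everyone's risk.

```python
from fractions import Fraction


def uniform_moment(k: int) -> Fraction:
    """Return E[p**k] for p ~ Uniform(0, 1), which equals 1 / (k + 1)."""
    return Fraction(1, k + 1)


# Assumed setup, mirroring the simulation:
#   Y(0) = p, Y(1) = p / 2, P(T=1 | p) = p, with p ~ Uniform(0, 1).
e_p = uniform_moment(1)   # E[p]   = 1/2
e_p2 = uniform_moment(2)  # E[p^2] = 1/3

# E[Y(0) | T=1] = E[p * P(T=1 | p)] / P(T=1) = E[p^2] / E[p]
y0_given_treated = e_p2 / e_p
# E[Y(0) | T=0] = E[p * (1 - p)] / E[1 - p]
y0_given_untreated = (e_p - e_p2) / (1 - e_p)
# E[Y(1) | T=1]: treatment halves the untreated probability
y1_given_treated = Fraction(1, 2) * y0_given_treated

att = y1_given_treated - y0_given_treated            # E[Y(1) - Y(0) | T=1]
selection_bias = y0_given_treated - y0_given_untreated
naive_difference = y1_given_treated - y0_given_untreated  # E[Y|T=1] - E[Y|T=0]

print(f"ATT              = {att}")
print(f"Selection bias   = {selection_bias}")
print(f"Naive difference = {naive_difference}")
```

The simulated values (-0.33341, 0.33516 and 0.00176) differ from these exact fractions only by sampling noise.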

References

  • [[Causal Inference for The Brave and True]]

Related

  • [[Causal Inference]]