Add confidence interval causal curves #231
Open
+163
−12
Status
READY
Todo list
Background context
We decided to include a way to calculate the errors of the Cumulative Effect and Cumulative Gain Curves following the example presented in Causal Inference for the Brave and the True.
Description of the changes proposed in the pull request
We add a new type called `error_fn`, intended as a general class of statistical error functions, and implement one function of this kind: `linear_standard_error`. We use it to build a curve function, `cumulative_statistical_error_curve`, analogous to `cumulative_gain_curve` and `cumulative_effect_curve`, that calculates the error (given by the `error_fn`) between a treatment and an outcome over incrementally larger pieces of an ordered dataframe. Finally, we modify the `effect_curves` function to add an optional parameter for calculating the error of the cumulative gain curve and the cumulative effect curve. These error columns are intended to be used to generate confidence intervals for these curves.
Where should the reviewer start?
We suggest starting with the `causal/validation/curves.py` file, then checking the `causal/statistical_errors.py` file.
Remaining problems or questions
We only implemented an error function for linear relationships, but we believe the approach is general enough to be extended to other kinds of relationships.
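For reviewers who want a concrete picture of the idea, here is a minimal sketch of how a linear error function and a cumulative error curve could fit together. It is not the code in this pull request: the function names are taken from the description, but the `ErrorFnType` alias, the signatures, the defaults (`min_rows`, `steps`) and the descending sort are assumptions made for illustration. The error used here is the standard error of the slope of a simple linear regression of the outcome on the treatment.

```python
from typing import Callable

import numpy as np
import pandas as pd

# Hypothetical alias: any callable (df, treatment, outcome) -> float can act as an error_fn.
ErrorFnType = Callable[[pd.DataFrame, str, str], float]


def linear_standard_error(df: pd.DataFrame, treatment: str, outcome: str) -> float:
    """Standard error of the slope of a simple linear regression of outcome on treatment."""
    n = df.shape[0]
    t = df[treatment].to_numpy(dtype=float)
    y = df[outcome].to_numpy(dtype=float)

    t_bar = t.mean()
    sxx = np.sum((t - t_bar) ** 2)

    # Ordinary least squares fit: y ~ intercept + slope * t
    slope = np.sum((t - t_bar) * (y - y.mean())) / sxx
    intercept = y.mean() - slope * t_bar
    residuals = y - (intercept + slope * t)

    # Residual variance uses n - 2 degrees of freedom (two fitted parameters).
    return float(np.sqrt(np.sum(residuals ** 2) / (n - 2) / sxx))


def cumulative_statistical_error_curve(
    df: pd.DataFrame,
    treatment: str,
    outcome: str,
    prediction: str,
    min_rows: int = 30,   # assumed default
    steps: int = 100,     # assumed default
    error_fn: ErrorFnType = linear_standard_error,
) -> np.ndarray:
    """Apply error_fn to growing top-k slices of df ordered by prediction (descending)."""
    size = df.shape[0]
    ordered = df.sort_values(prediction, ascending=False).reset_index(drop=True)

    # Slice sizes: min_rows, min_rows + step, ..., always ending with the full dataframe.
    step = max(1, size // steps)
    n_rows = list(range(min_rows, size, step)) + [size]

    return np.array([error_fn(ordered.head(rows), treatment, outcome) for rows in n_rows])
```

Under a normal approximation, an approximate 95% confidence band for a curve would then be the curve value plus or minus 1.96 times the corresponding error. Supporting other kinds of relationships would only require passing a different callable as `error_fn`.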