Is your feature request related to a problem? Please describe.
A variety of topics came up regarding propensity.py in the last CausalML call:

- compute_propensity_score could either:
  - enforce that the p_model arg is an instance of the PropensityModel class
    - if so, then alter the PropensityModel class predict and predict_proba functions so that predict calls predict_proba when the underlying model's predict produces classifications rather than scores (see the sketch right after this list)
  - accept any model passed (for instance, a Naive Bayes classifier)
    - if so, then call predict_proba or predict based on the availability of predict_proba from within compute_propensity_score for the passed model, so as to always produce propensity scores and not classifications
- when calling compute_propensity_score with calibrate=True, the model that is returned is not the model that produces the returned propensity scores (those are computed by the calibration model). This could lead to confusion if someone attempted to reproduce the scores using the returned model.
  - Would it make sense here to return None for the model? Or to emit some other warning when calibrate=True?
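A minimal sketch of the first option, assuming the change is expressed as a thin wrapper around an arbitrary estimator; the class name and structure here are hypothetical and not the current PropensityModel API:

class PropensityWrapper:
    """Hypothetical wrapper: predict() falls back to predict_proba() scores.

    If the wrapped estimator exposes predict_proba, predict() returns the
    probability of treatment (a score in [0, 1]); otherwise it falls back to
    the estimator's own predict(), which may return hard classifications.
    """

    def __init__(self, model):
        self.model = model

    def fit(self, X, y):
        self.model.fit(X, y)
        return self

    def predict(self, X):
        if hasattr(self.model, "predict_proba"):
            return self.model.predict_proba(X)[:, 1]
        return self.model.predict(X)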
Describe the solution you'd like
Not sure about the tradeoffs between accepting any model in compute_propensity_score vs requiring that the model be an instance of PropensityModel, but I'd be inclined to accept any model (e.g., a Naive Bayes classifier, as was used in the calibration example) and then test for the existence of predict_proba to produce scores, falling back to predict if predict_proba doesn't exist. Something like:
import numpy as np

from causalml.propensity import ElasticNetPropensityModel

# Note: calibrate_iso is assumed to be defined elsewhere as an
# isotonic-regression calibration helper.


def compute_propensity_score(
    X, treatment, p_model=None, X_pred=None, treatment_pred=None, calibrate_p="iso"
):
    """Generate propensity scores if the user didn't provide them.

    Args:
        X (np.matrix): features for training
        treatment (np.array or pd.Series): a treatment vector for training
        p_model (propensity model object, optional):
            ElasticNetPropensityModel (default) / GradientBoostedPropensityModel
        X_pred (np.matrix, optional): features for prediction
        treatment_pred (np.array or pd.Series, optional): a treatment vector for prediction
        calibrate_p (bool or str, optional): whether to calibrate the propensity score
            (isotonic calibration when truthy)

    Returns:
        (tuple)
            - p (numpy.ndarray): propensity score
            - p_model (PropensityModel): a trained PropensityModel object,
              or None if the scores were calibrated
    """
    print("using local compute_propensity_score")

    if treatment_pred is None:
        treatment_pred = treatment.copy()
    if p_model is None:
        p_model = ElasticNetPropensityModel()

    p_model.fit(X, treatment)

    if X_pred is None:
        try:
            p = p_model.predict_proba(X)[:, 1]
        except AttributeError:
            print("predict_proba not available, using predict instead")
            p = p_model.predict(X)
    else:
        try:
            p = p_model.predict_proba(X_pred)[:, 1]
        except AttributeError:
            print("predict_proba not available, using predict instead")
            p = p_model.predict(X_pred)

    if calibrate_p:
        print("Isotonic calibrating propensity scores only. Returning model=None.")
        p = calibrate_iso(p, treatment_pred)
        p_model = None

    # force the p values within the open interval (eps, 1 - eps)
    eps = np.finfo(float).eps
    p = np.where(p < 0 + eps, 0 + eps * 1.001, p)
    p = np.where(p > 1 - eps, 1 - eps * 1.001, p)

    return p, p_model
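For example, with the function sketched above (and assuming calibrate_iso is available), a Naive Bayes classifier could be passed directly; GaussianNB and the synthetic data below are for illustration only:

import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.random.normal(size=(1000, 5))
treatment = np.random.binomial(1, 0.5, size=1000)

# Any estimator with predict_proba works; scores come from predict_proba.
p, model = compute_propensity_score(
    X, treatment, p_model=GaussianNB(), calibrate_p=False
)

# With calibration on, the returned model is None, since the calibrated
# scores can no longer be reproduced from the fitted classifier alone.
p_cal, model_cal = compute_propensity_score(X, treatment, p_model=GaussianNB())
assert model_cal is None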
Refs:
causalml/causalml/propensity.py, Line 14 in 22b13c2
causalml/causalml/propensity.py, Line 201 in 22b13c2