-
Notifications
You must be signed in to change notification settings - Fork 834
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Synthetic difference in differences #2095
Conversation
Signed-off-by: Jason Wang <[email protected]>
Signed-off-by: Jason Wang <[email protected]>
Hey @memoryz 👋! We use semantic commit messages to streamline the release process. Examples of commit messages with semantic prefixes:
To test your commit locally, please follow our guild on building from source. |
core/src/main/scala/com/microsoft/azure/synapse/ml/codegen/Wrappable.scala
Show resolved
Hide resolved
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
Signed-off-by: Jason Wang <[email protected]>
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
} | ||
|
||
override def copy(extra: ParamMap): DiffInDiffModel = { | ||
copyValues(new DiffInDiffModel(uid)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you dont seem to handle extra, you can use defaultcopy i think
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I'm now passing extra
to copyValues
. The effect is same as defaultCopy
.
|
||
trait HasUnitCol extends Params { | ||
final val unitCol = new Param[String](this, "unitCol", | ||
"Column that identifies the units in panel data") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you expand on this doc
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
lossHistoryUnitWeights = Some(lossHistoryUnitWeights.toList) | ||
) | ||
|
||
copyValues(new DiffInDiffModel(this.uid)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why do you wrap in a copy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to copy the shared params. The DiffInDiffModel
and SyntheticDiffInDiffEstimator
share these two params: unitCol
and timeCol
. Wrapping in a copyValues will make sure the values of these two params are copied over to the DiffInDiffModel
instance.
// Spark mode and breeze mode should get same loss history and same solution | ||
// .setLocalSolverThreshold(1) | ||
|
||
test("SyntheticDiffInDiffEstimator can estimate the treatment effect") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we test both spark and breeze fitting branches?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I implemented the unit test for the linalg routine for both breeze mode and Spark mode, and since the MirrorDescent code is abstracted on top of linalg routines, I didn't implement the unit tests for the Spark fitting branch specifically. But I've just added a unit test for that with latest commit.
docs/Explore Algorithms/Causal Inference/Quickstart - Synthetic difference in differences.ipynb
Outdated
Show resolved
Hide resolved
"name": "python" | ||
}, | ||
"save_output": true, | ||
"synapse_widget": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you remove the output so it doesent blow up size of NB?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments, thank you so much for this lovely contribution :)
…c difference in differences.ipynb Co-authored-by: Mark Hamilton <[email protected]>
/azp run |
@mhamilton723 I've addressed all the comments so far. Ready for review. |
/azp run |
/azp run |
2 similar comments
/azp run |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
@mhamilton723 can you merge this PR? The failing unit tests have nothing to do with this change. |
/azp run |
Azure Pipelines successfully started running 1 pipeline(s). |
What changes are proposed in this pull request?
This PR contains Spark implementation for 3 causal estimation methods: difference in differences, synthetic control and synthetic difference in differences.
Additional contributors to this PR:
@sarahshy @andrewnaber-msft
How is this patch tested?
Does this PR change any dependencies?
Does this PR add a new feature? If so, have you added samples on website?
website/docs/documentation
folder.Make sure you choose the correct class
estimators/transformers
and namespace.DocTable
points to correct API link.yarn run start
to make sure the website renders correctly.<!--pytest-codeblocks:cont-->
before each python code blocks to enable auto-tests for python samples.WebsiteSamplesTests
job pass in the pipeline.