generated from carpentries/workbench-template-md
Commit ff00c83 (1 parent: f3d84ea)
Showing 12 changed files with 176 additions and 121 deletions.
@@ -55,12 +55,12 @@ contact: '[email protected]'

 # Order of episodes in your lesson
 episodes:
-- introduction.md
-- problem-definition.md
-- scientific-validity.md
-- fairness.md
-- explainability.md
-- releasing-a-model.md
+- 0-introduction.md
+- 1-preparing-to-train.md
+- 2-model-fitting.md
+- 3-model-eval.md
+- 4-explainability.md
+- 5-releasing-a-model.md

 # Information for Learners
 learners:
@@ -0,0 +1,31 @@
---
title: "Preparing to train a model"
teaching: 0
exercises: 0
---

:::::::::::::::::::::::::::::::::::::: questions

- For what prediction tasks is machine learning an appropriate tool?
- How can inappropriate target variable choice lead to suboptimal outcomes in a machine learning pipeline?
- What is "biased" training data, and where does this bias come from?

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- Judge which tasks are appropriate for machine learning.
- Understand why the choice of prediction task / target variable is important.
- Describe how bias can appear in training data.

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: keypoints

- Some tasks are not appropriate for machine learning due to ethical concerns.
- Machine learning tasks should have a valid prediction target that maps clearly to the real-world goal.
- Training data can be biased due to societal inequities, errors in the data collection process, and lack of attention to careful sampling practices.

::::::::::::::::::::::::::::::::::::::::::::::::
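The keypoint about biased training data can be probed with a quick look at group representation and per-group label rates in the training sample. This is a minimal sketch, assuming a pandas DataFrame with hypothetical `group` and `label` columns rather than the lesson's own dataset:

```python
import pandas as pd

# Hypothetical training data: the `group` and `label` columns are illustrative,
# not part of the lesson's dataset.
train = pd.DataFrame({
    "group": ["A", "A", "A", "A", "A", "A", "B", "B"],
    "label": [1, 0, 1, 1, 0, 1, 0, 0],
})

# How well is each group represented in the training sample?
print(train["group"].value_counts(normalize=True))

# Does the positive-label rate differ sharply between groups?
# Large gaps can reflect societal inequities or skewed data collection.
print(train.groupby("group")["label"].mean())
```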
@@ -0,0 +1,28 @@
---
title: "Scientific Validity in the Modeling Process"
teaching: 0
exercises: 0
---

:::::::::::::::::::::::::::::::::::::: questions

- What impact do overfitting and underfitting have on model performance?
- What is data leakage?

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- Implement at least two types of machine learning models in Python.
- Describe the risks of, identify, and understand mitigation steps for overfitting and underfitting.
- Understand why data leakage is harmful to scientific validity and how it can appear in machine learning pipelines.

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: keypoints

- Overfitting is characterized by worse performance on the test set than on the training set and can be fixed by switching to a simpler model architecture or by adding regularization.
- Underfitting is characterized by poor performance on both the training and test datasets. It can be fixed by collecting more training data, switching to a more complex model architecture, or improving feature quality.
- Data leakage occurs when the model has access to the test data during training and results in overconfidence in the model's performance.

::::::::::::::::::::::::::::::::::::::::::::::::
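The overfitting and data-leakage keypoints lend themselves to a short scikit-learn sketch; the synthetic dataset and unconstrained decision tree below are illustrative assumptions, not the lesson's worked example:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit the scaler on the training split only; fitting it on all the data would
# leak information about the test set into preprocessing.
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# An unconstrained tree can memorize the training data.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
print("train accuracy:", model.score(X_train, y_train))
print("test accuracy: ", model.score(X_test, y_test))
# A large train/test gap is the overfitting signature; a simpler model
# (e.g. max_depth=3) or regularization typically narrows it.
```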
@@ -0,0 +1,53 @@
---
title: "Fairness"
teaching: 0
exercises: 0
---

:::::::::::::::::::::::::::::::::::::: questions

- How do we define fairness and bias in machine learning outcomes?
- How can we improve the fairness of machine learning models?

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- Reason about model performance through standard evaluation metrics.
- Understand and distinguish between various notions of fairness in machine learning.
- Describe and implement two different ways of modifying the machine learning modeling process to improve the fairness of a model.

::::::::::::::::::::::::::::::::::::::::::::::::

:::::::::::::::::::::::::::::::::::::: challenge

### Matching fairness terminology with definitions

Match the following types of formal fairness with their definitions:
(A) Individual fairness,
(B) Equalized odds,
(C) Demographic parity, and
(D) Group-level calibration

1. The model is equally accurate across all demographic groups.
2. Different demographic groups have the same true positive rates and false positive rates.
3. Similar people are treated similarly.
4. People from different demographic groups receive each outcome at the same rate.

::::::::::::::::::::::::::::::::::::::::::::::::::

:::::::::::::: solution

### Solution

A - 3, B - 2, C - 4, D - 1

:::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: keypoints

- It's important to consider many dimensions of model performance: a single accuracy score is not sufficient.
- There is no single definition of "fair machine learning": different notions of fairness are appropriate in different contexts.
- It is usually not possible to satisfy all possible notions of fairness simultaneously.
- The fairness of a model can be improved by using techniques like data reweighting and model postprocessing.

::::::::::::::::::::::::::::::::::::::::::::::::
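Two of the fairness notions in the challenge can be checked directly on model predictions. A minimal sketch, assuming a hypothetical table of per-group labels and predictions (the values are illustrative only):

```python
import pandas as pd

# Hypothetical predictions for two demographic groups.
df = pd.DataFrame({
    "group":  ["A", "A", "A", "A", "B", "B", "B", "B"],
    "y_true": [1, 0, 1, 0, 1, 0, 1, 0],
    "y_pred": [1, 0, 1, 1, 0, 0, 1, 0],
})

# Demographic parity: compare the rate of positive predictions across groups.
print(df.groupby("group")["y_pred"].mean())

# Equalized odds: compare true positive and false positive rates across groups.
for name, g in df.groupby("group"):
    tpr = g.loc[g.y_true == 1, "y_pred"].mean()
    fpr = g.loc[g.y_true == 0, "y_pred"].mean()
    print(f"group {name}: TPR={tpr:.2f}, FPR={fpr:.2f}")
```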
@@ -0,0 +1,27 @@
---
title: "Explainability"
teaching: 0
exercises: 0
---

:::::::::::::::::::::::::::::::::::::: questions

- What is model interpretability? When do we need models to be interpretable?
- What are some model interpretability techniques?
- When are model explainability techniques not sufficient for understanding model behavior?

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- Compare and contrast different interpretability techniques.
- Explain feature importance.
- Articulate limitations of explainable machine learning.

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: keypoints

- TODO

::::::::::::::::::::::::::::::::::::::::::::::::
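One interpretability technique the objectives point to is feature importance. A minimal sketch using permutation importance on synthetic data (both are illustrative assumptions, not the lesson's chosen approach):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=300, n_features=5, n_informative=2,
                           random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(random_state=0).fit(X_train, y_train)

# Permutation importance: how much does test performance drop when a feature's
# values are shuffled? Larger drops suggest the model relies more on that feature.
result = permutation_importance(model, X_test, y_test, n_repeats=10,
                                random_state=0)
for i, score in enumerate(result.importances_mean):
    print(f"feature {i}: {score:.3f}")
```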
@@ -0,0 +1,30 @@
---
title: "Releasing a model"
teaching: 0
exercises: 0
---

:::::::::::::::::::::::::::::::::::::: questions

- What is distribution shift? How can we know if distribution shift has occurred?
- What is a model card?
- How do I share a model so that others may use it?

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: objectives

- Understand distribution shift and its implications.
- Apply model-sharing best practices by using model cards.
- Understand the technical and communication norms around sharing models.

::::::::::::::::::::::::::::::::::::::::::::::::

::::::::::::::::::::::::::::::::::::: keypoints

- Distribution shift is common. It can be caused by temporal shifts (i.e., using old training data) or by applying a model to new populations.
- Distribution shift can be addressed by TODO
- Model cards are the standard technique for communicating information about how machine learning systems were trained and how they should and should not be used.
- Models can be shared and reused by doing TODO

::::::::::::::::::::::::::::::::::::::::::::::::
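The distribution-shift keypoint can be checked in practice by comparing feature distributions between the training data and the data a released model actually receives. A minimal sketch, assuming a single numeric feature and using a two-sample Kolmogorov-Smirnov test as one illustrative choice, not the lesson's prescribed method:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical values of one feature at training time vs. after release.
train_feature = rng.normal(loc=0.0, scale=1.0, size=1000)
live_feature = rng.normal(loc=0.5, scale=1.0, size=1000)  # the population drifted

# The two-sample KS test compares the two distributions; a small p-value is
# evidence that the feature's distribution has shifted since training.
stat, p_value = ks_2samp(train_feature, live_feature)
print(f"KS statistic={stat:.3f}, p-value={p_value:.3g}")
```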
5 files were deleted.