diff --git a/AC2/python-debiased-ml-for-partially-linear-iv-model.ipynb b/AC2/python-debiased-ml-for-partially-linear-iv-model.ipynb index 9dd4dce8..b882ddb6 100644 --- a/AC2/python-debiased-ml-for-partially-linear-iv-model.ipynb +++ b/AC2/python-debiased-ml-for-partially-linear-iv-model.ipynb @@ -1,625 +1,637 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "papermill": { - "duration": 0.008106, - "end_time": "2021-04-23T10:41:33.911944", - "exception": false, - "start_time": "2021-04-23T10:41:33.903838", - "status": "completed" - }, - "tags": [], - "id": "qjaFmAyEzcLz" - }, - "source": [ - "# Double/Debiased ML for Partially Linear IV Model\n", - "\n", - "This is a simple implementation of Debiased Machine Learning for the Partially Linear\n", - "IV Regression Model, which provides an application of DML IV inference.\n", - "\n", - "\n", - "Reference:\n", - "\n", - "- https://arxiv.org/abs/1608.00060\n", - "- https://www.amazon.com/Business-Data-Science-Combining-Accelerate/dp/1260452778\n", - "\n", - "The code is based on the book.\n" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "papermill": { - "duration": 0.006963, - "end_time": "2021-04-23T10:41:33.926085", - "exception": false, - "start_time": "2021-04-23T10:41:33.919122", - "status": "completed" - }, - "tags": [], - "id": "G3YNyr-ezcL4" - }, - "source": [ - "\n", - "# Partially Linear IV Model\n", - "\n", - "We consider the partially linear structural equation model:\n", - "\\begin{eqnarray}\n", - " & Y - D\\theta_0 = g_0(X) + \\zeta, & E[\\zeta \\mid Z,X]= 0,\\\\\n", - " & Z = m_0(X) + V, & E[V \\mid X] = 0.\n", - "\\end{eqnarray}\n", - "\n", - "\n", - "Note that this model is not a regression model unless $Z=D$. The model is a canonical\n", - "model in causal inference, going back to P. Wright's work on IV methods for estimating demand/supply equations, with the modern difference being that $g_0$ and $m_0$ are nonlinear, potentially complicated functions of high-dimensional $X$. \n", - "\n", - "\n", - "The idea of this model is that there is a structural or causal relation between $Y$ and $D$, captured by $\\theta_0$, and $g_0(X) + \\zeta$ is the stochastic error, partly explained by covariates $X$. $V$ and $\\zeta$ are stochastic errors that are not explained by $X$. Since $Y$ and $D$ are jointly determined, we need an external factor, commonly referred to as an instrument, $Z$ to create exogenous variation\n", - "in $D$. Note that $Z$ should affect $D$. The $X$ here serve again as confounding factors, so we can think of variation in $Z$ as being exogenous only conditional on $X$.\n", - "\n", - "\n", - "The causal DAG this model corresponds to is given by:\n", - "$$\n", - "Z \\to D, X \\to (Y, Z, D), L \\to (Y,D),\n", - "$$\n", - "where $L$ is the latent confounder affecting both $Y$ and $D$, but not $Z$.\n", - "\n", - "\n", - "\n", - "---\n", - "\n", - "# Example\n", - "\n", - "A simple contextual example is from biostatistics, where $Y$ is a health outcome and $D$ is indicator of smoking. Thus, $\\theta_0$ is captures the effect of smoking on health. Health outcome $Y$ and smoking behavior $D$ are treated as being jointly determined. $X$ represents patient characteristics, and $Z$ could be a doctor's advice not to smoke (or another behavioral treatment) that may affect the outcome $Y$ only through shifting the behavior $D$, conditional on characteristics $X$. 
\n", - "\n", - "----\n", - "\n", - "\n", - "\n", - "# PLIVM in Residualized Form\n", - "\n", - "The PLIV model above can be rewritten in the following residualized form:\n", - "$$\n", - " \\tilde Y = \\tilde D \\theta_0 + \\zeta, \\quad E[\\zeta \\mid V,X]= 0,\n", - "$$\n", - "where\n", - "$$\n", - " \\tilde Y = (Y- \\ell_0(X)), \\quad \\ell_0(X) = E[Y \\mid X] \\\\\n", - " \\tilde D = (D - r_0(X)), \\quad r_0(X) = E[D \\mid X] \\\\\n", - " \\tilde Z = (Z- m_0(X)), \\quad m_0(X) = E[Z \\mid X].\n", - "$$\n", - " The \"tilde\" variables (e.g. $\\tilde Y$) above represent original variables after taking out or \"partialling out\"\n", - " the effect of $X$. Note that $\\theta_0$ is identified from this equation if $V$\n", - " and $U$ have non-zero correlation, which automatically means that $U$ and $V$\n", - " must have non-zero variation.\n", - "\n", - " \n", - "\n", - "-----\n", - "\n", - "# DML for PLIV Model\n", - "\n", - "Given identification, DML proceeds as follows\n", - "\n", - "Compute the estimates $\\hat \\ell_0$, $\\hat r_0$, and $\\hat m_0$ , which amounts\n", - "to solving the three problems of predicting $Y$, $D$, and $Z$ using\n", - "$X$, using any generic ML method, giving us estimated residuals\n", - "$$\n", - "\\tilde Y = Y - \\hat \\ell_0(X), \\\\ \\tilde D= D - \\hat r_0(X), \\\\ \\tilde Z = Z- \\hat m_0(X).\n", - "$$\n", - "The estimates should be of a cross-validated form, as detailed in the algorithm below.\n", - "\n", - "Estimate $\\theta_0$ by the the intstrumental\n", - "variable regression of $\\tilde Y$ on $\\tilde D$ using $\\tilde Z$ as an instrument.\n", - "Use the conventional inference for the IV regression estimator, ignoring\n", - "the estimation error in these residuals.\n", - "\n", - "The reason we work with this residualized form is that it eliminates the bias\n", - "arising when solving the prediction problem in stage 1. 
The role of cross-validation\n", - "is to avoid another source of bias due to potential overfitting.\n", - "\n", - "The estimator is adaptive,\n", - "in the sense that the first stage estimation errors do not affect the second\n", - "stage errors.\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "_kg_hide-output": true, - "execution": { - "iopub.execute_input": "2021-04-23T10:41:33.973149Z", - "iopub.status.busy": "2021-04-23T10:41:33.971090Z", - "iopub.status.idle": "2021-04-23T10:42:08.602961Z", - "shell.execute_reply": "2021-04-23T10:42:08.601638Z" - }, - "papermill": { - "duration": 34.670095, - "end_time": "2021-04-23T10:42:08.603197", - "exception": false, - "start_time": "2021-04-23T10:41:33.933102", - "status": "completed" - }, - "tags": [], - "id": "yGW6JhG5zcL5" - }, - "outputs": [], - "source": [ - "# Import relevant packages\n", - "import pandas as pd\n", - "import numpy as np\n", - "import matplotlib.pyplot as plt\n", - "from sklearn.model_selection import train_test_split\n", - "from sklearn.model_selection import cross_val_score, cross_val_predict, KFold\n", - "from sklearn.pipeline import make_pipeline\n", - "from sklearn.linear_model import LassoCV, RidgeCV, ElasticNetCV, LinearRegression, Ridge, Lasso, LogisticRegressionCV\n", - "from sklearn.preprocessing import StandardScaler\n", - "from sklearn.ensemble import GradientBoostingRegressor, GradientBoostingClassifier\n", - "from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier\n", - "from sklearn.tree import DecisionTreeRegressor, DecisionTreeClassifier\n", - "import patsy\n", - "import warnings\n", - "from sklearn.base import BaseEstimator, clone\n", - "import statsmodels.api as sm\n", - "from statsmodels.tools import add_constant\n", - "from IPython.display import Markdown\n", - "import os\n", - "import seaborn as sns\n", - "warnings.simplefilter('ignore')\n", - "np.random.seed(1234)" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "execution": { - "iopub.execute_input": "2021-04-23T10:42:08.664371Z", - "iopub.status.busy": "2021-04-23T10:42:08.629661Z", - "iopub.status.idle": "2021-04-23T10:42:10.458175Z", - "shell.execute_reply": "2021-04-23T10:42:10.456976Z" - }, - "papermill": { - "duration": 1.846109, - "end_time": "2021-04-23T10:42:10.458406", - "exception": false, - "start_time": "2021-04-23T10:42:08.612297", - "status": "completed" - }, - "tags": [], - "id": "j2WVUbBDzcL-" - }, - "outputs": [], - "source": [ - "def dml(X, Z, D, y, modely, modeld, modelz, *, nfolds, classifier=False):\n", - " '''\n", - " DML for the Partially Linear Model setting with cross-fitting\n", - "\n", - " Input\n", - " -----\n", - " X: the controls\n", - " Z: the instrument\n", - " D: the treatment\n", - " y: the outcome\n", - " modely: the ML model for predicting the outcome y\n", - " modeld: the ML model for predicting the treatment D\n", - " modelz: the ML model for predicting the instrument Z\n", - " nfolds: the number of folds in cross-fitting\n", - " classifier: bool, whether the modeld is a classifier or a regressor\n", - "\n", - " Output\n", - " ------\n", - " point: the point estimate of the treatment effect of D on y\n", - " stderr: the standard error of the treatment effect\n", - " yhat: the cross-fitted predictions for the outcome y\n", - " Dhat: the cross-fitted predictions for the treatment D\n", - " Zhat: the cross-fitted predictions for the instrument Z\n", - " resy: the outcome residuals\n", - " resD: the treatment residuals\n", - " 
resZ: the instrument residuals\n", - " epsilon: the final residual-on-residual OLS regression residual\n", - " '''\n", - " cv = KFold(n_splits=nfolds, shuffle=True, random_state=123) # shuffled k-folds\n", - " yhat = cross_val_predict(modely, X, y, cv=cv, n_jobs=-1) # out-of-fold predictions for y\n", - " # out-of-fold predictions for D\n", - " # use predict or predict_proba dependent on classifier or regressor for D\n", - " if classifier:\n", - " Dhat = cross_val_predict(modeld, X, D, cv=cv, method='predict_proba', n_jobs=-1)[:, 1]\n", - " Zhat = cross_val_predict(modelz, X, Z, cv=cv, method='predict_proba', n_jobs=-1)[:, 1]\n", - " else:\n", - " Dhat = cross_val_predict(modeld, X, D, cv=cv, n_jobs=-1)\n", - " Zhat = cross_val_predict(modelz, X, Z, cv=cv, n_jobs=-1)\n", - " # calculate outcome and treatment residuals\n", - " resy = y - yhat\n", - " resD = D - Dhat\n", - " resZ = Z - Zhat\n", - " # rename for downstream tasks\n", - " resy = resy.rename(\"resy\")\n", - " resD = resD.rename(\"resD\")\n", - " resZ = resZ.rename(\"resZ\")\n", - "\n", - " # final stage ols based point estimate and standard error\n", - " point = np.mean(resy * resZ) / np.mean(resD*resZ)\n", - " epsilon = resy - point * resD\n", - " var = np.mean(epsilon**2 * resZ**2) / np.mean(resD*resZ)**2\n", - " stderr = np.sqrt(var / X.shape[0])\n", - " return point, stderr, yhat, Dhat, Zhat, resy, resD, resZ, epsilon" - ] - }, - { - "cell_type": "code", - "source": [ - "def summary(point, stderr, yhat, Dhat, Zhat, resy, resD, resZ, epsilon, X, D, Z, y, *, name):\n", - " '''\n", - " Convenience summary function that takes the results of the DML IV function\n", - " and summarizes several estimation quantities and performance metrics.\n", - " '''\n", - " return pd.DataFrame({'estimate': point, # point estimate\n", - " 'stderr': stderr, # standard error\n", - " 'lower': point - 1.96*stderr, # lower end of 95% confidence interval\n", - " 'upper': point + 1.96*stderr, # upper end of 95% confidence interval\n", - " 'rmse y': np.sqrt(np.mean(resy**2)), # RMSE of model that predicts outcome y\n", - " 'rmse Z': np.sqrt(np.mean(resZ**2)), # RMSE of model that predicts instrument Z\n", - " 'rmse D': np.sqrt(np.mean(resD**2)) # RMSE of model that predicts treatment D\n", - " }, index=[name])" - ], - "metadata": { - "id": "F6U6fnzne-77" - }, - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "metadata": { - "papermill": { - "duration": 0.011698, - "end_time": "2021-04-23T10:42:10.482689", - "exception": false, - "start_time": "2021-04-23T10:42:10.470991", - "status": "completed" - }, - "tags": [], - "id": "x1g3XjsIzcL_" - }, - "source": [ - "-----\n", - "\n", - "\n", - "# Emprical Example: Acemoglu, Johnson, Robinson (AER).\n", - "\n", - "* Y is log GDP;\n", - "* D is a measure of Protection from Expropriation, a proxy for\n", - "quality of insitutions;\n", - "* Z is the log of Settler's mortality;\n", - "* W are geographical variables (latitude, latitude squared, continent dummies as well as interactions)\n", - "\n" - ] - }, - { - "cell_type": "code", - "source": [ - "file = \"https://raw.githubusercontent.com/CausalAIBook/MetricsMLNotebooks/main/data/AJR.csv\"\n", - "data = pd.read_csv(file)\n", - "data.shape" - ], - "metadata": { - "id": "0Pc6OCp24rji" - }, - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "source": [ - "Run the following command to install hdmpy for rigorous lasso:\n", - "\n", - "``` !pip install multiprocess ```\n", - "\n", - "\n", - "```!git clone 
https://github.com/maxhuppertz/hdmpy.git ```" - ], - "metadata": { - "id": "3rGw2hAqSx2Y" - } - }, - { - "cell_type": "code", - "source": [ - "!pip install multiprocess\n", - "!git clone https://github.com/maxhuppertz/hdmpy.git" - ], - "metadata": { - "id": "O5Rknnz2Rh9O" - }, - "execution_count": null, - "outputs": [] + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "qjaFmAyEzcLz", + "papermill": { + "duration": 0.008106, + "end_time": "2021-04-23T10:41:33.911944", + "exception": false, + "start_time": "2021-04-23T10:41:33.903838", + "status": "completed" }, - { - "cell_type": "code", - "source": [ - "import hdmpy\n", - "class RLasso(BaseEstimator):\n", - "\n", - " def __init__(self, *, post=True):\n", - " self.post = post\n", - "\n", - " def fit(self, X, y):\n", - " self.rlasso_ = hdmpy.rlasso(X, y, post=self.post)\n", - " return self\n", - "\n", - " def predict(self, X):\n", - " return np.array(X) @ np.array(self.rlasso_.est['beta']).flatten() + np.array(self.rlasso_.est['intercept'])" - ], - "metadata": { - "id": "Kb9N0c5uS47n" - }, - "execution_count": null, - "outputs": [] + "tags": [] + }, + "source": [ + "# Double/Debiased ML for Partially Linear IV Model\n", + "\n", + "This is a simple implementation of Debiased Machine Learning for the Partially Linear\n", + "IV Regression Model, which provides an application of DML IV inference.\n", + "\n", + "\n", + "Reference:\n", + "\n", + "- https://arxiv.org/abs/1608.00060\n", + "- https://www.amazon.com/Business-Data-Science-Combining-Accelerate/dp/1260452778\n", + "\n", + "The code is based on the book.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "G3YNyr-ezcL4", + "papermill": { + "duration": 0.006963, + "end_time": "2021-04-23T10:41:33.926085", + "exception": false, + "start_time": "2021-04-23T10:41:33.919122", + "status": "completed" }, - { - "cell_type": "code", - "source": [ - "y = data['GDP']\n", - "d = data['Exprop']\n", - "z = data['logMort']\n", - "\n", - "controls = data.drop(['GDP', 'Exprop', 'logMort'], axis=1)\n", - "x = patsy.dmatrix('-1 + (Latitude + Latitude2 + Africa + Asia + Namer + Samer)**2', data=controls, return_type='dataframe')" - ], - "metadata": { - "id": "nO6a6tJWS-FO" - }, - "execution_count": null, - "outputs": [] + "tags": [] + }, + "source": [ + "\n", + "# Partially Linear IV Model\n", + "\n", + "We consider the partially linear structural equation model:\n", + "\\begin{eqnarray}\n", + " & Y - D\\theta_0 = g_0(X) + \\zeta, & E[\\zeta \\mid Z,X]= 0,\\\\\n", + " & Z = m_0(X) + V, & E[V \\mid X] = 0.\n", + "\\end{eqnarray}\n", + "\n", + "\n", + "Note that this model is not a regression model unless $Z=D$. The model is a canonical\n", + "model in causal inference, going back to P. Wright's work on IV methods for estimating demand/supply equations, with the modern difference being that $g_0$ and $m_0$ are nonlinear, potentially complicated functions of high-dimensional $X$. \n", + "\n", + "\n", + "The idea of this model is that there is a structural or causal relation between $Y$ and $D$, captured by $\\theta_0$, and $g_0(X) + \\zeta$ is the stochastic error, partly explained by covariates $X$. $V$ and $\\zeta$ are stochastic errors that are not explained by $X$. Since $Y$ and $D$ are jointly determined, we need an external factor, commonly referred to as an instrument, $Z$ to create exogenous variation\n", + "in $D$. Note that $Z$ should affect $D$. 
The $X$ here serve again as confounding factors, so we can think of variation in $Z$ as being exogenous only conditional on $X$.\n",
+    "\n",
+    "\n",
+    "The causal DAG this model corresponds to is given by:\n",
+    "$$\n",
+    "Z \\to D, X \\to (Y, Z, D), L \\to (Y,D),\n",
+    "$$\n",
+    "where $L$ is the latent confounder affecting both $Y$ and $D$, but not $Z$.\n",
+    "\n",
+    "\n",
+    "\n",
+    "---\n",
+    "\n",
+    "# Example\n",
+    "\n",
+    "A simple contextual example is from biostatistics, where $Y$ is a health outcome and $D$ is an indicator of smoking. Thus, $\\theta_0$ captures the effect of smoking on health. Health outcome $Y$ and smoking behavior $D$ are treated as being jointly determined. $X$ represents patient characteristics, and $Z$ could be a doctor's advice not to smoke (or another behavioral treatment) that may affect the outcome $Y$ only through shifting the behavior $D$, conditional on characteristics $X$. \n",
+    "\n",
+    "----\n",
+    "\n",
+    "\n",
+    "\n",
+    "# PLIVM in Residualized Form\n",
+    "\n",
+    "The PLIV model above can be rewritten in the following residualized form:\n",
+    "$$\n",
+    "  \\tilde Y = \\tilde D \\theta_0 + \\zeta,  \\quad E[\\zeta \\mid V,X] = 0,\n",
+    "$$\n",
+    "where\n",
+    "$$\n",
+    " \\tilde Y = (Y - \\ell_0(X)),  \\quad \\ell_0(X) = E[Y \\mid X] \\\\\n",
+    " \\tilde D = (D - r_0(X)), \\quad r_0(X) = E[D \\mid X] \\\\\n",
+    " \\tilde Z = (Z - m_0(X)), \\quad m_0(X) = E[Z \\mid X].\n",
+    "$$\n",
+    " The \"tilde\" variables (e.g. $\\tilde Y$) above represent original variables after taking out or \"partialling out\"\n",
+    "  the effect of $X$. Note that $\\theta_0$ is identified from this equation if the instrument residual $V = \\tilde Z$\n",
+    "  and the treatment residual $\\tilde D$ have non-zero correlation, which automatically means that both\n",
+    "  must have non-zero variation.\n",
+    "\n",
+    "  \n",
+    "\n",
+    "-----\n",
+    "\n",
+    "# DML for PLIV Model\n",
+    "\n",
+    "Given identification, DML proceeds in two stages.\n",
+    "\n",
+    "First, compute the estimates $\\hat \\ell_0$, $\\hat r_0$, and $\\hat m_0$, which amounts\n",
+    "to solving the three problems of predicting $Y$, $D$, and $Z$ using\n",
+    "$X$, using any generic ML method, giving us estimated residuals\n",
+    "$$\n",
+    "\\tilde Y = Y - \\hat \\ell_0(X), \\\\ \\tilde D = D - \\hat r_0(X), \\\\ \\tilde Z = Z - \\hat m_0(X).\n",
+    "$$\n",
+    "The estimates should be of a cross-validated form, as detailed in the algorithm below.\n",
+    "\n",
+    "Second, estimate $\\theta_0$ by the instrumental\n",
+    "variable regression of $\\tilde Y$ on $\\tilde D$ using $\\tilde Z$ as an instrument.\n",
+    "Use the conventional inference for the IV regression estimator, ignoring\n",
+    "the estimation error in these residuals.\n",
+    "\n",
+    "The reason we work with this residualized form is that it eliminates the bias\n",
+    "arising when solving the prediction problem in stage 1. 
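\n",
+    "\n",
+    "Residualization also makes the IV moment condition insensitive, to first order, to errors in the first-stage estimates. For concreteness, the final stage reduces to a one-line computation on the residuals; a minimal sketch (assuming residual vectors `resy`, `resD`, `resZ`, as produced by the `dml` function defined below):\n",
+    "\n",
+    "```python\n",
+    "import numpy as np\n",
+    "# sample analogue of the IV moment: theta_hat = E_n[resy * resZ] / E_n[resD * resZ]\n",
+    "theta_hat = np.mean(resy * resZ) / np.mean(resD * resZ)\n",
+    "```\n",
+    "\n",
+    "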
The role of cross-validation\n", + "is to avoid another source of bias due to potential overfitting.\n", + "\n", + "The estimator is adaptive,\n", + "in the sense that the first stage estimation errors do not affect the second\n", + "stage errors.\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "_kg_hide-output": true, + "execution": { + "iopub.execute_input": "2021-04-23T10:41:33.973149Z", + "iopub.status.busy": "2021-04-23T10:41:33.971090Z", + "iopub.status.idle": "2021-04-23T10:42:08.602961Z", + "shell.execute_reply": "2021-04-23T10:42:08.601638Z" }, - { - "cell_type": "code", - "source": [ - "# DML with Post-Lasso:\n", - "modely = make_pipeline(StandardScaler(), RLasso(post=True))\n", - "modeld = make_pipeline(StandardScaler(), RLasso(post=True))\n", - "modelz = make_pipeline(StandardScaler(), RLasso(post=True))\n", - "\n", - "# Run DML model with nfolds folds of cross-fitting\n", - "res_rlasso = dml(x, z, d, y, modely, modeld, modelz, nfolds = 5)\n", - "table_rlasso = summary(*res_rlasso, x, d, z, y, name = 'RLasso')" - ], - "metadata": { - "id": "kpPe250uelE7" - }, - "execution_count": null, - "outputs": [] + "id": "yGW6JhG5zcL5", + "papermill": { + "duration": 34.670095, + "end_time": "2021-04-23T10:42:08.603197", + "exception": false, + "start_time": "2021-04-23T10:41:33.933102", + "status": "completed" }, - { - "cell_type": "code", - "source": [ - "# DML with Random Forests. RFs don't require scaling but we do it for consistency\n", - "modely = make_pipeline(StandardScaler(), RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=1234))\n", - "modeld = make_pipeline(StandardScaler(), RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=1234))\n", - "modelz = make_pipeline(StandardScaler(), RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=1234))\n", - "\n", - "# Run DML model with nfolds folds of cross-fitting (computationally intensive)\n", - "res_RF = dml(x, z, d, y, modely, modeld, modelz, nfolds = 5)\n", - "table_RF = summary(*res_RF, x, d, z, y, name = 'RF')" - ], - "metadata": { - "id": "pdKuGFpXglp8" - }, - "execution_count": null, - "outputs": [] + "tags": [] + }, + "outputs": [], + "source": [ + "# Import relevant packages\n", + "import pandas as pd\n", + "import numpy as np\n", + "import matplotlib.pyplot as plt\n", + "from sklearn.model_selection import cross_val_predict, KFold\n", + "from sklearn.pipeline import make_pipeline\n", + "from sklearn.preprocessing import StandardScaler\n", + "from sklearn.ensemble import RandomForestRegressor\n", + "import patsy\n", + "import warnings\n", + "from sklearn.base import BaseEstimator\n", + "import statsmodels.api as sm\n", + "from statsmodels.tools import add_constant\n", + "warnings.simplefilter('ignore')\n", + "np.random.seed(1234)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "execution": { + "iopub.execute_input": "2021-04-23T10:42:08.664371Z", + "iopub.status.busy": "2021-04-23T10:42:08.629661Z", + "iopub.status.idle": "2021-04-23T10:42:10.458175Z", + "shell.execute_reply": "2021-04-23T10:42:10.456976Z" }, - { - "cell_type": "code", - "source": [ - "results = pd.concat([table_rlasso, table_RF], axis=0)\n", - "results" - ], - "metadata": { - "id": "ijo8p0Moig_F" - }, - "execution_count": null, - "outputs": [] + "id": "j2WVUbBDzcL-", + "papermill": { + "duration": 1.846109, + "end_time": "2021-04-23T10:42:10.458406", + "exception": false, + "start_time": "2021-04-23T10:42:08.612297", + 
"status": "completed" }, - { - "cell_type": "markdown", - "metadata": { - "papermill": { - "duration": 0.013703, - "end_time": "2021-04-23T10:42:16.269251", - "exception": false, - "start_time": "2021-04-23T10:42:16.255548", - "status": "completed" - }, - "tags": [], - "id": "r7l6cFi2zcMA" - }, - "source": [ - "# Weak Instruments?" - ] + "tags": [] + }, + "outputs": [], + "source": [ + "def dml(X, Z, D, y, modely, modeld, modelz, *, nfolds, classifier=False):\n", + " '''\n", + " DML for the Partially Linear Model setting with cross-fitting\n", + "\n", + " Input\n", + " -----\n", + " X: the controls\n", + " Z: the instrument\n", + " D: the treatment\n", + " y: the outcome\n", + " modely: the ML model for predicting the outcome y\n", + " modeld: the ML model for predicting the treatment D\n", + " modelz: the ML model for predicting the instrument Z\n", + " nfolds: the number of folds in cross-fitting\n", + " classifier: bool, whether the modeld is a classifier or a regressor\n", + "\n", + " Output\n", + " ------\n", + " point: the point estimate of the treatment effect of D on y\n", + " stderr: the standard error of the treatment effect\n", + " yhat: the cross-fitted predictions for the outcome y\n", + " Dhat: the cross-fitted predictions for the treatment D\n", + " Zhat: the cross-fitted predictions for the instrument Z\n", + " resy: the outcome residuals\n", + " resD: the treatment residuals\n", + " resZ: the instrument residuals\n", + " epsilon: the final residual-on-residual OLS regression residual\n", + " '''\n", + " cv = KFold(n_splits=nfolds, shuffle=True, random_state=123) # shuffled k-folds\n", + " yhat = cross_val_predict(modely, X, y, cv=cv, n_jobs=-1) # out-of-fold predictions for y\n", + " # out-of-fold predictions for D\n", + " # use predict or predict_proba dependent on classifier or regressor for D\n", + " if classifier:\n", + " Dhat = cross_val_predict(modeld, X, D, cv=cv, method='predict_proba', n_jobs=-1)[:, 1]\n", + " Zhat = cross_val_predict(modelz, X, Z, cv=cv, method='predict_proba', n_jobs=-1)[:, 1]\n", + " else:\n", + " Dhat = cross_val_predict(modeld, X, D, cv=cv, n_jobs=-1)\n", + " Zhat = cross_val_predict(modelz, X, Z, cv=cv, n_jobs=-1)\n", + " # calculate outcome and treatment residuals\n", + " resy = y - yhat\n", + " resD = D - Dhat\n", + " resZ = Z - Zhat\n", + " # rename for downstream tasks\n", + " resy = resy.rename(\"resy\")\n", + " resD = resD.rename(\"resD\")\n", + " resZ = resZ.rename(\"resZ\")\n", + "\n", + " # final stage ols based point estimate and standard error\n", + " point = np.mean(resy * resZ) / np.mean(resD * resZ)\n", + " epsilon = resy - point * resD\n", + " var = np.mean(epsilon**2 * resZ**2) / np.mean(resD * resZ)**2\n", + " stderr = np.sqrt(var / X.shape[0])\n", + " return point, stderr, yhat, Dhat, Zhat, resy, resD, resZ, epsilon" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "F6U6fnzne-77" + }, + "outputs": [], + "source": [ + "def summary(point, stderr, yhat, Dhat, Zhat, resy, resD, resZ, epsilon, X, D, Z, y, *, name):\n", + " '''\n", + " Convenience summary function that takes the results of the DML IV function\n", + " and summarizes several estimation quantities and performance metrics.\n", + " '''\n", + " return pd.DataFrame({'estimate': point, # point estimate\n", + " 'stderr': stderr, # standard error\n", + " 'lower': point - 1.96 * stderr, # lower end of 95% confidence interval\n", + " 'upper': point + 1.96 * stderr, # upper end of 95% confidence interval\n", + " 'rmse y': 
np.sqrt(np.mean(resy**2)),  # RMSE of model that predicts outcome y\n",
+    "                         'rmse Z': np.sqrt(np.mean(resZ**2)),  # RMSE of model that predicts instrument Z\n",
+    "                         'rmse D': np.sqrt(np.mean(resD**2))  # RMSE of model that predicts treatment D\n",
+    "                         }, index=[name])"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "x1g3XjsIzcL_",
+    "papermill": {
+     "duration": 0.011698,
+     "end_time": "2021-04-23T10:42:10.482689",
+     "exception": false,
+     "start_time": "2021-04-23T10:42:10.470991",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "-----\n",
+    "\n",
+    "\n",
+    "# Empirical Example: Acemoglu, Johnson, Robinson (AER)\n",
+    "\n",
+    "* Y is log GDP;\n",
+    "* D is a measure of Protection from Expropriation, a proxy for\n",
+    "quality of institutions;\n",
+    "* Z is the log of settler mortality;\n",
+    "* W are geographical variables (latitude, latitude squared, continent dummies, as well as their interactions)\n",
+    "\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "0Pc6OCp24rji"
+   },
+   "outputs": [],
+   "source": [
+    "file = \"https://raw.githubusercontent.com/CausalAIBook/MetricsMLNotebooks/main/data/AJR.csv\"\n",
+    "data = pd.read_csv(file)\n",
+    "data.shape"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "3rGw2hAqSx2Y"
+   },
+   "source": [
+    "Run the following commands to install hdmpy for rigorous lasso:\n",
+    "\n",
+    "``` !pip install multiprocess ```\n",
+    "\n",
+    "\n",
+    "```!git clone https://github.com/maxhuppertz/hdmpy.git ```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "O5Rknnz2Rh9O"
+   },
+   "outputs": [],
+   "source": [
+    "!pip install multiprocess\n",
+    "!git clone https://github.com/maxhuppertz/hdmpy.git"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "Kb9N0c5uS47n"
+   },
+   "outputs": [],
+   "source": [
+    "import hdmpy\n",
+    "\n",
+    "\n",
+    "class RLasso(BaseEstimator):\n",
+    "\n",
+    "    def __init__(self, *, post=True):\n",
+    "        self.post = post\n",
+    "\n",
+    "    def fit(self, X, y):\n",
+    "        self.rlasso_ = hdmpy.rlasso(X, y, post=self.post)\n",
+    "        return self\n",
+    "\n",
+    "    def predict(self, X):\n",
+    "        return np.array(X) @ np.array(self.rlasso_.est['beta']).flatten() + np.array(self.rlasso_.est['intercept'])"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "metadata": {
+    "id": "nO6a6tJWS-FO"
+   },
+   "outputs": [],
+
"source": [ + "y = data['GDP']\n", + "d = data['Exprop']\n", + "z = data['logMort']\n", + "\n", + "controls = data.drop(['GDP', 'Exprop', 'logMort'], axis=1)\n", + "x = patsy.dmatrix('-1 + (Latitude + Latitude2 + Africa + Asia + Namer + Samer)**2',\n", + " data=controls, return_type='dataframe')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "kpPe250uelE7" + }, + "outputs": [], + "source": [ + "# DML with Post-Lasso:\n", + "modely = make_pipeline(StandardScaler(), RLasso(post=True))\n", + "modeld = make_pipeline(StandardScaler(), RLasso(post=True))\n", + "modelz = make_pipeline(StandardScaler(), RLasso(post=True))\n", + "\n", + "# Run DML model with nfolds folds of cross-fitting\n", + "res_rlasso = dml(x, z, d, y, modely, modeld, modelz, nfolds=5)\n", + "table_rlasso = summary(*res_rlasso, x, d, z, y, name='RLasso')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "pdKuGFpXglp8" + }, + "outputs": [], + "source": [ + "# DML with Random Forests. RFs don't require scaling but we do it for consistency\n", + "modely = make_pipeline(StandardScaler(),\n", + " RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=1234))\n", + "modeld = make_pipeline(StandardScaler(),\n", + " RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=1234))\n", + "modelz = make_pipeline(StandardScaler(),\n", + " RandomForestRegressor(n_estimators=100, min_samples_leaf=5, random_state=1234))\n", + "\n", + "# Run DML model with nfolds folds of cross-fitting (computationally intensive)\n", + "res_RF = dml(x, z, d, y, modely, modeld, modelz, nfolds=5)\n", + "table_RF = summary(*res_RF, x, d, z, y, name='RF')" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ijo8p0Moig_F" + }, + "outputs": [], + "source": [ + "results = pd.concat([table_rlasso, table_RF], axis=0)\n", + "results" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "r7l6cFi2zcMA", + "papermill": { + "duration": 0.013703, + "end_time": "2021-04-23T10:42:16.269251", + "exception": false, + "start_time": "2021-04-23T10:42:16.255548", + "status": "completed" }, - { - "cell_type": "markdown", - "metadata": { - "papermill": { - "duration": 0.015919, - "end_time": "2021-04-23T10:42:32.423865", - "exception": false, - "start_time": "2021-04-23T10:42:32.407946", - "status": "completed" - }, - "tags": [], - "id": "L3fDDOmfzcMD" - }, - "source": [ - "## Anderson-Rubin Idea for Inference with Weak Instruments\n", - "\n", - "As shown above, we may have weak instruments because the t-stat in the regression $\\tilde D \\sim \\tilde Z$ is small relative to standard rules of thumb -- for example Stock and Yogo (2005) suggest accounting for weak instruments if the first stage F-statistic is less than 10 (and more recent work suggests even larger cutoffs).\n", - "\n", - "\n", - " Here, we consider one specific approach (from Anderson and Rubin (1949)) to doing valid inference under weak identification based upon the statistic:\n", - "$$\n", - "C(\\theta) = \\frac{ |E_n [(\\tilde Y - \\theta \\tilde D) \\tilde Z]|^2}{ \\mathbb{V}_n [(\\tilde Y - \\theta \\tilde D) \\tilde Z ]/n}.\n", - "$$\n", - "The empirical variance $\\mathbb{V}_n$ is defined as:\n", - "\\begin{align*}\n", - "\\mathbb{V}_{n}[ g(W_i)] &:= \\mathbb{E}_{n}g(W_i)g(W_i)' - \\mathbb{E}_{n}[ g(W_i)]\\mathbb{E}_{n}[ g(W_i)]'.\n", - "\\end{align*}\n", - "\n", - "If $\\theta_0 = \\theta$, then $C(\\theta) \\overset{a}{\\sim} N(0,1)^2 = \\chi^2(1)$. 
Therefore, we can reject the hypothesis $\\theta_0 = \\theta$ at level $a$ (for example $a = .05$ for a 5\\% level test) if $C(\\theta)> c(1-a)$ where $c(1-a)$ is the $(1-a)$- quantile of a $\\chi^2(1)$ variable. The probability of falsely rejecting the true hypothesis is approximately $a \\times 100\\%$. \n", - "To construct a $(1-a) \\times 100$\\% confidence region for $\\theta$, we can then simply invert the test by collecting all parameter values that are not rejected at the $a$ level:\n", - "$$\n", - "CR(\\theta) = \\{ \\theta \\in \\Theta: C(\\theta) \\leq c(1-a)\\}.\n", - "$$\n", - "\n", - "\n", - "In more complex settings with many controls or controls that enter with unknown functional form, we can simply replace the residuals $\\tilde Y$, $\\tilde D$, and $\\tilde Z$ by machine\n", - "learned cross-fitted residuals $\\check Y$, $\\check D$, and $ \\check Z$. Thanks to the orthogonality of the IV moment condition underlying the formulation outlined above, we can formally assert that the properties of $C(\\theta)$ and the subsequent testing procedure and confidence region for $\\theta$ continue to hold when using cross-fitted residuals. We will further be able to apply the general procedure to cases where $D$\n", - "is a vector, with a suitable adjustment of the statistic $C(\\theta)$.\n" - ] + "tags": [] + }, + "source": [ + "# Weak Instruments?" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "execution": { + "iopub.execute_input": "2021-04-23T10:42:16.301968Z", + "iopub.status.busy": "2021-04-23T10:42:16.300698Z", + "iopub.status.idle": "2021-04-23T10:42:32.390351Z", + "shell.execute_reply": "2021-04-23T10:42:32.388488Z" }, - { - "cell_type": "markdown", - "metadata": { - "papermill": { - "duration": 0.015943, - "end_time": "2021-04-23T10:42:32.455719", - "exception": false, - "start_time": "2021-04-23T10:42:32.439776", - "status": "completed" - }, - "tags": [], - "id": "PG2tPuFzzcMD" - }, - "source": [ - "So let's carry out DML inference combined with Anderson-Rubin Idea" - ] + "id": "rsUnPDfpzcMB", + "papermill": { + "duration": 16.107321, + "end_time": "2021-04-23T10:42:32.390552", + "exception": false, + "start_time": "2021-04-23T10:42:16.283231", + "status": "completed" }, - { - "cell_type": "code", - "source": [ - "from scipy.stats import chi2\n", - "\n", - "def DML_AR_PLIV(rY, rD, rZ, grid, alpha=0.05):\n", - " n = len(rY)\n", - " Cstat = np.zeros(len(grid))\n", - "\n", - " for i in range(len(grid)):\n", - " term1 = (rY - grid[i] * rD) * rZ\n", - " Cstat[i] = n * (np.mean(term1))**2 / np.var(term1)\n", - "\n", - " Cstat = np.nan_to_num(Cstat) # Replace any NaNs with zeros\n", - "\n", - " lower_bound = min(grid[Cstat < chi2.ppf(1 - alpha, 1)])\n", - " upper_bound = max(grid[Cstat < chi2.ppf(1 - alpha, 1)])\n", - "\n", - " plt.plot(grid, Cstat, linestyle='-', color='black')\n", - " plt.axhline(y=chi2.ppf(1 - alpha, 1), linestyle='--', color='blue')\n", - " plt.axvline(x=lower_bound, linestyle='--', color='red')\n", - " plt.axvline(x=upper_bound, linestyle='--', color='red')\n", - " plt.xlabel('Effect of institutions')\n", - " plt.ylabel('Statistic')\n", - " plt.title('')\n", - " plt.show()\n", - "\n", - " return {'UB': upper_bound, 'LB': lower_bound}" - ], - "metadata": { - "id": "6weyaIdJ3pWs" - }, - "execution_count": null, - "outputs": [] + "tags": [] + }, + "outputs": [], + "source": [ + "# Summary for RLasso\n", + "resy_rlasso, resD_rlasso, resZ_rlasso = res_rlasso[5], res_rlasso[6], res_rlasso[7]\n", + "# Using HC3 robust standard 
errors\n",
+    "regDZ_rlasso = sm.OLS(resD_rlasso, add_constant(resZ_rlasso)).fit(cov_type='HC3', use_t=True)\n",
+    "print(regDZ_rlasso.summary())\n",
+    "\n",
+    "# Summary for RF\n",
+    "resy_RF, resD_RF, resZ_RF = res_RF[5], res_RF[6], res_RF[7]\n",
+    "# Using HC3 robust standard errors\n",
+    "regDZ_RF = sm.OLS(resD_RF, add_constant(resZ_RF)).fit(cov_type='HC3', use_t=True)\n",
+    "print(regDZ_RF.summary())"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {
+    "id": "L3fDDOmfzcMD",
+    "papermill": {
+     "duration": 0.015919,
+     "end_time": "2021-04-23T10:42:32.423865",
+     "exception": false,
+     "start_time": "2021-04-23T10:42:32.407946",
+     "status": "completed"
+    },
+    "tags": []
+   },
+   "source": [
+    "## Anderson-Rubin Idea for Inference with Weak Instruments\n",
+    "\n",
+    "As shown above, we may have weak instruments because the t-stat in the regression $\\tilde D \\sim \\tilde Z$ is small relative to standard rules of thumb -- for example Stock and Yogo (2005) suggest accounting for weak instruments if the first-stage F-statistic is less than 10 (and more recent work suggests even larger cutoffs).\n",
+    "\n",
+    "\n",
+    " Here, we consider one specific approach (from Anderson and Rubin (1949)) to doing valid inference under weak identification based on the statistic:\n",
+    "$$\n",
+    "C(\\theta) = \\frac{ |E_n [(\\tilde Y - \\theta \\tilde D) \\tilde Z]|^2}{ \\mathbb{V}_n [(\\tilde Y - \\theta \\tilde D) \\tilde Z ]/n}.\n",
+    "$$\n",
+    "The empirical variance $\\mathbb{V}_n$ is defined as:\n",
+    "\\begin{align*}\n",
+    "\\mathbb{V}_{n}[ g(W_i)] &:= \\mathbb{E}_{n}g(W_i)g(W_i)' - \\mathbb{E}_{n}[ g(W_i)]\\mathbb{E}_{n}[ g(W_i)]'.\n",
+    "\\end{align*}\n",
+    "\n",
+    "If $\\theta_0 = \\theta$, then $C(\\theta) \\overset{a}{\\sim} N(0,1)^2 = \\chi^2(1)$. Therefore, we can reject the hypothesis $\\theta_0 = \\theta$ at level $a$ (for example $a = .05$ for a 5\\% level test) if $C(\\theta) > c(1-a)$ where $c(1-a)$ is the $(1-a)$-quantile of a $\\chi^2(1)$ variable. The probability of falsely rejecting the true hypothesis is approximately $a \\times 100\\%$. 
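For example, with $a = .05$ the critical value is $c(.95) \\approx 3.84$, the $.95$ quantile of a $\\chi^2(1)$ variable. 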
\n", + "To construct a $(1-a) \\times 100$\\% confidence region for $\\theta$, we can then simply invert the test by collecting all parameter values that are not rejected at the $a$ level:\n", + "$$\n", + "CR(\\theta) = \\{ \\theta \\in \\Theta: C(\\theta) \\leq c(1-a)\\}.\n", + "$$\n", + "\n", + "\n", + "In more complex settings with many controls or controls that enter with unknown functional form, we can simply replace the residuals $\\tilde Y$, $\\tilde D$, and $\\tilde Z$ by machine\n", + "learned cross-fitted residuals $\\check Y$, $\\check D$, and $ \\check Z$. Thanks to the orthogonality of the IV moment condition underlying the formulation outlined above, we can formally assert that the properties of $C(\\theta)$ and the subsequent testing procedure and confidence region for $\\theta$ continue to hold when using cross-fitted residuals. We will further be able to apply the general procedure to cases where $D$\n", + "is a vector, with a suitable adjustment of the statistic $C(\\theta)$.\n" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "PG2tPuFzzcMD", + "papermill": { + "duration": 0.015943, + "end_time": "2021-04-23T10:42:32.455719", + "exception": false, + "start_time": "2021-04-23T10:42:32.439776", + "status": "completed" }, - "language_info": { - "name": "python" + "tags": [] + }, + "source": [ + "So let's carry out DML inference combined with Anderson-Rubin Idea" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "6weyaIdJ3pWs" + }, + "outputs": [], + "source": [ + "from scipy.stats import chi2\n", + "\n", + "\n", + "def DML_AR_PLIV(rY, rD, rZ, grid, alpha=0.05):\n", + " n = len(rY)\n", + " Cstat = np.zeros(len(grid))\n", + "\n", + " for i in range(len(grid)):\n", + " term1 = (rY - grid[i] * rD) * rZ\n", + " Cstat[i] = n * (np.mean(term1))**2 / np.var(term1)\n", + "\n", + " Cstat = np.nan_to_num(Cstat) # Replace any NaNs with zeros\n", + "\n", + " lower_bound = min(grid[Cstat < chi2.ppf(1 - alpha, 1)])\n", + " upper_bound = max(grid[Cstat < chi2.ppf(1 - alpha, 1)])\n", + "\n", + " plt.plot(grid, Cstat, linestyle='-', color='black')\n", + " plt.axhline(y=chi2.ppf(1 - alpha, 1), linestyle='--', color='blue')\n", + " plt.axvline(x=lower_bound, linestyle='--', color='red')\n", + " plt.axvline(x=upper_bound, linestyle='--', color='red')\n", + " plt.xlabel('Effect of institutions')\n", + " plt.ylabel('Statistic')\n", + " plt.title('')\n", + " plt.show()\n", + "\n", + " return {'UB': upper_bound, 'LB': lower_bound}" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "execution": { + "iopub.execute_input": "2021-04-23T10:42:32.545479Z", + "iopub.status.busy": "2021-04-23T10:42:32.543976Z", + "iopub.status.idle": "2021-04-23T10:42:33.002321Z", + "shell.execute_reply": "2021-04-23T10:42:33.001039Z" }, + "id": "k9bB2O13zcME", "papermill": { - "duration": 62.79319, - "end_time": "2021-04-23T10:42:34.038582", - "environment_variables": {}, - "exception": null, - "input_path": "__notebook__.ipynb", - "output_path": "__notebook__.ipynb", - "parameters": {}, - "start_time": "2021-04-23T10:41:31.245392", - "version": "2.1.3" + "duration": 0.478528, + "end_time": "2021-04-23T10:42:33.002469", + "exception": false, + "start_time": "2021-04-23T10:42:32.523941", + "status": "completed" }, - "colab": { - "provenance": [] - } + "tags": [] + }, + "outputs": [], + "source": [ + "# Using RLasso\n", + "dml_ar_rlasso = DML_AR_PLIV(rY=resy_rlasso, rD=resD_rlasso, rZ=resZ_rlasso, grid=np.arange(-2, 2, 0.01))\n", + 
"print(\"RLasso UB and LB: \", dml_ar_rlasso)\n", + "# Using RF\n", + "dml_ar_RF = DML_AR_PLIV(rY=resy_RF, rD=resD_RF, rZ=resZ_RF, grid=np.arange(-2, 2, 0.01))\n", + "print(\"RF UB and LB: \", dml_ar_RF)" + ] + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.5" }, - "nbformat": 4, - "nbformat_minor": 0 -} \ No newline at end of file + "papermill": { + "duration": 62.79319, + "end_time": "2021-04-23T10:42:34.038582", + "environment_variables": {}, + "exception": null, + "input_path": "__notebook__.ipynb", + "output_path": "__notebook__.ipynb", + "parameters": {}, + "start_time": "2021-04-23T10:41:31.245392", + "version": "2.1.3" + } + }, + "nbformat": 4, + "nbformat_minor": 1 +} diff --git a/AC2/python-dml-401k-IV.ipynb b/AC2/python-dml-401k-IV.ipynb index b102302b..a4bb8b41 100644 --- a/AC2/python-dml-401k-IV.ipynb +++ b/AC2/python-dml-401k-IV.ipynb @@ -86,20 +86,17 @@ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", - "from sklearn.model_selection import train_test_split\n", - "from sklearn.model_selection import cross_val_score, cross_val_predict, KFold\n", + "from sklearn.model_selection import cross_val_predict, KFold\n", "from sklearn.pipeline import make_pipeline\n", - "from sklearn.linear_model import LassoCV, RidgeCV, ElasticNetCV, LinearRegression, Ridge, Lasso, LogisticRegressionCV\n", + "from sklearn.linear_model import LassoCV, LogisticRegressionCV\n", "from sklearn.preprocessing import StandardScaler\n", "from sklearn.ensemble import GradientBoostingRegressor, GradientBoostingClassifier\n", "from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier\n", "from sklearn.tree import DecisionTreeRegressor, DecisionTreeClassifier\n", - "import patsy\n", + "import scipy.stats\n", "import warnings\n", "from sklearn.base import BaseEstimator, clone\n", - "import statsmodels.api as sm\n", "from IPython.display import Markdown\n", - "import os\n", "import wget\n", "import seaborn as sns\n", "warnings.simplefilter('ignore')\n", @@ -154,7 +151,7 @@ "source": [ "readme = \"https://raw.githubusercontent.com/CausalAIBook/MetricsMLNotebooks/main/data/401k.md\"\n", "filename = wget.download(readme)\n", - "display(Markdown(open(filename, 'r').read()))" + "Markdown(open(filename, 'r').read())" ] }, { @@ -371,9 +368,10 @@ }, "outputs": [], "source": [ - "from sklearn.base import TransformerMixin, BaseEstimator\n", + "from sklearn.base import TransformerMixin\n", "from formulaic import Formula\n", "\n", + "\n", "class FormulaTransformer(TransformerMixin, BaseEstimator):\n", "\n", " def __init__(self, formula, array=False):\n", @@ -502,8 +500,8 @@ "outputs": [], "source": [ "resy = y - modely.fit(X, y).predict(X)\n", - "resZ = Z - modelz.fit(X, Z).predict(X) # instrument is e401k (eligibility)\n", - "resD = D - modeld.fit(X, D).predict(X) # treatment is p401k (participation)" + "resZ = Z - modelz.fit(X, Z).predict(X) # instrument is e401k (eligibility)\n", + "resD = D - modeld.fit(X, D).predict(X) # treatment is p401k (participation)" ] }, { @@ -575,8 +573,8 @@ " resZ: the instrument residuals\n", " epsilon: the final residual-on-residual OLS regression residual\n", " '''\n", - " cv = 
KFold(n_splits=nfolds, shuffle=True, random_state=123) # shuffled k-folds\n", - " yhat = cross_val_predict(modely, X, y, cv=cv, n_jobs=-1) # out-of-fold predictions for y\n", + " cv = KFold(n_splits=nfolds, shuffle=True, random_state=123) # shuffled k-folds\n", + " yhat = cross_val_predict(modely, X, y, cv=cv, n_jobs=-1) # out-of-fold predictions for y\n", " # out-of-fold predictions for D\n", " # use predict or predict_proba dependent on classifier or regressor for D\n", " if classifier:\n", @@ -590,9 +588,9 @@ " resD = D - Dhat\n", " resZ = Z - Zhat\n", " # final stage ols based point estimate and standard error\n", - " point = np.mean(resy * resZ) / np.mean(resD*resZ)\n", + " point = np.mean(resy * resZ) / np.mean(resD * resZ)\n", " epsilon = resy - point * resD\n", - " var = np.mean(epsilon**2 * resZ**2) / np.mean(resD*resZ)**2\n", + " var = np.mean(epsilon**2 * resZ**2) / np.mean(resD * resZ)**2\n", " stderr = np.sqrt(var / X.shape[0])\n", " return point, stderr, yhat, Dhat, Zhat, resy, resD, resZ, epsilon" ] @@ -611,15 +609,15 @@ " Convenience summary function that takes the results of the DML function\n", " and summarizes several estimation quantities and performance metrics.\n", " '''\n", - " return pd.DataFrame({'estimate': point, # point estimate\n", - " 'stderr': stderr, # standard error\n", - " 'lower': point - 1.96*stderr, # lower end of 95% confidence interval\n", - " 'upper': point + 1.96*stderr, # upper end of 95% confidence interval\n", - " 'rmse y': np.sqrt(np.mean(resy**2)), # RMSE of model that predicts outcome y\n", - " 'rmse D': np.sqrt(np.mean(resD**2)), # RMSE of model that predicts treatment D\n", - " 'rmse Z': np.sqrt(np.mean(resZ**2)), # RMSE of model that predicts treatment D\n", - " 'accuracy D': np.mean(np.abs(resD) < .5), # binary classification accuracy of model for D\n", - " 'accuracy Z': np.mean(np.abs(resZ) < .5), # binary classification accuracy of model for Z\n", + " return pd.DataFrame({'estimate': point, # point estimate\n", + " 'stderr': stderr, # standard error\n", + " 'lower': point - 1.96 * stderr, # lower end of 95% confidence interval\n", + " 'upper': point + 1.96 * stderr, # upper end of 95% confidence interval\n", + " 'rmse y': np.sqrt(np.mean(resy**2)), # RMSE of model that predicts outcome y\n", + " 'rmse D': np.sqrt(np.mean(resD**2)), # RMSE of model that predicts treatment D\n", + " 'rmse Z': np.sqrt(np.mean(resZ**2)), # RMSE of model that predicts treatment D\n", + " 'accuracy D': np.mean(np.abs(resD) < .5), # binary classification accuracy of model for D\n", + " 'accuracy Z': np.mean(np.abs(resZ) < .5), # binary classification accuracy of model for Z\n", " }, index=[name])" ] }, @@ -707,7 +705,7 @@ }, "outputs": [], "source": [ - "table = pd.concat([table , summary(*result, X, Z, D, y, name='lasso/logistic')])\n", + "table = pd.concat([table, summary(*result, X, Z, D, y, name='lasso/logistic')])\n", "table" ] }, @@ -745,7 +743,7 @@ }, "outputs": [], "source": [ - "table = pd.concat([table , summary(*result, X, Z, D, y, name='random forest')])\n", + "table = pd.concat([table, summary(*result, X, Z, D, y, name='random forest')])\n", "table" ] }, @@ -783,7 +781,7 @@ }, "outputs": [], "source": [ - "table = pd.concat([table , summary(*result, X, Z, D, y, name='decision tree')])\n", + "table = pd.concat([table, summary(*result, X, Z, D, y, name='decision tree')])\n", "table" ] }, @@ -821,7 +819,7 @@ }, "outputs": [], "source": [ - "table = pd.concat([table , summary(*result, X, Z, D, y, name='boosted forest')])\n", + "table = 
pd.concat([table, summary(*result, X, Z, D, y, name='boosted forest')])\n", "table" ] }, @@ -859,7 +857,7 @@ "from flaml import AutoML\n", "\n", "flamly = make_pipeline(transformer, AutoML(time_budget=100, task='regression', early_stop=True,\n", - " eval_method='cv', n_splits=3, metric='r2', verbose=0))\n", + " eval_method='cv', n_splits=3, metric='r2', verbose=0))\n", "flamld = make_pipeline(transformer, AutoML(time_budget=100, task='classification', early_stop=True,\n", " eval_method='cv', n_splits=3, metric='r2', verbose=0))\n", "flamlz = make_pipeline(transformer, AutoML(time_budget=100, task='classification', early_stop=True,\n", @@ -926,7 +924,7 @@ }, "outputs": [], "source": [ - "table = pd.concat([table , summary(*result, X, Z, D, y, name='automl (semi-cfit)')])\n", + "table = pd.concat([table, summary(*result, X, Z, D, y, name='automl (semi-cfit)')])\n", "table" ] }, @@ -949,8 +947,6 @@ }, "outputs": [], "source": [ - "import scipy.stats\n", - "\n", "def robust_inference(point, stderr, yhat, Dhat, Zhat, resy, resD, resZ, epsilon, X, Z, D, y, *, grid, alpha=0.05):\n", " '''\n", " Inference in the partially linear IV model that is robust to weak identification.\n", @@ -962,7 +958,6 @@ " accept = []\n", " for theta in grid:\n", " moment = (resy - theta * resD) * resZ\n", - " accept.append(np.sum(moment ** 2) < thr)\n", " test = n * np.mean(moment)**2 / np.var(moment)\n", " if test <= thr:\n", " accept.append(theta)\n", @@ -990,8 +985,7 @@ }, "outputs": [], "source": [ - "filtered_region = [x for x in region if x != False]\n", - "np.min(filtered_region), np.max(filtered_region)" + "np.min(region), np.max(region)" ] }, { @@ -1134,16 +1128,16 @@ " # and a separate model for D==1.\n", " for train, test in cv.split(X, y):\n", " # train an outcome model on training data that received Z=0 and predict outcome on all data in test set\n", - " yhat0[test] = clone(modely0).fit(X.iloc[train][Z[train]==0], y[train][Z[train]==0]).predict(X.iloc[test])\n", + " yhat0[test] = clone(modely0).fit(X.iloc[train][Z[train] == 0], y[train][Z[train] == 0]).predict(X.iloc[test])\n", " # train an outcome model on training data that received Z=1 and predict outcome on all data in test set\n", - " yhat1[test] = clone(modely1).fit(X.iloc[train][Z[train]==1], y[train][Z[train]==1]).predict(X.iloc[test])\n", + " yhat1[test] = clone(modely1).fit(X.iloc[train][Z[train] == 1], y[train][Z[train] == 1]).predict(X.iloc[test])\n", " # train a treatment model on training data that received Z=0 and predict treatment on all data in test set\n", - " if np.mean(D[train][Z[train]==0]) > 0: # it could be that D=0, whenever Z=0 deterministically\n", - " modeld0_ = clone(modeld0).fit(X.iloc[train][Z[train]==0], D[train][Z[train]==0])\n", + " if np.mean(D[train][Z[train] == 0]) > 0: # it could be that D=0, whenever Z=0 deterministically\n", + " modeld0_ = clone(modeld0).fit(X.iloc[train][Z[train] == 0], D[train][Z[train] == 0])\n", " Dhat0[test] = modeld0_.predict_proba(X.iloc[test])[:, 1]\n", " # train a treamtent model on training data that received Z=1 and predict treatment on all data in test set\n", - " if np.mean(D[train][Z[train]==1]) < 1: # it could be that D=1, whenever Z=1 deterministically\n", - " modeld1_ = clone(modeld1).fit(X.iloc[train][Z[train]==1], D[train][Z[train]==1])\n", + " if np.mean(D[train][Z[train] == 1]) < 1: # it could be that D=1, whenever Z=1 deterministically\n", + " modeld1_ = clone(modeld1).fit(X.iloc[train][Z[train] == 1], D[train][Z[train] == 1])\n", " Dhat1[test] = 
modeld1_.predict_proba(X.iloc[test])[:, 1]\n", " else:\n", " Dhat1[test] = 1\n", @@ -1155,7 +1149,7 @@ " Zhat = cross_val_predict(modelz, X, Z, cv=cv, method='predict_proba', n_jobs=-1)[:, 1]\n", " Zhat = np.clip(Zhat, trimming, 1 - trimming)\n", " # doubly robust quantity for every sample\n", - " HZ = Z/Zhat - (1 - Z)/(1 - Zhat)\n", + " HZ = Z / Zhat - (1 - Z) / (1 - Zhat)\n", " drZ = yhat1 - yhat0 + (y - yhat) * HZ\n", " drD = Dhat1 - Dhat0 + (D - Dhat) * HZ\n", " point = np.mean(drZ) / np.mean(drD)\n", @@ -1180,15 +1174,15 @@ " Convenience summary function that takes the results of the DML function\n", " and summarizes several estimation quantities and performance metrics.\n", " '''\n", - " return pd.DataFrame({'estimate': point, # point estimate\n", - " 'stderr': stderr, # standard error\n", - " 'lower': point - 1.96*stderr, # lower end of 95% confidence interval\n", - " 'upper': point + 1.96*stderr, # upper end of 95% confidence interval\n", - " 'rmse y': np.sqrt(np.mean(resy**2)), # RMSE of model that predicts outcome y\n", - " 'rmse D': np.sqrt(np.mean(resD**2)), # RMSE of model that predicts treatment D\n", - " 'rmse Z': np.sqrt(np.mean(resZ**2)), # RMSE of model that predicts treatment D\n", - " 'accuracy D': np.mean(np.abs(resD) < .5), # binary classification accuracy of model for D\n", - " 'accuracy Z': np.mean(np.abs(resZ) < .5), # binary classification accuracy of model for Z\n", + " return pd.DataFrame({'estimate': point, # point estimate\n", + " 'stderr': stderr, # standard error\n", + " 'lower': point - 1.96 * stderr, # lower end of 95% confidence interval\n", + " 'upper': point + 1.96 * stderr, # upper end of 95% confidence interval\n", + " 'rmse y': np.sqrt(np.mean(resy**2)), # RMSE of model that predicts outcome y\n", + " 'rmse D': np.sqrt(np.mean(resD**2)), # RMSE of model that predicts treatment D\n", + " 'rmse Z': np.sqrt(np.mean(resZ**2)), # RMSE of model that predicts treatment D\n", + " 'accuracy D': np.mean(np.abs(resD) < .5), # binary classification accuracy of model for D\n", + " 'accuracy Z': np.mean(np.abs(resZ) < .5), # binary classification accuracy of model for Z\n", " }, index=[name])" ] }, @@ -1245,7 +1239,7 @@ }, "outputs": [], "source": [ - "tableiiv = pd.concat([tableiiv , summary(*result, X, Z, D, y, name='random forest')])\n", + "tableiiv = pd.concat([tableiiv, summary(*result, X, Z, D, y, name='random forest')])\n", "tableiiv" ] }, @@ -1273,7 +1267,7 @@ }, "outputs": [], "source": [ - "tableiiv = pd.concat([tableiiv , summary(*result, X, Z, D, y, name='decision trees')])\n", + "tableiiv = pd.concat([tableiiv, summary(*result, X, Z, D, y, name='decision trees')])\n", "tableiiv" ] }, @@ -1301,7 +1295,7 @@ }, "outputs": [], "source": [ - "tableiiv = pd.concat([tableiiv , summary(*result, X, Z, D, y, name='boosted trees')])\n", + "tableiiv = pd.concat([tableiiv, summary(*result, X, Z, D, y, name='boosted trees')])\n", "tableiiv" ] }, @@ -1326,11 +1320,11 @@ "source": [ "from flaml import AutoML\n", "flamly0 = make_pipeline(transformer, AutoML(time_budget=60, task='regression', early_stop=True,\n", - " eval_method='cv', n_splits=3, metric='r2', verbose=0))\n", + " eval_method='cv', n_splits=3, metric='r2', verbose=0))\n", "flamly1 = make_pipeline(transformer, AutoML(time_budget=60, task='regression', early_stop=True,\n", - " eval_method='cv', n_splits=3, metric='r2', verbose=0))\n", + " eval_method='cv', n_splits=3, metric='r2', verbose=0))\n", "flamld1 = make_pipeline(transformer, AutoML(time_budget=60, task='classification', early_stop=True,\n", - " 
+    "                                            eval_method='cv', n_splits=3, metric='r2', verbose=0))\n",
     "flamlz = make_pipeline(transformer, AutoML(time_budget=60, task='classification', early_stop=True,\n",
     "                                            eval_method='cv', n_splits=3, metric='r2', verbose=0))"
    ]
   },
@@ -1344,7 +1338,7 @@
    },
    "outputs": [],
    "source": [
-    "flamly0.fit(X[Z==0], y[Z==0])\n",
+    "flamly0.fit(X[Z == 0], y[Z == 0])\n",
     "besty0 = make_pipeline(transformer, clone(flamly0[-1].best_model_for_estimator(flamly0[-1].best_estimator)))"
    ]
   },
@@ -1357,7 +1351,7 @@
    },
    "outputs": [],
    "source": [
-    "flamly1.fit(X[Z==1], y[Z==1])\n",
+    "flamly1.fit(X[Z == 1], y[Z == 1])\n",
     "besty1 = make_pipeline(transformer, clone(flamly1[-1].best_model_for_estimator(flamly1[-1].best_estimator)))"
    ]
   },
@@ -1371,7 +1365,7 @@
    "outputs": [],
    "source": [
     "from sklearn.dummy import DummyClassifier\n",
-    "bestd0 = DummyClassifier() # since D=0 whenever Z=0"
+    "bestd0 = DummyClassifier()  # since D=0 whenever Z=0"
    ]
   },
   {
@@ -1383,7 +1377,7 @@
    },
    "outputs": [],
    "source": [
-    "flamld1.fit(X[Z==1], D[Z==1])\n",
+    "flamld1.fit(X[Z == 1], D[Z == 1])\n",
     "bestd1 = make_pipeline(transformer, clone(flamld1[-1].best_model_for_estimator(flamld1[-1].best_estimator)))"
    ]
   },
@@ -1421,7 +1415,7 @@
    },
    "outputs": [],
    "source": [
-    "tableiiv = pd.concat([tableiiv , summary(*result, X, Z, D, y, name='automl (semi-cfit)')])\n",
+    "tableiiv = pd.concat([tableiiv, summary(*result, X, Z, D, y, name='automl (semi-cfit)')])\n",
     "tableiiv"
    ]
   },
@@ -1466,9 +1460,8 @@
    },
    "outputs": [],
    "source": [
-    "import scipy.stats\n",
-    "\n",
-    "def iivm_robust_inference(point, stderr, yhat, Dhat, Zhat, resy, resD, resZ, drZ, drD, X, Z, D, y, *, grid, alpha=0.05):\n",
+    "def iivm_robust_inference(point, stderr, yhat, Dhat, Zhat, resy, resD, resZ, drZ, drD, X, Z, D, y, *,\n",
+    "                          grid, alpha=0.05):\n",
     "    '''\n",
     "    Inference in the interactive IV model that is robust to weak identification.\n",
     "    grid: grid of theta values to search over when trying to identify the confidence region\n",
@@ -1543,7 +1536,7 @@
     "# code verified with this version of econml\n",
     "# If you want to try out the latest econml version, remove the version number;\n",
     "# albeit errors could arise due to library updates\n",
-    "!pip install econml==0.14.1 #"
+    "!pip install econml==0.14.1"
    ]
   },
@@ -1721,6 +1714,7 @@
     "import doubleml as dml\n",
     "\n",
+    "\n",
     "class RegWrapper(BaseEstimator):\n",
     "\n",
     "    def __init__(self, clf):\n",
@@ -1733,6 +1727,7 @@
     "    def predict(self, X):\n",
     "        return self.clf_.predict_proba(X)[:, 1]\n",
     "\n",
+    "\n",
     "dml_plr_obj = dml.DoubleMLPLIV(dml_data,\n",
     "                               LassoCV(cv=cv),\n",
     "                               RegWrapper(LogisticRegressionCV(cv=cv)),\n",
@@ -1752,6 +1747,7 @@
     "import doubleml as dml\n",
     "\n",
+    "\n",
     "class Wrapper(BaseEstimator):\n",
     "\n",
     "    def __init__(self, clf):\n",
@@ -1778,6 +1774,7 @@
     "    def predict(self, X):\n",
     "        return self.predict_proba(X)[:, 1] >= .5\n",
     "\n",
+    "\n",
     "dml_plr_obj = dml.DoubleMLIIVM(dml_data,\n",
     "                               LassoCV(cv=cv),\n",
     "                               Wrapper(LogisticRegressionCV(cv=cv)),\n",
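The `robust_inference` and `iivm_robust_inference` helpers in the diff above both build their confidence region by test inversion: every candidate `theta` on the grid is kept whenever the score-type statistic `n * mean(moment)**2 / var(moment)` stays below a critical value `thr`. The hunks never show how `thr` is set; the sketch below assumes it is the chi-squared(1) critical value (consistent with the `scipy.stats` imports being consolidated out of these cells) and runs on synthetic residuals, so the variable names mirror the notebook but the numbers are purely illustrative.

# Minimal sketch of the test-inversion logic; treating `thr` as the chi2(1)
# critical value is an assumption, and the residuals here are synthetic.
import numpy as np
import scipy.stats

rng = np.random.default_rng(0)
n = 1000
resZ = rng.normal(size=n)                # residualized instrument, Z - m(X)
resD = 0.5 * resZ + rng.normal(size=n)   # residualized treatment, D - r(X)
resy = 1.0 * resD + rng.normal(size=n)   # residualized outcome; true theta is 1

alpha = 0.05
thr = scipy.stats.chi2.ppf(1 - alpha, df=1)  # assumed critical value

accept = []
for theta in np.linspace(-1, 3, 401):
    moment = (resy - theta * resD) * resZ           # identifying moment at theta
    test = n * np.mean(moment)**2 / np.var(moment)  # ~ chi2(1) under H0: theta_0 = theta
    if test <= thr:
        accept.append(theta)                        # theta is not rejected at level alpha

print(f"robust 95% confidence region: [{min(accept):.2f}, {max(accept):.2f}]")

Inverting a pointwise test in this way stays valid even when the instrument is weak, which is why the notebook reports the region's endpoints via `np.min(region), np.max(region)` instead of the usual point estimate plus or minus 1.96 standard errors.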
"cell_type": "markdown", + "metadata": { + "id": "nBty5iAb_VyE" + }, + "source": [ + "# A Simple Example of Properties of IV estimator when Instruments are Weak" + ] }, - "cells": [ - { - "cell_type": "markdown", - "source": [ - "# A Simple Example of Properties of IV estimator when Instruments are Weak" - ], - "metadata": { - "id": "nBty5iAb_VyE" - } - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "atRvUXGn_VFW" - }, - "outputs": [], - "source": [ - "import numpy as np\n", - "from scipy.stats import norm\n", - "from scipy.stats import gaussian_kde\n", - "import statsmodels.api as sm\n", - "from statsmodels.sandbox.regression.gmm import IV2SLS\n", - "import matplotlib.pyplot as plt" - ] - }, - { - "cell_type": "markdown", - "source": [ - "Simulation Design" - ], - "metadata": { - "id": "wLevLetuBzrs" - } - }, - { - "cell_type": "code", - "source": [ - "np.random.seed(123)\n", - "\n", - "n = 100\n", - "beta = 0.1 # 0.1 weak IV\n", - "# beta = 1 # 1 strong IV\n", - "\n", - "# One realization\n", - "U = norm.rvs(size=n)\n", - "Z = norm.rvs(size=n) # generate instrument\n", - "D = beta * Z + U # generate endogenous variable\n", - "Y = D + U # the true causal effect is 1\n", - "\n", - "# First stage regression\n", - "Z1 = sm.add_constant(Z)\n", - "model_first_stage = sm.OLS(D, Z1)\n", - "results_first_stage = model_first_stage.fit()\n", - "print(results_first_stage.summary()) # first stage is very weak here when we set beta = .1\n", - "\n", - "# IV regression\n", - "D1 = sm.add_constant(D)\n", - "model_iv = IV2SLS(Y, D1, Z1)\n", - "results_iv = model_iv.fit()\n", - "print(results_iv.summary())" - ], - "metadata": { - "id": "7oacBDrQB1AU" - }, - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "source": [ - "Note that the instrument is weak here (strength of the instrument is controlled by setting $\\beta$) -- the t-stat is smaller than any rule-of-thumb suggested in the literature (e.g. $\\sqrt{10}$) ." 
- ], - "metadata": { - "id": "-Y7yS_mvGyt9" - } - }, - { - "cell_type": "markdown", - "source": [ - "# Run 10000 trials to evaluate distribution of the IV estimator" - ], - "metadata": { - "id": "f4wBKUdTG0O2" - } - }, - { - "cell_type": "code", - "source": [ - "np.random.seed(123)\n", - "\n", - "B= 10000 # trials\n", - "IVEst = np.zeros(B)\n", - "\n", - "for i in range(B):\n", - " U = norm.rvs(size=n)\n", - " Z = norm.rvs(size=n) # generate instrument\n", - " D = beta * Z + U # generate endogenous variable\n", - " Y = D + U # the true causal effect is 1\n", - "\n", - " # IV regression\n", - " model_iv = IV2SLS(Y, D, Z)\n", - " results_iv = model_iv.fit()\n", - " IVEst[i] = results_iv.params[0] # Coefficient of D\n", - "\n", - "\n" - ], - "metadata": { - "id": "el9qoJqbG258" - }, - "execution_count": null, - "outputs": [] - }, - { - "cell_type": "markdown", - "source": [ - "# Plot the Actual Distribution against the Normal Approximation (based on Strong Instrument Assumption)" - ], - "metadata": { - "id": "aXLgc8TkIzYf" - } - }, - { - "cell_type": "code", - "source": [ - "# Plotting the density of IVEst\n", - "plt.figure(figsize=(8, 6))\n", - "plt.xlim(-5, 5)\n", - "plt.xlabel(\"IV Estimator - True Effect\")\n", - "plt.title(\"Actual Distribution vs Gaussian\")\n", - "\n", - "# Plotting density estimate of simulated IV coefficients\n", - "val = np.linspace(-5, 5, 200)\n", - "kde = gaussian_kde(IVEst - 1, bw_method = .001) # Need to play with bandwidth depending on problem features.\n", - "density = kde(np.linspace(-5, 5, 1000))\n", - "plt.plot(np.linspace(-5, 5, 1000), density, color='blue', label='Actual Distribution')\n", - "\n", - "# Plotting Gaussian distribution\n", - "var = (1 / (beta ** 2)) * (1 / n) # theoretical variance of IV\n", - "sd = np.sqrt(var)\n", - "gaussian = norm.pdf(val, scale=sd)\n", - "plt.plot(val, gaussian, color='red', linestyle='--', label='Gaussian Distribution')\n", - "\n", - "plt.legend()\n", - "\n", - "# Calculating rejection frequency\n", - "rejection_frequency = np.sum(np.abs(IVEst - 1) / sd > 1.96) / B\n", - "print(\"Rejection Frequency is\", rejection_frequency, \"while we expect it to be around 0.05\")\n", - "\n", - "plt.show()" - ], - "metadata": { - "id": "DDr-glMeI2Fk" - }, - "execution_count": null, - "outputs": [] - } - ] + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "atRvUXGn_VFW" + }, + "outputs": [], + "source": [ + "import numpy as np\n", + "from scipy.stats import norm\n", + "from scipy.stats import gaussian_kde\n", + "import statsmodels.api as sm\n", + "from statsmodels.sandbox.regression.gmm import IV2SLS\n", + "import matplotlib.pyplot as plt" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "wLevLetuBzrs" + }, + "source": [ + "Simulation Design" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "7oacBDrQB1AU" + }, + "outputs": [], + "source": [ + "np.random.seed(123)\n", + "\n", + "n = 100\n", + "beta = 0.1 # 0.1 weak IV\n", + "# beta = 1 # 1 strong IV\n", + "\n", + "# One realization\n", + "U = norm.rvs(size=n)\n", + "Z = norm.rvs(size=n) # generate instrument\n", + "D = beta * Z + U # generate endogenous variable\n", + "Y = D + U # the true causal effect is 1\n", + "\n", + "# First stage regression\n", + "Z1 = sm.add_constant(Z)\n", + "model_first_stage = sm.OLS(D, Z1)\n", + "results_first_stage = model_first_stage.fit()\n", + "print(results_first_stage.summary()) # first stage is very weak here when we set beta = .1\n", + "\n", + "# IV regression\n", + 
"D1 = sm.add_constant(D)\n", + "model_iv = IV2SLS(Y, D1, Z1)\n", + "results_iv = model_iv.fit()\n", + "print(results_iv.summary())" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "-Y7yS_mvGyt9" + }, + "source": [ + "Note that the instrument is weak here (strength of the instrument is controlled by setting $\\beta$) -- the t-stat is smaller than any rule-of-thumb suggested in the literature (e.g. $\\sqrt{10}$) ." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "f4wBKUdTG0O2" + }, + "source": [ + "# Run 10000 trials to evaluate distribution of the IV estimator" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "el9qoJqbG258" + }, + "outputs": [], + "source": [ + "np.random.seed(123)\n", + "\n", + "B = 10000 # trials\n", + "IVEst = np.zeros(B)\n", + "\n", + "for i in range(B):\n", + " U = norm.rvs(size=n)\n", + " Z = norm.rvs(size=n) # generate instrument\n", + " D = beta * Z + U # generate endogenous variable\n", + " Y = D + U # the true causal effect is 1\n", + "\n", + " # IV regression\n", + " model_iv = IV2SLS(Y, D, Z)\n", + " results_iv = model_iv.fit()\n", + " IVEst[i] = results_iv.params[0] # Coefficient of D" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "aXLgc8TkIzYf" + }, + "source": [ + "# Plot the Actual Distribution against the Normal Approximation (based on Strong Instrument Assumption)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "DDr-glMeI2Fk" + }, + "outputs": [], + "source": [ + "# Plotting the density of IVEst\n", + "plt.figure(figsize=(8, 6))\n", + "plt.xlim(-5, 5)\n", + "plt.xlabel(\"IV Estimator - True Effect\")\n", + "plt.title(\"Actual Distribution vs Gaussian\")\n", + "\n", + "# Plotting density estimate of simulated IV coefficients\n", + "val = np.linspace(-5, 5, 200)\n", + "kde = gaussian_kde(IVEst - 1, bw_method=.001) # Need to play with bandwidth depending on problem features.\n", + "density = kde(np.linspace(-5, 5, 1000))\n", + "plt.plot(np.linspace(-5, 5, 1000), density, color='blue', label='Actual Distribution')\n", + "\n", + "# Plotting Gaussian distribution\n", + "var = (1 / (beta ** 2)) * (1 / n) # theoretical variance of IV\n", + "sd = np.sqrt(var)\n", + "gaussian = norm.pdf(val, scale=sd)\n", + "plt.plot(val, gaussian, color='red', linestyle='--', label='Gaussian Distribution')\n", + "\n", + "plt.legend()\n", + "\n", + "# Calculating rejection frequency\n", + "rejection_frequency = np.sum(np.abs(IVEst - 1) / sd > 1.96) / B\n", + "print(\"Rejection Frequency is\", rejection_frequency, \"while we expect it to be around 0.05\")\n", + "\n", + "plt.show()" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "colab": { + "provenance": [] + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.5" + } + }, + "nbformat": 4, + "nbformat_minor": 1 }