Merge pull request #10 from spestana/main
Updates to xgboost tutorial
spestana authored Aug 15, 2024
2 parents 75c97b0 + cd187d7 commit c95dd8a
Showing 4 changed files with 6 additions and 4 deletions.
@@ -115,7 +115,7 @@
}
},
"source": [
"## 4. Data Preprocessing\n",
"### 4. Data Preprocessing\n",
"#### 4.1. Overview of the USGS Stream Station\n",
"- The dataset that we will use provides the data for seven GSL watershed stations. \n",
"- The dataset contains climate variables, such as precipitation and temperature, water infrastructure, storage percentage, and watershed characteristics, such as average area and elevation. \n",
@@ -486,7 +486,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"## 5. Model Development \n",
"### 5. Model Development \n",
"#### 5.1. Defining the XGBoost Model \n",
"As mentioned, we will use XGBoost in our tutorial, and we will use the [dmlc XGBoost package](https://xgboost.readthedocs.io/en/stable/). Understanding and tuning the model parameters is critical in any ML model development since it will affect the final model performance. The XGBoost model has different parameters, and here, we will work on the three most important parameters of XGBoost:\n",
" \n",
@@ -609,7 +609,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"#### !!!! Don't forget to train and save your model after tuning the hyperparameters as a Pickle file.\n"
"***!!!! Don't forget to train and save your model after tuning the hyperparameters as a Pickle file.***\n"
]
},
{
@@ -72,6 +72,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 5. Model Development Continued\n",
"#### 5.2. Scaling the Data\n",
"Generally, scaling the inputs is not required in decision-tree ensemble models. However, some studies suggest scaling the inputs since XGBoost uses the Gradient Decent algorithm in its core optimization. So here we will try both \n",
"scaled and unscaled inputs to see the difference.\n",
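A minimal sketch of the scaled variant, assuming scikit-learn's `StandardScaler` (the tutorial's actual scaler may differ): the scaler is fit on the training data only and then reused for the test set.

```python
# Sketch: standardizing inputs before feeding them to XGBoost.
# Assumes scikit-learn's StandardScaler; data is a placeholder.
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_train = rng.normal(loc=10.0, scale=3.0, size=(100, 4))  # placeholder features

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit on training data only
# X_test_scaled = scaler.transform(X_test)      # reuse the same scaler for the test set
```

After scaling, each input column has zero mean and unit variance, which makes the scaled and unscaled runs directly comparable.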
@@ -98,6 +98,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### 5. Model Development Continued\n",
"#### 5.5. Testing the Model\n",
"We will give the model the test set for each station and compare it with the observation to evaluate the model with a dataset it has not seen before. Before feeding the test data we load the model. "
]
2 changes: 1 addition & 1 deletion book/tutorials/index.md
@@ -6,4 +6,4 @@ Below you'll find a table keeping track of all tutorials presented at this event

| Tutorial | Topics | Datasets | Recording Link |
| - | - | - | - |
-| [Machine Learning for Post-Processing NWM Data](./decision_trees/01.script/01.tutorial_post_processing_xgboost_tuning.ipynb) | Decision trees and XGBoost | n/a | Not recorded |
+| [Machine Learning for Post-Processing NWM Data](./decision_trees/01.script/00.tutorial_post_processing_xgboost_intro.md) | Decision trees and XGBoost | n/a | Not recorded |
