
Commit

update
sebffischer committed Jan 31, 2025
1 parent 1216076 commit fccb6d9
Showing 14 changed files with 392 additions and 417 deletions.
@@ -2,7 +2,7 @@
"hash": "6afa555cbebd20dcf7d2176d38060953",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: \"Training Efficiency\"\nsolutions: true\n---\n\n---\ntitle: \"Training Efficiency\"\n---\n\n\n\n\n\n\n**Question 1:** Validation\n\nIn this exercise, we will once again train a simple multi-layer perceptron on the *Indian Liver Patient Dataset* (ILPD). Create a learner that:\n\n1. Uses 2 hidden layers with 100 neurons each.\n2. Utilizes a batch size of 128.\n3. Trains for 200 epochs.\n4. Employs a validation set comprising 30% of the data.\n5. Tracks the training and validation log-loss during training.\n6. Utilizes trace-jitting to speed up the training process.\n7. Employs the history callback to record the training and validation log-loss during training.\n\nAfterward, plot the validation log-loss, which is accessible via `learner$model$callbacks$history`.\n\nBelow, we create the task and remove the `gender` feature for simplicity.\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nlibrary(mlr3verse)\n```\n\n::: {.cell-output .cell-output-stderr}\n\n```\nLoading required package: mlr3\n```\n\n\n:::\n\n```{.r .cell-code}\nlibrary(mlr3torch)\n```\n\n::: {.cell-output .cell-output-stderr}\n\n```\nLoading required package: mlr3pipelines\n```\n\n\n:::\n\n::: {.cell-output .cell-output-stderr}\n\n```\nLoading required package: torch\n```\n\n\n:::\n\n```{.r .cell-code}\nilpd_num <- tsk(\"ilpd\")\nilpd_num$select(setdiff(ilpd_num$feature_names, \"gender\"))\nilpd_num\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n<TaskClassif:ilpd> (583 x 10): Indian Liver Patient Data\n* Target: diseased\n* Properties: twoclass\n* Features (9):\n - dbl (5): albumin, albumin_globulin_ratio, direct_bilirubin, total_bilirubin, total_protein\n - int (4): age, alanine_transaminase, alkaline_phosphatase, aspartate_transaminase\n```\n\n\n:::\n:::\n\n\n<details>\n<summary>Hint</summary>\n* To specify the validation set, use the `validate` field, which can either be set during construction or by calling `$configure()`.\n* Trace-jitting can be enabled via the `jit_trace` parameter.\n* The history callback can be constructed via `t_clbk(\"history\")` and needs to be passed during the *construction* of the learner.\n* The validation and measures can be specified via `measures_valid` and take a measure object that is constructed via `msr()`.\n</details>\n\n::: {.content-visible when-meta=solutions}\n**Solution**\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nlibrary(ggplot2)\n\nmlp <- lrn(\"classif.mlp\",\n neurons = c(100, 100),\n batch_size = 128,\n epochs = 200,\n predict_type = \"prob\",\n validate = 0.3,\n jit_trace = TRUE,\n callbacks = t_clbk(\"history\"),\n measures_valid = msr(\"classif.logloss\")\n)\n\nmlp$train(ilpd_num)\nhead(mlp$model$callbacks$history)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n epoch valid.classif.logloss\n <num> <num>\n1: 1 3.373034\n2: 2 5.475234\n3: 3 4.667771\n4: 4 3.047842\n5: 5 1.563049\n6: 6 0.958690\n```\n\n\n:::\n\n```{.r .cell-code}\nggplot(mlp$model$callbacks$history) +\n geom_line(aes(x = epoch, y = valid.classif.logloss)) +\n labs(\n y = \"Log-Loss (Validation)\",\n x = \"Epoch\"\n ) +\n theme_minimal()\n```\n\n::: {.cell-output-display}\n![](6-training-efficiency-exercise-solution_files/figure-html/unnamed-chunk-3-1.png){fig-align='center' width=672}\n:::\n:::\n\n:::\n\n**Question 2:** Early Stopping\nEnable early stopping to prevent overfitting and re-train the learner (using a patience of 10). Print the final validation performance of the learner and the early stopped results. 
You can consult the [documentation of `LearnerTorch`](https://mlr3torch.mlr-org.com/reference/mlr_learners_torch.html) on how to access these two results (section *Active Bindings*).\n\n<details>\n<summary>Hint</summary>\nYou can enable early stopping by setting the `patience` parameter.\n</details>\n\n::: {.content-visible when-meta=solutions}\n**Solution**\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nmlp$configure(\n patience = 10\n)\nmlp$train(ilpd_num)\nmlp$internal_tuned_values\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n$epochs\n[1] 24\n```\n\n\n:::\n\n```{.r .cell-code}\nmlp$internal_valid_scores\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n$classif.logloss\n[1] 0.5598296\n```\n\n\n:::\n:::\n\n:::\n\n**Question 3:** Early Stopping and Dropout Tuning\n\nWhile early stopping in itself is already useful, `mlr3torch` also allows you to simultaneously tune the number of epochs using early stopping while tuning other hyperparameters via traditional hyperparameter tuning from `mlr3tuning`.\n\nOne thing we have not covered so far is that the MLP learner we have used so far also uses a dropout layer. The dropout probability can be configured via the `p` parameter.\n\nYour task is to tune the dropout probability `p` in the range $[0, 1]$ and the epochs using early stopping (using the configuration from the previous exercise).\n\nTo adapt this to work with early stopping, you need to set the:\n\n1. `epochs` to `to_tune(upper = <value>, internal = TRUE)`: This tells the `Tuner` that the learner will tune the number of epochs itself.\n2. `$validate` field of the `\"test\"` so the same data is used for tuning and validation.\n3. Tuning `measure` to `msr(\"internal_valid_score\", minimize = TRUE)`. We set `minimize` to `TRUE` because we have used the log-loss as a validation measure.\n\nApart from this, the tuning works just like in tutorial 5. Use 3-fold cross-validation for the tuning and evaluate 10 configurations.\n\nRun the tuning and print the optimal configuration.\n\n::: {.content-visible when-meta=solutions}\n**Solution**\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nlibrary(mlr3torch)\n\nmlp$configure(\n epochs = to_tune(upper = 100, internal = TRUE),\n p = to_tune(lower = 0, upper = 1),\n validate = \"test\"\n)\n\ntuner <- tnr(\"random_search\")\nresampling <- rsmp(\"cv\", folds = 3)\nmeasure <- msr(\"internal_valid_score\", minimize = TRUE)\n\nti <- tune(\n tuner = tuner,\n task = ilpd_num,\n learner = mlp,\n resampling = resampling,\n measure = measure,\n term_evals = 10\n)\n\nti$learner_result_param_vals\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\nNULL\n```\n\n\n:::\n:::\n\n:::\n\n",
"markdown": "---\ntitle: \"Training Efficiency\"\nsolutions: true\n---\n\n---\ntitle: \"Training Efficiency\"\n---\n\n\n\n\n\n\n**Question 1:** Validation\n\nIn this exercise, we will once again train a simple multi-layer perceptron on the *Indian Liver Patient Dataset* (ILPD). Create a learner that:\n\n1. Uses 2 hidden layers with 100 neurons each.\n2. Utilizes a batch size of 128.\n3. Trains for 200 epochs.\n4. Employs a validation set comprising 30% of the data.\n5. Track the validation log-loss.\n6. Utilizes trace-jitting to speed up the training process.\n7. Employs the history callback to record the training and validation log-loss during training.\n\nAfterward, plot the validation log-loss, which is accessible via `learner$model$callbacks$history`.\n\nBelow, we create the task and remove the `gender` feature again for simplicity.\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nlibrary(mlr3verse)\nlibrary(mlr3torch)\nilpd_num <- tsk(\"ilpd\")\nilpd_num$select(setdiff(ilpd_num$feature_names, \"gender\"))\nilpd_num\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n<TaskClassif:ilpd> (583 x 10): Indian Liver Patient Data\n* Target: diseased\n* Properties: twoclass\n* Features (9):\n - dbl (5): albumin, albumin_globulin_ratio, direct_bilirubin, total_bilirubin, total_protein\n - int (4): age, alanine_transaminase, alkaline_phosphatase, aspartate_transaminase\n```\n\n\n:::\n:::\n\n\n::: {.content-visible when-meta=solutions}\n**Solution**\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nlibrary(ggplot2)\n\nmlp <- lrn(\"classif.mlp\",\n neurons = c(100, 100),\n batch_size = 128,\n epochs = 200,\n predict_type = \"prob\",\n validate = 0.3,\n jit_trace = TRUE,\n callbacks = t_clbk(\"history\"),\n measures_valid = msr(\"classif.logloss\")\n)\n\nmlp$train(ilpd_num)\nhead(mlp$model$callbacks$history)\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n epoch valid.classif.logloss\n <num> <num>\n1: 1 3.373034\n2: 2 5.475234\n3: 3 4.667771\n4: 4 3.047842\n5: 5 1.563049\n6: 6 0.958690\n```\n\n\n:::\n\n```{.r .cell-code}\nggplot(mlp$model$callbacks$history) +\n geom_line(aes(x = epoch, y = valid.classif.logloss)) +\n labs(\n y = \"Log-Loss (Validation)\",\n x = \"Epoch\"\n ) +\n theme_minimal()\n```\n\n::: {.cell-output-display}\n![](6-training-efficiency-exercise-solution_files/figure-html/unnamed-chunk-3-1.png){fig-align='center' width=672}\n:::\n:::\n\n:::\n\n**Question 2:** Early Stopping\n\nEnable early stopping to prevent overfitting and re-train the learner (using a patience of 10). Print the final validation performance of the learner and the early stopped results. 
You can consult the [documentation of `LearnerTorch`](https://mlr3torch.mlr-org.com/reference/mlr_learners_torch.html) on how to access these (see section *Active Bindings*).\n\n<details>\n<summary>Hint</summary>\nYou can enable early stopping by setting the `patience` parameter.\n</details>\n\n::: {.content-visible when-meta=solutions}\n**Solution**\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nmlp$configure(\n patience = 10\n)\nmlp$train(ilpd_num)\nmlp$internal_tuned_values\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n$epochs\n[1] 24\n```\n\n\n:::\n\n```{.r .cell-code}\nmlp$internal_valid_scores\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n$classif.logloss\n[1] 0.5598296\n```\n\n\n:::\n:::\n\n:::\n\n**Question 3:** Early Stopping and Dropout Tuning\n\nWhile early stopping in itself is already useful, `mlr3torch` also allows you to simultaneously tune the number of epochs using early stopping while tuning other hyperparameters via traditional hyperparameter tuning from `mlr3tuning`.\n\nOne thing we have not mentioned so far is that the MLP learner also uses a dropout layer.\nThe dropout probability can be configured via the `p` parameter.\n\nYour task is to tune the dropout probability `p` in the range $[0, 1]$ and the epochs using early stopping (using the configuration from the previous exercise) with an upper bound of 100 epochs.\n\nTo adapt this to work with early stopping, you need to set the:\n\n1. `epochs` to `to_tune(upper = <value>, internal = TRUE)`: This tells the `Tuner` that the learner will tune the number of epochs itself.\n2. `$validate` field of the `\"test\"` so the same data is used for tuning and validation.\n3. Tuning `measure` to `msr(\"internal_valid_score\", minimize = TRUE)`. We set `minimize` to `TRUE` because we have used the log-loss as a validation measure.\n\nApart from this, the tuning works just like in tutorial 5. Use 3-fold cross-validation and evaluate 10 configurations using random search.\nFinally, print the optimal configuration.\n\n::: {.content-visible when-meta=solutions}\n**Solution**\n\n\n::: {.cell layout-align=\"center\"}\n\n```{.r .cell-code}\nlibrary(mlr3torch)\n\nmlp$configure(\n epochs = to_tune(upper = 100, internal = TRUE),\n p = to_tune(lower = 0, upper = 1),\n validate = \"test\"\n)\n\ntuner <- tnr(\"random_search\")\nresampling <- rsmp(\"cv\", folds = 3)\nmeasure <- msr(\"internal_valid_score\", minimize = TRUE)\n\nti <- tune(\n tuner = tuner,\n task = ilpd_num,\n learner = mlp,\n resampling = resampling,\n measure = measure,\n term_evals = 10\n)\n\nti$result_learner_param_vals\n```\n\n::: {.cell-output .cell-output-stdout}\n\n```\n$epochs\n[1] 53\n\n$device\n[1] \"auto\"\n\n$num_threads\n[1] 1\n\n$num_interop_threads\n[1] 1\n\n$seed\n[1] \"random\"\n\n$jit_trace\n[1] TRUE\n\n$eval_freq\n[1] 1\n\n$measures_train\nlist()\n\n$measures_valid\nlist()\n\n$patience\n[1] 0\n\n$min_delta\n[1] 0\n\n$batch_size\n[1] 128\n\n$neurons\n[1] 100 100\n\n$p\n[1] 0.3738756\n\n$activation\n<nn_relu> object generator\n Inherits from: <inherit>\n Public:\n .classes: nn_relu nn_module\n initialize: function (inplace = FALSE) \n forward: function (input) \n clone: function (deep = FALSE, ..., replace_values = TRUE) \n Private:\n .__clone_r6__: function (deep = FALSE) \n Parent env: <environment: 0x12f15a7b8>\n Locked objects: FALSE\n Locked class: FALSE\n Portable: TRUE\n\n$activation_args\nlist()\n```\n\n\n:::\n:::\n\n:::\n\n",
"supporting": [
"6-training-efficiency-exercise-solution_files"
],
