working on troubleshooting tutorial
tclose committed Feb 3, 2025
1 parent 30d0a7c commit 54dc092
Showing 1 changed file with 49 additions and 31 deletions.
80 changes: 49 additions & 31 deletions new-docs/source/tutorial/3-troubleshooting.ipynb
@@ -10,6 +10,24 @@
"avoid common pitfalls."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
"## Things to check if Pydra gets stuck\n",
"\n",
"There are a number of common gotchas, related to running multi-process code, that can\n",
"cause Pydra workflows to get stuck and not execute correctly. If using the concurrent\n",
"futures worker (e.g. `worker=\"cf\"`), check these issues first before filing a bug report\n",
"or reaching out for help.\n",
"\n",
"### Applying `nest_asyncio` when running within a notebook\n",
"\n",
"When using the concurrent futures worker within a Jupyter notebook, you need to apply\n",
"`nest_asyncio` with the following lines:"
]
},
{
"cell_type": "code",
"execution_count": 3,
@@ -25,21 +43,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Enclosing multi-process code within `if __name__ == \"__main__\"`\n",
"\n",
"If running a script that executes a workflow with the concurrent futures worker\n",
"(i.e. `worker=\"cf\"`) on macOS or Windows, the submission/execution call needs to\n",
"be enclosed within an `if __name__ == \"__main__\"` block, e.g."
]
},
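The need for the `__main__` guard comes from Python's "spawn" start method rather than from Pydra itself, so it can be illustrated with a minimal standalone sketch using plain `multiprocessing` (no Pydra involved):

```python
import multiprocessing as mp

def square(x):
    return x * x

if __name__ == "__main__":
    # On macOS and Windows the default "spawn" start method re-imports this
    # script in each child process; without the guard, the Pool creation
    # below would run again in every child and the script would fail.
    with mp.Pool(2) as pool:
        print(pool.map(square, [1, 2, 3]))  # → [1, 4, 9]
```

The same reasoning applies to a Pydra submission call: anything that spawns worker processes must only run when the script is executed as the main module.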
{
@@ -63,7 +71,6 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"### Remove stray lockfiles\n",
"\n",
"During the execution of a task, a lockfile is generated to signify that a task is running.\n",
@@ -77,14 +84,27 @@
"If the `clean_stale_locks` flag is set (by default when using the *debug* worker), locks that\n",
"were created before the outer task was submitted are removed before the task is run.\n",
"However, since these locks could be created by separate submission processes, `clean_stale_locks`\n",
"is not switched on by default when using production workers (e.g. `cf`, `slurm`, etc.)."
]
},
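If a killed run has left a stray lockfile behind, it can be deleted by hand. The sketch below only simulates the situation in a throwaway directory; the lockfile name and cache layout used here are assumptions for illustration, so inspect your actual Pydra cache directory before deleting anything:

```python
import tempfile
from pathlib import Path

# Simulate a cache tree in a throwaway directory (names are illustrative;
# check your real cache directory for the actual lockfile names).
cache_root = Path(tempfile.mkdtemp())
task_dir = cache_root / "Task-12345"
task_dir.mkdir()
(task_dir / "_lock").touch()  # stand-in for a stray lockfile

# Find and remove every leftover lockfile under the cache root
removed = sorted(p.name for p in cache_root.glob("**/*lock*"))
for p in cache_root.glob("**/*lock*"):
    p.unlink()
print(removed)  # → ['_lock']
```

Only do this when you are sure no other submission process is still running, since a live lockfile means a task is genuinely in progress.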
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Finding errors\n",
"\n",
"### Running in *debug* mode\n",
"\n",
"By default, Pydra will run with the *debug* worker, which executes each task serially\n",
"within a single process without use of `async/await` blocks, to allow raised exceptions\n",
"to propagate gracefully to the calling code. If you are having trouble with a pipeline,\n",
"ensure that the *debug* worker (`worker=\"debug\"`, the default) is being used.\n",
"\n",
"### Reading error files\n",
"\n",
"When a task raises an error, it is captured and saved in a pickle file named `_error.pklz`\n",
"within the task's cache directory. For example, when calling the toy `UnsafeDivisionWorkflow`\n",
"with `denominator=0`, the task will fail."
]
},
{
@@ -93,18 +113,11 @@
"metadata": {},
"outputs": [],
"source": [
"from pydra.tasks.testing import UnsafeDivisionWorkflow\n",
"from pydra.engine.submitter import Submitter\n",
"import nest_asyncio\n",
"\n",
"# This is needed to run parallel workflows in Jupyter notebooks\n",
"nest_asyncio.apply()\n",
"\n",
"# This workflow will fail because we are trying to divide by 0\n",
"wf = UnsafeDivisionWorkflow(a=10, b=5).split(denominator=[3, 2, 0])\n",
"\n",
"with Submitter(worker=\"cf\") as sub:\n",
"    result = sub(wf)\n",
" \n",
"if result.errored:\n",
" print(\"Workflow failed with errors:\\n\" + str(result.errors))\n",
@@ -122,7 +135,12 @@
"the novel nature of scientific experiments and the known artefacts that can occur.\n",
"Therefore, it is always worth sanity-checking the results produced by workflows. When a\n",
"problem occurs in a multi-stage workflow it can be difficult to identify at which stage\n",
"the issue occurred.\n",
"\n",
"Currently in Pydra you need to step backwards through the tasks of the workflow, load\n",
"the saved task object and inspect its inputs to find the preceding nodes. If any of the\n",
"inputs that were generated by previous nodes are not ok, then you should check the\n",
"tasks that generated them in turn."
]
},
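To inspect what was captured in an `_error.pklz` file without re-running anything, you can unpickle it directly. The snippet below fabricates such a file so it is self-contained and runnable anywhere; the real files are written by Pydra, and the exact serialization (e.g. cloudpickle, compression) is an assumption here, so treat this as a sketch rather than the definitive loading recipe:

```python
import pickle
import tempfile
from pathlib import Path

# Fabricate an "_error.pklz"-style file so the example is self-contained;
# real files are produced by Pydra inside the task's cache directory.
cache_dir = Path(tempfile.mkdtemp())
error_file = cache_dir / "_error.pklz"
error_file.write_bytes(pickle.dumps({"error message": ["division by zero"]}))

# Load the file and inspect the captured error message
error = pickle.loads(error_file.read_bytes())
print(error["error message"])  # → ['division by zero']
```

The same pattern applies when walking backwards through a failed workflow: load the pickled objects from each task's cache directory and inspect their contents to find where bad values first appeared.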
{
