Current behaviour

Consider the reana-demo-dask-coffea analysis example with several serial steps: besides the Dask analysis step, there is another step called "sleep" that does not do anything (a sketch of such a workflow specification is given below).
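For illustration only, here is a minimal sketch of what such a serial workflow specification could look like; the actual reana.yaml of the example differs, and the image names, commands, and the resources.dask syntax below are assumptions rather than values taken from the example:

version: 0.9.3
inputs:
  files:
    - analysis.py
workflow:
  type: serial
  resources:
    dask:
      # assumed syntax for requesting a Dask cluster alongside the jobs
      image: docker.io/coffeateam/coffea-dask-almalinux8:2024.10.0-py3.12
  specification:
    steps:
      - name: process
        environment: docker.io/coffeateam/coffea-dask-almalinux8:2024.10.0-py3.12
        commands:
          - python analysis.py
      - name: sleep
        environment: docker.io/library/ubuntu:24.04
        commands:
          - sleep 60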
However, when a user asks for the job logs of this "sleep" step, one gets the previous Dask ones:

$ rcg logs -w dask-coffea-serial-kubernetes --filter step=sleep | tail -10
2025-01-27 13:08:50,599 - distributed._signals - INFO - Received signal SIGTERM (15)
2025-01-27 13:08:50,599 - distributed.nanny - INFO - Closing Nanny at 'tcp://10.244.0.194:35675'. Reason: nanny-close
2025-01-27 13:08:50,599 - distributed.nanny - INFO - Nanny asking worker to close. Reason: nanny-close
2025-01-27 13:08:50,599 - distributed._signals - INFO - Received signal SIGTERM (15)
2025-01-27 13:08:50,599 - distributed.nanny - INFO - Closing Nanny at 'tcp://10.244.0.193:38655'. Reason: nanny-close
2025-01-27 13:08:50,599 - distributed.nanny - INFO - Nanny asking worker to close. Reason: nanny-close
2025-01-27 13:08:50,600 - distributed.nanny - INFO - Worker closed
2025-01-27 13:08:50,821 - distributed.dask_worker - INFO - End worker
2025-01-27 13:08:51,341 - distributed.dask_worker - INFO - End worker
This leads to a duplication of Dask log messages:
$ rcg logs -w dask-coffea-serial-kubernetes.2 | grep 'End worker'
2025-01-27 13:08:50,821 - distributed.dask_worker - INFO - End worker
2025-01-27 13:08:51,341 - distributed.dask_worker - INFO - End worker
2025-01-27 13:08:50,821 - distributed.dask_worker - INFO - End worker
2025-01-27 13:08:51,341 - distributed.dask_worker - INFO - End worker
This behaviour is a consequence of how the Dask live logs are collected and exposed by "enriching" individual job logs.
This can be troublesome for very verbose Dask analyses where the workflow contains other non-Dask workflow steps, for example a Snakemake file doing some further pre- or post-processing.
This is fortunately rare, since people relying on Dask would tend to use it for everything, but we have to be ready to tackle this problem.
Expected behaviour
The logs from various workers should not be duplicated.
For example, we could try to avoid duplication by storing the Dask logs in one job only.
However, ideally, since the Dask cluster is a resource that stays alive throughout the workflow execution, its logs should be collected (and exposed) separately from job logs, so that we would have a triad: Workflow logs, Job logs, and Service logs.
Since this would require a larger architectural change, this was postponed to a future refactoring.
This issue serves as an epic reminder to do this in the future.
Notes
Stems from a past pull request review: #616 (comment)