
Ensure that Task SDK supervisor closes all its subprocess handles correctly. #44263

Merged 1 commit on Nov 22, 2024

ashb (Member) commented Nov 21, 2024

If we keep any copies of the handles open, the selector loop in the monitor
process gets stuck waiting to read from a socket that never closes
(because it's still open in the same process!).
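
To illustrate the failure mode, here is a minimal standalone sketch (not the supervisor's code; `read_ready`, `reader`, `writer` and `leaked` are names invented for this example). It assumes a plain `socket.socketpair()` and uses `dup()` to stand in for a forgotten copy of the child's end: the reader only sees EOF once every copy of the write end is closed.

```python
import socket
from typing import Optional


def read_ready(sock: socket.socket, timeout: float = 0.2) -> Optional[bytes]:
    """Return received data, b"" on EOF, or None if nothing arrives in time."""
    sock.settimeout(timeout)
    try:
        return sock.recv(1024)
    except socket.timeout:
        return None


# One end is read by the monitoring side; the other would go to the child.
reader, writer = socket.socketpair()

# Simulate a forgotten copy of the write end living in the same process.
leaked = writer.dup()
writer.close()

# No EOF yet: the leaked copy keeps the connection open, so a read just waits.
still_open = read_ready(reader)

# Only once every copy is closed does the reader see EOF (an empty read).
leaked.close()
after_close = read_ready(reader)

reader.close()
```

In this sketch the first read times out while the second returns an empty bytes object, which is the signal a read loop would typically use to stop watching that socket.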

Sadly I wasn't able to reproduce this behaviour in unit tests; it only shows up
when running this code for real. (The main fix here is to pass `child_stderr` to
`_close_unused_sockets` too; the rest are drive-by tidy-ups.)

Since we never access `proc.stdout` or `stderr` on the class (they are only
closed over in the callbacks), I removed those properties: they aren't needed,
and accessing them directly would lead to garbled output.
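
As a rough sketch of the "closed over in the callbacks" pattern (an assumption-laden illustration, not the actual supervisor implementation; `make_line_forwarder` and `drain` are hypothetical names), each socket is referenced only by the callback registered with the selector, so nothing else can read from it and steal bytes mid-line:

```python
import selectors
import socket


def make_line_forwarder(sock: socket.socket, sink: list):
    """Build a callback that closes over sock; no attribute exposes the socket."""
    buffer = bytearray()

    def on_readable() -> bool:
        chunk = sock.recv(4096)
        if not chunk:
            # EOF: flush any trailing partial line and ask to be unregistered.
            if buffer:
                sink.append(bytes(buffer))
            return False
        buffer.extend(chunk)
        # Emit every complete line currently held in the buffer.
        while (idx := buffer.find(b"\n")) != -1:
            sink.append(bytes(buffer[:idx]))
            del buffer[: idx + 1]
        return True

    return on_readable


def drain(sel: selectors.BaseSelector) -> None:
    """Run callbacks until every registered socket has reached EOF."""
    while sel.get_map():
        for key, _ in sel.select(timeout=1.0):
            if not key.data():
                sel.unregister(key.fileobj)
                key.fileobj.close()


out_r, out_w = socket.socketpair()
err_r, err_w = socket.socketpair()
stdout_lines, stderr_lines = [], []

sel = selectors.DefaultSelector()
sel.register(out_r, selectors.EVENT_READ, make_line_forwarder(out_r, stdout_lines))
sel.register(err_r, selectors.EVENT_READ, make_line_forwarder(err_r, stderr_lines))

out_w.sendall(b"hello\nworld\n")
err_w.sendall(b"warning\n")
# Close every write end, otherwise drain() would never see EOF.
out_w.close()
err_w.close()

drain(sel)
sel.close()
```

Note that `drain` only terminates because both write ends are closed before it runs; leaving either one open reproduces the hang described above.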

Also add tests for (and re-fix) the last-chance exception handling.



ashb added the area:task-execution-interface-aip72 (AIP-72: Task Execution Interface (TEI) aka Task SDK) label Nov 21, 2024
ashb force-pushed the fix-tasksdk-subprocess-closing branch from 8ceb4dd to 3cca463 November 21, 2024 18:34
ashb changed the title from "Ensure that Task SDK supervisor closes all its handles correctly." to "Ensure that Task SDK supervisor closes all its subprocess handles correctly." Nov 21, 2024
ashb force-pushed the fix-tasksdk-subprocess-closing branch from e07dd99 to 2b14e69 November 22, 2024 09:36
ashb merged commit 6c3b6b1 into main Nov 22, 2024
42 checks passed
ashb deleted the fix-tasksdk-subprocess-closing branch November 22, 2024 10:15
amoghrajesh (Contributor) left a comment

Nice!

Labels
area:task-execution-interface-aip72 AIP-72: Task Execution Interface (TEI) aka Task SDK area:task-sdk