Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: process pool's fork safety issue since macOS High Sierra #411

Conversation

tianjianjiang
Copy link
Contributor

Long story short: if not replacing Pool with ThreadPool, 99% of the time the helicopter view will hang on recent macOS.

Code review comments will provide some details.

@@ -103,11 +105,11 @@ def show_text(t, width=WIDTH):
all_infos = manager.dict()
all_datasets = list(set([t[0] for t in template_collection.keys]))

def get_infos(d_name):
def get_infos(d_name, all_infos):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't mandatory "here" but better be. Normally a function must be defined before Pool init. However, in this case it is almost impossible, because the Pool context is managed by streamlit. Assuming we actually can control the context, then the Pool will encounter AttributeError for the implicit/global variable all_infos.

all_infos[d_name] = get_dataset_infos(d_name)

pool = Pool(processes=len(all_datasets))
pool.map(get_infos, all_datasets)
pool = ThreadPool(processes=len(all_datasets))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A safer choice on recent macOS is to replace fork with spawn. Unfortunately, we can't really change Pool's start method to spawn without breaking streamlit. So the only convenient choice is ThreadPool (which is also good for I/O bound in this case).

@stephenbach
Copy link
Member

Hey @tianjianjiang! Thanks for working on this! Shall we revisit now that the paper submission has passed? If you can resolve conflicts, I can review. Thanks!

@tianjianjiang
Copy link
Contributor Author

Hey @tianjianjiang! Thanks for working on this! Shall we revisit now that the paper submission has passed? If you can resolve conflicts, I can review. Thanks!

@stephenbach You're welcome and sure! I will see what I can do ASAP.

@tianjianjiang tianjianjiang force-pushed the fix-process_fork_safety_since_macos_high_sierra branch 2 times, most recently from 7a8af0d to f649fc7 Compare October 12, 2021 01:39
@tianjianjiang
Copy link
Contributor Author

tianjianjiang commented Oct 12, 2021

Hey @tianjianjiang! Thanks for working on this! Shall we revisit now that the paper submission has passed? If you can resolve conflicts, I can review. Thanks!

@stephenbach You're welcome and sure! I will see what I can do ASAP.

One of obstacles for python>=3.8 that I forgot: #515 (comment) ('NoneType' object has no attribute 'session_id')

@stephenbach
Copy link
Member

Ok, we're downgrading to Python 3.7 in #547. But ok if this doesn't get into the first release. We're pinning tonight (Eastern US time).

@tianjianjiang tianjianjiang force-pushed the fix-process_fork_safety_since_macos_high_sierra branch from f649fc7 to 399428c Compare November 30, 2021 18:54
@tianjianjiang
Copy link
Contributor Author

Ok, we're downgrading to Python 3.7 in #547. But ok if this doesn't get into the first release. We're pinning tonight (Eastern US time).

Not a pressing issue, but a side note: I realize that even when using python 3.7.12, the fork errors sometimes still happen. So I use what this PR does locally.

@tianjianjiang tianjianjiang force-pushed the fix-process_fork_safety_since_macos_high_sierra branch from 399428c to e4e5ca9 Compare December 11, 2021 08:04
@tianjianjiang tianjianjiang force-pushed the fix-process_fork_safety_since_macos_high_sierra branch from e4e5ca9 to a000d38 Compare December 15, 2021 05:44
@tianjianjiang
Copy link
Contributor Author

tianjianjiang commented Apr 2, 2022

Close this in favor of #733.

Please kindly note that #733 still uses multi-process instead of multi-thread, which may still be an issue if we want to upgrade streamlit.

@tianjianjiang tianjianjiang deleted the fix-process_fork_safety_since_macos_high_sierra branch April 2, 2022 22:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

2 participants