Parallel pyiron table #1050
Conversation
@@ -432,7 +420,7 @@ def _collect_job_update_lst(self, job_status_list, job_stored_ids=None):
                and job.status in job_status_list
                and self.filter_function(job)
            ):
-                job_update_lst.append(job)
+                job_update_lst.append(job_id)
This is the tricky part. To apply the filter_function, the job is already loaded in inspect mode; but since the job object cannot be communicated to the subprocess, it has to be loaded again inside the subprocess.
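The pattern described above can be sketched in plain Python: because the job object is not picklable, only the job id crosses the process boundary, and each worker re-loads the job itself. The `load_job` helper below is hypothetical (a stand-in for pyiron's project loading API), used only to illustrate the pattern.

```python
from concurrent.futures import ProcessPoolExecutor

# Hypothetical helper: stands in for re-loading a job from the database
# inside the subprocess; the real pyiron API differs.
def load_job(job_id):
    return {"id": job_id, "energy": job_id * 0.5}

def analyze(job_id):
    # Only the integer id was communicated to this worker; the job
    # object is loaded again here, inside the subprocess.
    job = load_job(job_id)
    return job["energy"]

if __name__ == "__main__":
    # ids collected (and filtered via filter_function) in the parent process
    job_update_lst = [1, 2, 3]
    with ProcessPoolExecutor() as executor:
        results = list(executor.map(analyze, job_update_lst))
    print(results)  # [0.5, 1.0, 1.5]
```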
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
@ligerzero-ai Does this help your needs for a parallel pyiron table version?
This is great - I've got a separate contribution opening up as a draft in contrib later tonight. That one is completely individualised, outside of the pyiron ecosystem. I am hoping that it will allow users to create dataframes for ML potentials easily (à la TrainingContainer, with a little bit of fiddling). I will ping you there.
# Conflicts:
#	.ci_support/environment.yml
#	setup.py
Based on the performance analysis from @ligerzero-ai in https://github.com/orgs/pyiron/discussions/211#discussioncomment-8034046 and the general implementation of adding support for executors in #1155, I removed the direct dependence on …
On the one hand I like that one can customize the executor. On the other hand, having to pass the instance in is a bit clunky, and it won't work anymore once I submit a table to the queue (or does it somehow?). I think it's ok to leave the option to pass the instance in, but it would be nice if the table just creates a …
The issue here is similar to what we discuss in #1296. Basically, we can have an executor attached to a single job object; in that case the job object is submitted to the executor and executed on one of its workers. But for jobs that contain multiple job objects, like GenericMaster jobs, or for pyiron tables, which have multiple function calls inside them, we want to assign a single executor which is then used to execute the individual tasks within that job.
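The distinction above (one executor per job vs. one executor shared by the tasks inside a container job) can be sketched as follows. `ContainerJob` is a hypothetical stand-in for something like a GenericMaster or a pyiron table, not the actual pyiron class:

```python
from concurrent.futures import ThreadPoolExecutor

# Hedged sketch: a container job holds many small tasks and fans them
# out over ONE shared executor, instead of each task spawning its own.
class ContainerJob:
    def __init__(self, tasks):
        self.tasks = tasks
        self.executor = None  # assigned externally, as discussed above

    def run(self):
        if self.executor is not None:
            # parallel path: all internal tasks share the one executor
            return list(self.executor.map(lambda f: f(), self.tasks))
        return [f() for f in self.tasks]  # serial fallback

job = ContainerJob(tasks=[lambda: 1, lambda: 2, lambda: 3])
job.executor = ThreadPoolExecutor(max_workers=2)
print(job.run())  # [1, 2, 3]
```

A thread pool is used here only so the lambdas need not be picklable; the design question of who creates and owns the executor is the same either way.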
Currently the initialisation of the individual subprocesses is too expensive, so it does not make a lot of sense to focus on a parallel pyiron table implementation. Still, it is maybe a good point to start. Example code:
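The author's example code did not survive the export. As a hedged stand-in (not the original snippet), the timing sketch below shows the effect the comment describes: when the per-row work is cheap, subprocess start-up overhead dominates and the parallel run can be slower than the serial one.

```python
import time
from concurrent.futures import ProcessPoolExecutor

def analyze_row(i):
    # stand-in for a cheap per-job function call in the table
    return i * i

def run_serial(n):
    return [analyze_row(i) for i in range(n)]

def run_parallel(n):
    with ProcessPoolExecutor(max_workers=2) as executor:
        return list(executor.map(analyze_row, range(n)))

if __name__ == "__main__":
    n = 100
    t0 = time.perf_counter()
    serial = run_serial(n)
    t1 = time.perf_counter()
    parallel = run_parallel(n)
    t2 = time.perf_counter()
    assert serial == parallel
    # For cheap tasks the parallel run typically loses: the executor pays
    # the subprocess initialisation cost the comment above refers to.
    print(f"serial {t1 - t0:.4f}s, parallel {t2 - t1:.4f}s")
```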