Make ayon-shotgrid services multi-threaded and asynchronous #140

Open
2 tasks done
robin-ynput opened this issue Oct 9, 2024 · 4 comments
Assignees
Labels
type: enhancement Improvement of existing functionality or minor addition

Comments

@robin-ynput
Collaborator

robin-ynput commented Oct 9, 2024

Is there an existing issue for this?

  • I have searched the existing issues.

Please describe the feature you have in mind and explain what the current shortcomings are?

Our current services implementation is quite monolithic, with a big while True loop that does everything at once.
This is functional, but in my opinion it is a bit fragile and hard to test.

We could make all of those processes asynchronous by reworking the initial design a bit.

Stability gains:

  • Currently, if an intermediary step crashes, everything goes down and all events in the batch are lost. With this design we could imagine pause and recover features.
  • Currently overall performance is driven by the slowest component. Splitting the work into dedicated threads might also pave the way to parallelization later on.
  • Each business "task" will have a small dedicated scope, which makes it easier to unit test, extend and re-use.

How would you imagine the implementation of the feature?

Quick example for the leecher service (very quick thoughts, it may end up slightly different):

  • split the process that gathers FLOW events into its own thread
  • split the process that translates events for AYON into its own thread
  • introduce a queue between those 2 threads
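A minimal sketch of the three steps above, using only the Python standard library. The callables `fetch_batch`, `translate` and `dispatch` are hypothetical placeholders for the Flow query, the AYON translation and the AYON send; none of them are names from the current services. It uses `queue.Queue`, which is enough when both sides are threads in the same process.

```python
import queue
import threading

_STOP = object()  # sentinel telling the consumer to shut down


def gather_events(fetch_batch, out_queue, stop_event, poll_interval=0.01):
    """Producer: poll the (hypothetical) Flow source and enqueue raw events."""
    while not stop_event.is_set():
        for event in fetch_batch():
            out_queue.put(event)
        # Event.wait() doubles as an interruptible sleep between polls.
        stop_event.wait(poll_interval)
    out_queue.put(_STOP)


def translate_events(in_queue, translate, dispatch):
    """Consumer: translate each raw event and hand the result to dispatch."""
    while True:
        event = in_queue.get()
        if event is _STOP:
            break
        dispatch(translate(event))


def run_leecher(fetch_batch, translate, dispatch, stop_event):
    """Start both threads with a bounded queue between them."""
    events = queue.Queue(maxsize=100)  # bound is an arbitrary assumption
    producer = threading.Thread(
        target=gather_events, args=(fetch_batch, events, stop_event)
    )
    consumer = threading.Thread(
        target=translate_events, args=(events, translate, dispatch)
    )
    producer.start()
    consumer.start()
    return producer, consumer
```

Because the queue is bounded, a slow consumer eventually blocks the producer instead of letting memory grow without limit. Note that events sitting in this in-process queue are lost if the service dies, so persistence would still have to be handled elsewhere.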

image

Are there any labels you wish to add?

  • I have added the relevant labels to the enhancement request.

Describe alternatives you've considered:

No response

Additional context:

No response

@robin-ynput robin-ynput added the type: enhancement Improvement of existing functionality or minor addition label Oct 9, 2024
@robin-ynput robin-ynput self-assigned this Oct 9, 2024
@robin-ynput robin-ynput changed the title Make ayon-shotgrid services multi-threaded and aynchronous Make ayon-shotgrid services multi-threaded and asynchronous Oct 9, 2024
@jakubjezek001
Member

Popping @iLLiCiTiT and @dee-ynput. This would be great to implement but not sure about available resources.

@iLLiCiTiT
Member

iLLiCiTiT commented Dec 4, 2024

I'm not sure I fully understand. If I get it correctly, you want to merge leecher and processor into one service?

If yes, then there might be missing information about why we do it the way we do now. We split it so we have the option to run multiple leecher and processor services at once, and so you can restart or change the version of the services without losing track of progress.

  • Leecher gets shotgrid events and stores them in the AYON database. If multiple leechers try to store the same event, it is stored only the first time, because the events would have the same hash, which is not allowed. On start it is able to find the last leeched event and continue from that moment, so if you restart the service (update to a new version) it looks like nothing really changed.
  • Processor enrolls leeched events. Only one leeched event can be enrolled at a time, and no other processor can enroll another one until the previous one is marked as failed or finished. So we don't process events in a different order than they happened, and that also stores progress about which events were processed and which were not.
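To make the two behaviours above concrete, here is a toy in-memory model. It is a sketch only: `EventStore` and its methods are hypothetical, the real logic lives in the AYON server, and the hash here is simply derived from the ShotGrid event id as an assumption. Failed events are not retried in this model, which is also an assumption.

```python
import hashlib


class EventStore:
    """Toy in-memory stand-in for the AYON event table (hypothetical)."""

    def __init__(self):
        self._events = []     # (hash, sg_event_id, payload) in arrival order
        self._hashes = set()  # hashes that are already stored
        self._status = {}     # hash -> pending / in_progress / finished / failed

    def store(self, sg_event_id, payload):
        """Store a leeched event; return False if its hash already exists."""
        event_hash = hashlib.sha256(str(sg_event_id).encode()).hexdigest()
        if event_hash in self._hashes:
            return False
        self._hashes.add(event_hash)
        self._events.append((event_hash, sg_event_id, payload))
        self._status[event_hash] = "pending"
        return True

    def last_leeched_id(self):
        """Where a restarted leecher would resume from."""
        return self._events[-1][1] if self._events else None

    def enroll(self):
        """Hand out the oldest pending event, unless one is still in progress."""
        if "in_progress" in self._status.values():
            return None
        for event_hash, _sg_event_id, payload in self._events:
            if self._status[event_hash] == "pending":
                self._status[event_hash] = "in_progress"
                return event_hash, payload
        return None

    def finish(self, event_hash, failed=False):
        """Mark an enrolled event as done so the next one can be enrolled."""
        self._status[event_hash] = "failed" if failed else "finished"
```

With this model, two leechers calling `store` with the same event id only insert it once, and two processors calling `enroll` never hold events at the same time, which is what preserves ordering.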

I don't think asyncio would add any benefit, the logic is straight and sequential, but sure, can be used.

@robin-ynput
Collaborator Author

robin-ynput commented Dec 4, 2024

Sorry, I was not clear enough. My proposal is not to merge leecher and processor together, but rather to tweak both of their main loops to use threads exchanging info through a multiprocessing.Queue.

For the leecher:

  • one thread would query SG periodically and gather events to a queue
  • another one would consume the queue, triage the event and send it to AYON if needed

For the processor:

  • one thread would poll events at the sg_polling_frequency and fill up the queue
  • another one would consume that event, triage it and trigger ayon_api accordingly

This logic would mean we implement smaller purpose-scoped functions that can be tested individually, which makes it easier to debug and to prepare for parallelism performance-wise. AYON server performance would not impact the SG autodesk query, and the other way around is also true.

TBH I'm not sure yet if this is convoluted for our use-case or a good first step toward adopting a more event-driven approach in the future.
This is likely a whole different discussion, but I got the feeling we are using AYON DB right now as an event broker.
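As an illustration of what "purpose-scoped and individually testable" could look like, here is a rough sketch. The event types, the `handlers` mapping and the `failed` list are all hypothetical and not taken from the existing processor; `triage` is a pure function that can be tested without any thread or server.

```python
import queue
import threading


def triage(event):
    """Pure function: map an event to a handler name, or None to skip it."""
    if event.get("type") == "entity_created":  # hypothetical event type
        return "create"
    if event.get("type") == "entity_updated":  # hypothetical event type
        return "update"
    return None


def process_queue(events, handlers, failed, stop_event):
    """Consume events until stopped and drained; survive handler errors."""
    while not (stop_event.is_set() and events.empty()):
        try:
            event = events.get(timeout=0.05)
        except queue.Empty:
            continue
        action = triage(event)
        if action is None:
            continue
        try:
            handlers[action](event)
        except Exception:
            # One bad event does not take the whole loop down;
            # it is kept aside so it could be retried later.
            failed.append(event)
```

A unit test can then call `triage({"type": "entity_created"})` directly, and the worker loop can be exercised with fake handlers and a pre-filled `queue.Queue`.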

@iLLiCiTiT
Member

iLLiCiTiT commented Dec 5, 2024

> one thread would query SG periodically and gather events to a queue
> another one would consume the queue, triage the event and send it to AYON if needed

We don't have the filtering part, but even if we had it, I don't think that has to happen in a queue.

> one thread would poll events at the sg_polling_frequency and fill up the queue
> another one would consume that event, triage it and trigger ayon_api accordingly

That's what enroll does. It pulls the latest unprocessed event and creates a dependent event (a job), which prevents other processors (if there are any) from processing other events. The queue happens server side automatically, so if you have 2 processors, they don't process the same events multiple times and don't process them in the wrong order.

> but I got the feeling we are using AYON DB right now as an event broker.

We are, so that we can run multiple processor services at a time without breaking anything, and freely restart/update them without losing track of progress.

(Explaining why it is as it is now.)


3 participants