GetNetworkStats improvement: timeout & caching #1327

Merged

Conversation

bchamagne
Contributor

Description

The GetNetworkStats message is sent by many nodes at the same time, and the computation it triggers is heavy. I added a JobCache so the computation is done only once (the cache is reset after 30s).
I also set a 5s timeout, which is longer than the 3s default.
Fixes #1325
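
For readers unfamiliar with the pattern, here is a minimal sketch of "compute once, share the result, reset after 30s", written in Python rather than the project's Elixir. It is not the project's JobCache: the class name `JobCacheSketch`, its parameters, and the choice to count the 30s from the moment the job starts are all assumptions made for illustration.

```python
import threading
import time
from concurrent.futures import Future


class JobCacheSketch:
    """Hypothetical stand-in: runs `compute` at most once per TTL window."""

    def __init__(self, compute, ttl_seconds=30.0, clock=time.monotonic):
        self._compute = compute
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so the TTL can be tested without sleeping
        self._lock = threading.Lock()
        self._future = None
        self._started_at = 0.0

    def get(self, timeout=5.0):
        """Return the shared result, waiting at most `timeout` seconds."""
        with self._lock:
            now = self._clock()
            # Assumption: the TTL is measured from the start of the job.
            if self._future is None or now - self._started_at >= self._ttl:
                self._future = Future()
                self._started_at = now
                worker = threading.Thread(
                    target=self._run, args=(self._future,), daemon=True
                )
                worker.start()
            future = self._future
        # Every concurrent caller waits on the same future, outside the lock.
        return future.result(timeout=timeout)

    def _run(self, future):
        try:
            future.set_result(self._compute())
        except Exception as exc:
            # Note: a failure is shared with all waiters until the TTL expires.
            future.set_exception(exc)
```

With this shape, a burst of simultaneous requests triggers a single computation, and a caller that waits longer than 5s gets a timeout error instead of blocking indefinitely.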

Type of change

  • Bug fix (non-breaking change which fixes an issue)

How Has This Been Tested?

This can only be tested under stress conditions. I'll set up a test environment tomorrow.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes
  • Any dependent changes have been merged and published in downstream modules

@bchamagne bchamagne added bug Something isn't working beacon chain Involve BeaconChain enhancements labels Nov 16, 2023
@bchamagne bchamagne added this to the 1.5.0 milestone Nov 30, 2023
Member

@Neylix Neylix left a comment

I think there is a better way to handle this heavy calculation.
The StatsCollector module already handles one part of the network stats calculation by requesting the stats for all subsets only once.
We could reuse this module to also calculate the network patch only once for all the GetNetworkStats messages a node receives.

Also, these calculations/requests are done only by the beacon summary node at summary time (currently 00:00) and are useful before the self-repair time (currently 00:05).

So we could use the StatsCollector module as an orchestrator that starts the two tasks (request/calculation) at summary time (using a PubSub event) and delegates the calculation to the JobCache module. It would then keep track of every client that requests the information and return the result from the JobCache. When the self-repair time is triggered (using a PubSub event), the module can clear the JobCache, as the result will no longer be relevant.
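
The lifecycle proposed above can be sketched roughly as follows, in Python rather than Elixir. This is only an illustration of the idea, not the StatsCollector implementation: the class `StatsOrchestratorSketch`, its method names, and the two callables are hypothetical. The sketch also assumes that the two tasks are independent of each other and that a request arriving before the summary event starts the jobs lazily.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class StatsOrchestratorSketch:
    """Hypothetical orchestrator: start jobs at summary time, clear at self-repair."""

    def __init__(self, fetch_stats, compute_patch):
        # Both callables are placeholders for the request and calculation tasks.
        self._tasks = {"stats": fetch_stats, "patch": compute_patch}
        self._executor = ThreadPoolExecutor(max_workers=2)
        self._lock = threading.Lock()
        self._jobs = {}  # task name -> Future shared by every requester

    def _start_if_needed(self):
        # Must be called with the lock held.
        if not self._jobs:
            self._jobs = {
                name: self._executor.submit(task)
                for name, task in self._tasks.items()
            }

    def on_summary_time(self):
        """Handler for the summary-time event: start both tasks once."""
        with self._lock:
            self._start_if_needed()

    def request(self, name, timeout=5.0):
        """Called for each incoming message; all callers share one job."""
        with self._lock:
            self._start_if_needed()  # assumption: lazy start if the event was missed
            job = self._jobs[name]
        # Waiting on the future plays the role of "storing the clients".
        return job.result(timeout=timeout)

    def on_self_repair(self):
        """Handler for the self-repair event: results are no longer relevant."""
        with self._lock:
            self._jobs = {}
```

Here the futures themselves hold the list of waiting callers, so no separate client registry is needed in the sketch; an Elixir version would more likely keep the callers' references and reply to them explicitly.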

lib/archethic/beacon_chain/network_coordinates.ex (review thread outdated and resolved)
@bchamagne bchamagne modified the milestones: 1.5.0, 1.4.3 Dec 22, 2023
bchamagne added 2 commits December 26, 2023 16:49
- it now uses a registry
- it might start the process if it is not already started
@bchamagne bchamagne merged commit d701cbd into archethic-foundation:develop Dec 26, 2023
1 check passed
Labels
beacon chain Involve BeaconChain bug Something isn't working enhancements
Development

Successfully merging this pull request may close these issues.

Network state retrieval falls in timeout
2 participants