
perf(pin_api): fetch chunks in parallel #4428

Merged: @istae merged 7 commits into master from parallel-post-pin on Oct 26, 2023

Conversation

@istae (Member) commented Oct 26, 2023

Checklist

  • I have read the coding guide.
  • My change requires a documentation update, and I have done it.
  • I have added tests to cover my changes.
  • I have filled out the description and linked the related issues.

Description

closes #4427

Open API Spec Version Changes (if applicable)

Motivation and Context (Optional)

Related Issue (Optional)

Screenshots (if appropriate):

@istae requested review from mrekucci and acha-bill on October 26, 2023 at 12:57

```go
sem := semaphore.NewWeighted(100)
var errTraverse error
var mtxErr sync.Mutex
var wg sync.WaitGroup
```
Contributor


You can use "golang.org/x/sync/errgroup" instead and get rid of the mutex and error.
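For illustration, a minimal sketch of what the errgroup-based variant could look like; fetchAll, fetchChunk, addrs, and the package name are hypothetical stand-ins, not the actual bee code:

```go
// Hypothetical sketch (not the bee code): errgroup propagates the first
// error and cancels the shared context, so a separate mutex-guarded
// error variable is no longer needed.
package pinexample

import (
	"context"

	"golang.org/x/sync/errgroup"
)

// fetchAll fetches every address with at most 100 concurrent fetches,
// returning the first error encountered.
func fetchAll(ctx context.Context, addrs []string, fetchChunk func(context.Context, string) error) error {
	g, ctx := errgroup.WithContext(ctx)
	g.SetLimit(100) // plays the role of semaphore.NewWeighted(100)

	for _, addr := range addrs {
		addr := addr // capture loop variable (needed before Go 1.22)
		g.Go(func() error {
			// ctx is cancelled as soon as any goroutine returns an error,
			// so remaining fetches can bail out early.
			if err := ctx.Err(); err != nil {
				return err
			}
			return fetchChunk(ctx, addr)
		})
	}
	return g.Wait() // first non-nil error, if any
}
```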

Member Author


I thought about using that, but I still need access to the internal error inside the goroutine to stop the traversal.
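For context, a rough sketch of the pattern the diff uses, where the traversal callback reads the shared error so the walk itself can stop early; traverse, fetchChunk, and pinAll are illustrative stand-ins for the bee traversal and retrieval calls, not the real API:

```go
// Hypothetical sketch (not the bee code): a mutex-guarded error that the
// traversal callback checks so the walk stops after the first failure,
// a weighted semaphore capping concurrency at 100, and a WaitGroup to
// wait for in-flight fetches.
package pinexample

import (
	"context"
	"sync"

	"golang.org/x/sync/semaphore"
)

func pinAll(ctx context.Context, traverse func(func(addr string) error) error, fetchChunk func(context.Context, string) error) error {
	sem := semaphore.NewWeighted(100)
	var (
		errTraverse error
		mtxErr      sync.Mutex
		wg          sync.WaitGroup
	)

	err := traverse(func(addr string) error {
		// Returning the shared error here is what stops the traversal
		// itself, not just the in-flight fetches.
		mtxErr.Lock()
		ferr := errTraverse
		mtxErr.Unlock()
		if ferr != nil {
			return ferr
		}

		// Acquire blocks once 100 fetches are in flight, bounding memory.
		if err := sem.Acquire(ctx, 1); err != nil {
			return err
		}
		wg.Add(1)
		go func() {
			defer wg.Done()
			defer sem.Release(1)
			if err := fetchChunk(ctx, addr); err != nil {
				mtxErr.Lock()
				errTraverse = err
				mtxErr.Unlock()
			}
		}()
		return nil
	})

	wg.Wait()
	if err != nil {
		return err
	}
	return errTraverse
}
```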

@istae merged commit d324646 into master on Oct 26, 2023
10 of 11 checks passed
@istae deleted the parallel-post-pin branch on October 26, 2023 at 13:32
@ldeffenb (Collaborator) commented Oct 26, 2023

I shudder to think what would happen to a bee node that attempts to pin a huge manifest like the OSM data set. It will kick off goroutine after goroutine until it likely runs out of memory and crashes. I believe there should be an upper limit on how many concurrent fetches are running, throttling the concurrency via a queue or something similar.

Just take a sacrificial node on the mainnet and try to pin:
ac28eaf69a6415a243d564ea69765ed8ee81faefc8b2e0b775fc54dbc32d8fa9

@istae (Member Author) commented Oct 26, 2023

It's capped at 100 parallel retrieval requests.
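(Assuming each fetch goroutine acquires a permit before it starts, as in the sketch above, sem.Acquire blocks the traversal loop once 100 permits are held, so goroutines are not spawned without bound even for very large manifests.)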

@ldeffenb (Collaborator)

Ah, I see the weighted semaphore now! That's what I get for commenting after only skimming, rather than actually studying, the code!

Development

Successfully merging this pull request may close these issues.

POST /pins/ref should fetch and store chunks in parallel
3 participants