Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chunker: infinite loop in the Rabin chunker #353

Open
MichaelMure opened this issue Dec 16, 2019 · 1 comment
Open

chunker: infinite loop in the Rabin chunker #353

MichaelMure opened this issue Dec 16, 2019 · 1 comment
Assignees

Comments

@MichaelMure
Copy link
Contributor

MichaelMure commented Dec 16, 2019

At Infura we got some of our nodes ending in an infinite loop in the Rabin chunker, with the CPU at 100% until something bad happen (the context doesn't reach there so loop keep spinning even though the request is long gone).

Here is a profiling:
profile001

Original requests are add with the following parameters: [chunker=rabin-262144-524288-1048576,encoding=json,hash=sha2-256,inline-limit=32,pin=false,progress=true,stream-channels=true,]. Some also have trickle=true but that doesn't seem to matter.

Profiling lead to an infinite loop in https://github.com/whyrusleeping/chunker/blob/master/chunker.go#L209-L336 but it's unclear to me what is happening. if err == io.ErrUnexpectedEOF { err = nil } is suspicious but that kinda fly over my head.

@MichaelMure
Copy link
Contributor Author

Btw, this is with go-ipfs 0.4.22 so v0.0.1 of this package, but that's the same github.com/whyrusleeping/chunker as master.

@ribasushi
Copy link
Contributor

@MichaelMure are you by chance able to share an input ( publicly or privately ) that reliably reproduces this? Or is the problem intermittent and the same input might hang a daemon once, and then pass through just fine on rerun?

@MichaelMure
Copy link
Contributor Author

Unfortunately, that's all the information I have. It seems to be fairly rare, but to be fair, not many people play with the Rabin chunker.

@ribasushi
Copy link
Contributor

@MichaelMure understood. It definitely sounds like a missing underflow protection, but it will take me another couple days to get my test setup working well to pin this down. I should have something for you Thu-ish

@Stebalien
Copy link
Member

Note: our current rabin implementation is terrible. I wouldn't waste too much time trying to fix it.

The buzhash implementation in go-ipfs master should work much better.

@MichaelMure
Copy link
Contributor Author

Will it get deprecated/dropped ? Because if someone can take out our nodes with a bad request, that's not so fun.

@Stebalien
Copy link
Member

Probably not? It's just a low-priority issue given that we support an additional chunker and are working on yet another chunker.

If you need it fixed now, I'm happy to accept a patch. However, the HTTP API is not designed to be a public API so this issue is significantly less critical for the IPFS team.

@MichaelMure
Copy link
Contributor Author

I understand the low priority. My question is more if you plan to retire Rabin once Buzhash is there for good.

@ribasushi
Copy link
Contributor

@MichaelMure there's a sustained effort over the next ~month to better quantify what exactly do we want from a CDC, and where our actual bottlenecks are. Please track the main text of ipfs/specs#227 (comment) for further updates.

@ribasushi
Copy link
Contributor

@MichaelMure just a quick update on this ticket: this is almost certainly a ( but not 100% proven ) bug in the way the rabin chunker implementation is reading from the stream, resulting in an unexpected under-read and thus a loop for data that never comes.

Since this issue got opened originally, we have developed a different subsystem providing a unified read-interface for the rest of the ingestion pipeline, namely https://pkg.go.dev/github.com/ipfs/[email protected]?tab=doc#pkg-overview

This bug will be solved in the process of migrating to that system.

@MichaelMure
Copy link
Contributor Author

Awesome thanks for the update :)

@welcome
Copy link

welcome bot commented Jun 16, 2023

Thank you for submitting your first issue to this repository! A maintainer will be here shortly to triage and review.
In the meantime, please double-check that you have provided all the necessary information to make this process easy! Any information that can help save additional round trips is useful! We currently aim to give initial feedback within two business days. If this does not happen, feel free to leave a comment.
Please keep an eye on how this issue will be labeled, as labels give an overview of priorities, assignments and additional actions requested by the maintainers:

  • "Priority" labels will show how urgent this is for the team.
  • "Status" labels will show if this is ready to be worked on, blocked, or in progress.
  • "Need" labels will indicate if additional input or analysis is required.

Finally, remember to use https://discuss.ipfs.io if you just need general support.

@hacdias hacdias transferred this issue from ipfs/go-ipfs-chunker Jun 16, 2023
@hacdias hacdias changed the title infinite loop in the Rabin chunker chunker: infinite loop in the Rabin chunker Jun 16, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Archived in project
Development

No branches or pull requests

3 participants