Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document external .next() vs internal .try_fold() iteration #70

Open
Shnatsel opened this issue Oct 13, 2023 · 0 comments
Open

Document external .next() vs internal .try_fold() iteration #70

Shnatsel opened this issue Oct 13, 2023 · 0 comments

Comments

@Shnatsel
Copy link
Contributor

This is a concept I have finally understood after writing the try_fold implementation for rust-lang/rust#116651, where looping through .next() is ~3x slower than iterating using .try_fold().

The difference is (at least in part) because .next() has to check if the buffer needs to be refilled and if an error has occurred for every single byte read, while the .try_fold() approach can do it once per chunk, then just iterate over the chunk without any additional checks.

The same is said to be true about iterating over a BTreeMap or BTreeSet, where .try_fold() iteration is faster because it can skip additional checks while looping over each chunk, although I have not tested that myself.

Internal iteration also allows better compiler optimizations in iterator chains, see rust-lang/rust#101814 (comment). While this particular issue is fixed, it is noted that external iteration is a lot less efficient than internal iteration in general. Many iterator adapters such as .for_each() are implemented in terms of .try_fold() so that they could benefit from internal iteration where available.


The bottom line is:

  1. for x in iter can be less efficient than iter.for_each() or other adapters such as .all(), .any(), .fold() etc. but only in tight loops where per-element overhead is noticeable
  2. Iterator adapters should implement .try_fold() manually by delegating to the .try_fold() of the underlying iterator where possible. (A default implementation of .try_fold() exists, but just calls .next() in a loop, losing the benefits of internal iteration). Implementing .try_fold() can be a way to create faster custom iterators too, like I've done with .bytes(). However, it is not possible to implement your own .try_fold() on stable channel because the Try trait is unstable 😿
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant