Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gotcha: hasNext() modifies state #146

Open
IanMayo opened this issue Apr 5, 2017 · 3 comments
Open

Gotcha: hasNext() modifies state #146

IanMayo opened this issue Apr 5, 2017 · 3 comments

Comments

@IanMayo
Copy link
Contributor

IanMayo commented Apr 5, 2017

Ooh, this just caught me out.

In Java the iterator.hasNext() method is always a pure function: it doesn't modify state. So, until iterator.next() is called, it will always return the same value.

However, in January iterator.hasNext() is used to advance the iterator, since the actual value is returned from a public int value.

Maybe the above strategy has been introduced as a performance measure. Does this outweigh the counter-intuitive implementation of iterator.hasNext()?

I propose that January iterators follow the Java standard of hasNext() for loop control and next() to both retrieve the current value and progress the iterator.

@PeterC-DLS
Copy link
Member

Yes, IndexIterator was unwisely called an iterator but does not behave like Iterator<?>. As you guessed correctly, it is for performance to avoid boxing and extra calls (for access to it.index or it.getPos()).

Add Java 8 functional support would hopefully give a way for future users to avoid having to learn these quirks. See #118.

@IanMayo
Copy link
Contributor Author

IanMayo commented Apr 5, 2017

Traditional coding patterns

Even if IndexIterator wasn't named as an iterator, the method name hasNext() would have thrown me.

I accept that the extensive use of these iterators in DAWN may unfortunately make it too resource-expensive to switch to the traditional model of Java iterator. This could be overcome by renaming hasNext() to something more explicit (hasNextAfterAdvance()).

But, my personal favourite would be to reflect the traditional iterator pattern. I'd also be prepared to invest effort to support the change.

Performance

I'm not a performance expert, but I'll throw in my tuppence-worth...

In moving through a collection we have to:

  1. check for more items
  2. advance to the next item
  3. retrieve the current value

Items 1. and 2. include processing.

The January iterator combines operations 1 and 2. The java iterator combines operations 2 and 3. No more or less processing seems to occur in either implementation.

@PeterC-DLS
Copy link
Member

Renaming the classes would avoid future confusion so there would be no expectations on how it works.

As you note, these 'iterators' do not offer the contents of the datasets (only the index or position within a dataset). Mainly reduction tasks would benefit from a traditional iterator but this pattern is not so useful for a lot of dataset operations.

There is no reason you could not write a proper iterator that wraps the datasets...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants