Skip to content
This repository has been archived by the owner on Oct 8, 2024. It is now read-only.

Consider making flatMap throw if the mapper returns a non-iterable #55

Closed
bakkot opened this issue Oct 11, 2019 · 64 comments · Fixed by #59
Closed

Consider making flatMap throw if the mapper returns a non-iterable #55

bakkot opened this issue Oct 11, 2019 · 64 comments · Fixed by #59
Labels
Milestone

Comments

@bakkot
Copy link
Collaborator

bakkot commented Oct 11, 2019

In Array.prototype.flatMap, if the mapping function returns a non-array, it is treated as if it returned an array of length 1. That was to preserve symmetry with .flat.

Here, I think it makes sense to be more strict (that is, to throw), for three reasons:

  1. There's no .flat to keep parity with.
  2. This version of .flatMap flattens all iterables, not just arrays. In particular, it flattens strings. I think anyone relying on the auto-boxing behavior would find that surprising, so I think we should just not have the auto-boxing behavior.
  3. Auto-boxing would mean that adding a Symbol.iterator method to any object which did not previous have it would be a breaking change.
@devsnek devsnek added this to the Stage 3 milestone Oct 11, 2019
@ljharb
Copy link
Member

ljharb commented Oct 11, 2019

Does this mean that a flatMap callback would be unable to return a non-iterable?

@bakkot
Copy link
Collaborator Author

bakkot commented Oct 11, 2019

Doing so would cause .flatMap to throw, yes.

@ljharb
Copy link
Member

ljharb commented Oct 11, 2019

That seems highly unusable - it means i always have to return an iterable, which effectively means everyone will slap [ ] around their values.

@bakkot
Copy link
Collaborator Author

bakkot commented Oct 11, 2019

If you are using .flatMap, presumably it is because you want to return iterables.

@ljharb
Copy link
Member

ljharb commented Oct 11, 2019

I use flatMap when I want to not care about whether I'm returning an iterable or not.

@michaelficarra
Copy link
Member

@ljharb How do you want strings returned by your mapping function handled?

@bathos
Copy link

bathos commented Oct 11, 2019

adding a Symbol.iterator method to any object which did not previous have it would be a breaking change

Would this not be true regardless with flatMap? (i.e. return [ objThatLaterGetsIterator ])

@bakkot
Copy link
Collaborator Author

bakkot commented Oct 11, 2019

Would this not be true regardless with flatMap?

Generally speaking we regard turning an error case into a non-error case as being not breaking, so I don't think so.

i.e. return [ objThatLaterGetsIterator ]

That would never attempt to treat objThatLaterGetsIterator as an iterator, so it wouldn't matter. flatMap doesn't recursively flatten; regardless of whether objThatLaterGetsIterator has Symbol.iterator or not, the object itself would be in the output of the resulting iterator, not its contents.

@ljharb
Copy link
Member

ljharb commented Oct 11, 2019

Doesn’t flatMap take a depth argument? I assume I’d use that to determine if i wanted my string iterator consumed or not (in which case yes, I’d wrap strings in an array, but forcing that for every other primitive seems harsh)

@bakkot
Copy link
Collaborator Author

bakkot commented Oct 11, 2019

Doesn’t flatMap take a depth argument?

No.

@ljharb
Copy link
Member

ljharb commented Oct 11, 2019

ah right, that’s only .flat - forget about that part :-)

so then yes, it’s that for strings, because they’re iterable, I’d have to wrap them in another iterable - but forcing that on all the other primitives (and non iterable objects) seems like a high price to pay.

@bakkot
Copy link
Collaborator Author

bakkot commented Oct 11, 2019

I think the main use case for flatMap is when you want to return an iterable from the mapping function - otherwise you'd just use map. So this price really does not seem that high to me, compared to the benefits listed in the OP.

forcing that on all the other primitives (and non iterable objects) seems like a high price to pay

Given that relying this fallback mechanism requires remembering which things it's going to iterate and which it is going to leave alone, I really don't think the benefit is there. Like it says in the OP, anyone who's gotten accustomed to relying on it for the other primitive types is going to be really confused by the behavior for strings. Better that they just use the consistent interface.

@devsnek
Copy link
Member

devsnek commented Oct 11, 2019

in the interest of covering all the bases, we could also special case strings

@bakkot
Copy link
Collaborator Author

bakkot commented Oct 11, 2019

I would be very strongly opposed to special casing strings.

@ExE-Boss
Copy link
Contributor

ExE-Boss commented Oct 11, 2019

This is how this works in Java 8’s streams, where the flatMap()’s functional interface argument implementation has to return a Stream instance.

This is also why Java has the static Stream.of<T>(T... values) and Stream.ofNullable<T>(T t) methods.

@zloirock
Copy link
Contributor

I'm not a fan of this. I'm for better consistency with Array.prototype.flatMap.

@zloirock
Copy link
Contributor

In the future, to iterators prototype could be added .flat and we will lose the symmetry.

@Jamesernator
Copy link

Jamesernator commented Oct 24, 2019

I'm not a fan of this. I'm for better consistency with Array.prototype.flatMap.

Note that it was explicitly decided against flattening iterables because they weren't Arrays. Similarly tterables aren't iterators either so in my opinion the array thing isn't a strong argument.

Instead it could be an option to just flatten other Iterators (not Iterables).

Iterator.from([1,2,3,4])
  .flatMap(i => i)
  .toArray(); // [1,2,3,4]

Iterator.from([1,2,3,4])
  .flatMap(i => [i, i]);
  .toArray(); // [[1,1], [2,2], [3,3], [4,4]]

Iterator.from([1,2,3,4])
  .flatMap(i => Iterator.from([1,1])) // Iterator flattening
  .toArray(); // [1,1,2,2,3,3,4,4]

@zloirock
Copy link
Contributor

@Jamesernator for better, but not for full -)

The maximum consistency of methods of different classes is a good practice for languages and minimizes language learning problems.

Some difference between methods makes sense, like usage iterables in flatMap or only one argument of callbacks of methods with a callback.

Some - just makes more complexity and make language learning more difficult, like removing optional thisArg of methods with callbacks (legacy, but could not cause any problems) or this proposal.

@bergus
Copy link

bergus commented Oct 28, 2019

@ljharb

I use flatMap when I want to not care about whether I'm returning an iterable or not.

You might come from a different mindset, but that's really not how programming should work. Relying on dynamic types makes code hard to maintain and implementations hard to optimise. We suffered from this problem with the "promise or thenable or maybe not and just a plain value instead" return value of the promise then method's callback - this is the kind of overloading that leads to unexpected errors (objects with then methods there, iterable strings here) and made promises harder to understand for many learners. Please don't make the same mistake again.

One should use map when one does not want the values to get unwrapped, signalling to the reader that the callback might return any value.
One should use flatMap when one does want the values to get unwrapped, signalling to the reader that the callback will return an iterable.

If one does not care and wants the code to work with both, one should explicitly call a wrapIfNotIterable function. (And a wrapIfNotIterableOrString could deal with the string problem). This is not the job of flatMap - if it was, it would be called flatMapWherePossible.

it means i always have to return an iterable, which effectively means everyone will slap [ ] around their values.

I'm not sure where you'd need that - most of those cases should probably be using map instead. If you are thinking of callbacks that might return either an iterable or just a plain value, like

function filterValuesByKey(iterator, predicate) {
     return iterator.flatMap(([key, value]) => predicate(key) ? value : [])
//                   ^^^^^^^ assuming the overloaded version
}
new Set(filterValuesByKey(map.entries(), ))

then I would argue that predicate(key) ? [value] : [] is actually the right way to do that, as the version without [ ] is broken: it flattens iterable values, which is not desirable.

@devsnek
Copy link
Member

devsnek commented Oct 28, 2019

One should use map when one does not want the values to get unwrapped, signalling to the reader that the callback might return any value.
One should use flatMap when one does want the values to get unwrapped, signalling to the reader that the callback will return an iterable.

This is very motivating to me. However, we do still have Array#flatMap which behaves differently.

@bergus
Copy link

bergus commented Oct 28, 2019

@devsnek Tbh I have exactly the same concerns about Array.prototype.flatMap, only there it is already too late to fix.

@ExE-Boss
Copy link
Contributor

ExE-Boss commented Oct 28, 2019

@bergus

function filterValuesByKey(iterator, predicate) {
	return iterator.flatMap(([key, value]) => predicate(key) ? value : [])
}

If you don’t want value to be iterated, then instead of:

function filterValuesByKey(iterator, predicate) {
	return iterator.flatMap(([key, value]) => predicate(key) ? [value] : [])
}

You’ll want the following, which is more performant, because it doesn’t allocate extra arrays, which is why this proposal is being made in the first place:

function filterValuesByKey(iterator, predicate) {
	return iterator.filter(([key /*, value*/]) => predicate(key))
}

@zloirock
Copy link
Contributor

zloirock commented Oct 28, 2019

It's not just wrapping of something to array literal, it's also 2 useless microtasks (count the number of created promises) for async version on each element and not cheap internal logic for unwrapping. For example, flatten case:
image

@bergus
Copy link

bergus commented Oct 28, 2019

@ExE-Boss Yes, one could write iterator.filter(kv => predicate(kv[0])).map(kv => kv[1]) as well and avoid flatMap altogther, however this use case is one of the more common ones for using flatMap with returning array literals. (Related are mapMaybe in Haskell, filterMap in Rust, Elm, Purescript, flatMap/etc itself in Scala and other languages that have iterable Option types). It's just the simplest example I could come up with - see this post I found on the web for a little variation. Maybe @ljharb was thinking of something different anyway.

@zloirock As I said, it's more about correctness. AsyncIterator.from(["12", [3, "4"], 5, "67"]).flatMap(it => it) does not give the desired results when you only want to flatten arrays, especially arrays whose element type you do not know (in a library function). That matters much more than how many microtask an async iterator needs or how easy array literals are to inline for an engine.

@zloirock
Copy link
Contributor

zloirock commented Oct 28, 2019

@bakkot you can write a rule which enforces [] to .flatMap callback result if you need it. You suggest a strange example - the web platform will not add Symbol.iterator to something if it can break the web. Moreover - why?

@bergus
Copy link

bergus commented Oct 28, 2019

@zloirock It would need a typechecker, not a linter, to recognize the need.
(Btw, typechecking is another argument in itself: an overloaded flatMap callback return type not only doesn't let a programmer infer types properly, but also any typechecking algorithm is hindered by this).

@zloirock
Copy link
Contributor

@bergus see above.

@bakkot
Copy link
Collaborator Author

bakkot commented Oct 28, 2019

@zloirock The web platform will generally try to avoid shipping breaking changes, yes. Which means that if we add this feature, and people start using it, they will never be able to add Symbol.iterator to something which currently lacks it. Which is bad.

@zloirock
Copy link
Contributor

@bakkot your example too far-fetched, so don't worry about it -)

@bakkot
Copy link
Collaborator Author

bakkot commented Oct 29, 2019

@zloirock I don't think it's far-fetched. It is reasonable to make things iterable. For example, the Streams spec added Symbol.asyncIterator to ReadableStream just this year, and it hasn't yet shipped in browsers, despite ReadableStream shipping for years.

@bergus
Copy link

bergus commented Oct 29, 2019

@zloirock As another example, consider recursively flatMapping a DOM tree to an iterator of node lists. Remember when NodeList became iterable and got its forEach method?

@zloirock
Copy link
Contributor

zloirock commented Oct 29, 2019

@bergus ...and? In cases like that, on objects, sure, better wrap the result to something iterable, like an array, but why should I wrap to array primitives? Why should I write a callback several times longer and sacrifice performance? Why should I remember one more difference between this and Array method?

@zloirock
Copy link
Contributor

zloirock commented Oct 29, 2019

On Array method we have one semantic. On this method, you propose another. On collections method we will have third. On another abstraction - fourth... Hail JS!

@Jamesernator
Copy link

Jamesernator commented Oct 29, 2019

I'd still like to point out the option to flatten other Iterators only:

e.g.

function* flatMap(mapperFn) {
  for (const value of this) {
    const mapped = mapperFn(value);
    if (hasSlot(mapped, [[Iterated]])) {
      // flatten mapped
    } else {
      yield mapped;
    }
  }
}
instead of
function* flatMap(mapperFn) {
  for (const value of this) {
    const mapped = mapperFn(value);
    if (mapped[Symbol.iterator]) {
      // flatten mapped
    } else {
      yield mapped;
    }
  }
}

This gives fallback behavior for non-iterators, avoids the no-upgrade to Iterable hazard as we don't flatten them, and is similarly symmetric with Array (.flatMap only flattens the same type, if you want to flatten an iterable you can use Iterator.from/.values()/[Symbol.iterator]()/etc).

@devsnek
Copy link
Member

devsnek commented Oct 29, 2019

Assuming you mean iterator records ({ [[Iterator]], [[NextMethod]], [[Done]] }), those are never exposed to js code.

@Jamesernator
Copy link

Jamesernator commented Oct 29, 2019

Assuming you mean iterator records ({ [[Iterator]], [[NextMethod]], [[Done]] }), those are never exposed to js code.

Looks like I was assuming [[Iterated]] was something on all Iterator instances.

Edited pseudocode:

function* flatMap(mapperFn) {
  for (const value of this) {
    const mapped = mapperFn(value);
    if (mappped instanceof Iterator) { // NOT checking for Iterable
      // flatten mapped
    } else {
      yield mapped;
    }
  }
}

@devsnek
Copy link
Member

devsnek commented Oct 29, 2019

Being an instance of Iterator is not a requirement of the iterator protocol, so we shouldn't use that as a check.

@bergus
Copy link

bergus commented Oct 30, 2019

@Jamesernator I think it has been mostly agreed about in the discussion #47 (comment) that flattening iterables is much more useful than flattening iterators. But regardless which protocol we check for (by testing the existence of a next method for iterators, or a Symbol.iterator method for iterables), this issue is discussing what should happen when the check fails: should it throw an exception or should it yield the wrong value from the resulting iterator?

@ExE-Boss
Copy link
Contributor

I think it should throw an exception, line any sane programming language would.

@ljharb
Copy link
Member

ljharb commented Oct 30, 2019

Let’s avoid terms like “sane”, please.

@Jamesernator
Copy link

I think it has been mostly agreed about in the discussion #47 (comment) that flattening iterables is much more useful than flattening iterators.

The same was argued for Array.prototype.flatMap.

And if we agree operating on iterables is strictly more useful I'm still not sure why this proposal isn't for operating on iterables instead.

this issue is discussing what should happen when the check fails: should it throw an exception or should it yield the wrong value from the resulting iterator?

I think we should avoid discussing specific features and instead focus on the problems. The reason I came up with Iterator-only flattening was so that there'd be a middle ground between not having the upgrade-to-iterable hazard while also providing a solution for people who want to mix.

@bergus
Copy link

bergus commented Nov 1, 2019

@Jamesernator If we did iterator-flattening instead of iterable-flattening, the issue would be an upgrade-to-iterator hazard instead. (Which probably might happen even more accidentally, as people don't realise a next method implements the Iterator interface).
What we really want here is to prevent people from mixing non-iterable/non-iterator values in the flatting. It's just unsafe, and unnecessary for most use case anyway.

@conartist6
Copy link

conartist6 commented Nov 4, 2019

My background: I maintain iter-tools. My thoughts:

I definitely think flatMap should throw on non-iterables, for all the reasons described here. I don't think that as @zloirock suggests this will mean that people wrap [...iterable] around their flatMap callback returns. It just means that plain iterators which are not also flagged as iterables using [Symbol.iterator]() { return this } are nearly useless because lacking a (safe) means to detect them reflectively the programmer must just know what they have. I'm presuming that the iterators returned by methods like flatMap will have [Symbol.iterator]() { return this }, as all generators do. If all generators, iterators transformed from generators, native data types, and iterators returned from libraries like iter-tools, itertools, ix, Immutable.Seq, etc, are actually iterables, I don't see where these naked iterators that people are going to wrap with Arrays are really going to be originating. They shouldn't exist.

If I'm being really honest, I don't even think iterator should be a separate protocol. It's just needlessly confusing. It is in essence a subpart of the iterable protocol, which interestingly does not specify whether separate iterators returned from Symbol.iterator() do or do not share state, meaning that you must always assume that they do (that all iterables are merely iterators) unless you as the programmer surely know otherwise.

I also would strongly support special casing string. Nothing is gained by turning a string (an iterable of length one strings) into an iterable of length one strings, save throwing away irrevocably the fact that the iterable contains only characters and is safe to use with string APIs. In fact it isn't, and even if you did know you had an iterable of characters, there is no API for turning an iterable of characters back into a string. Finally any kind of optimization that the javascript engine had for the string is lost. Scratch that, because this is flatMap not flat you just shouldn't return a string from your callback and you're fine.

@Jamesernator
Copy link

Jamesernator commented Nov 5, 2019

If we did iterator-flattening instead of iterable-flattening, the issue would be an upgrade-to-iterator hazard instead.

Note I am specifically suggesting only flattening those that inherit Iterator. This seems fine given you need to wrap with Iterator.from for iterators that don't inherit Iterator anyway to even get the methods.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.