The current code materializes the iterables of documents to insert all at once. What if there are a billion documents?
Also, in some places (e.g. the vectorize path) the iterables are consumed once and then used again later. This only works for re-iterable containers such as lists, not for one-shot iterators or generators.
Both points need to be addressed, especially with very large numbers of documents in mind. Batching the iterable comes to mind: e.g. batches of 1k docs or so, each going through "the usual thing" as it does now, but with more care around materializing what may be one-shot iterables.
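To make the batching idea concrete, here is a minimal sketch, not the project's actual code. The names `batched`, `insert_in_batches` and `insert_batch` are hypothetical; `insert_batch` stands in for whatever "the usual thing" is today. The assumption is that each batch is materialized into a list exactly once, so the per-batch logic can iterate over it as often as it needs while the source iterable is consumed only once and never held fully in memory.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(items: Iterable[T], batch_size: int = 1000) -> Iterator[List[T]]:
    """Yield lists of at most `batch_size` items from any iterable.

    The source is consumed exactly once, so one-shot iterators and
    generators work as well as lists.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while True:
        # Only this one batch is materialized at a time.
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


def insert_in_batches(
    documents: Iterable[T],
    insert_batch: Callable[[List[T]], None],  # hypothetical per-batch insert
    batch_size: int = 1000,
) -> int:
    """Feed `documents` to `insert_batch` one list at a time; return the count."""
    total = 0
    for batch in batched(documents, batch_size):
        # `batch` is a real list, so code that needs to traverse it twice
        # (e.g. a vectorize-style path) can do so safely.
        insert_batch(batch)
        total += len(batch)
    return total
```

With this shape, peak memory is bounded by `batch_size` rather than by the total number of documents, and the "consumed, then used again" problem is confined to a list that is safe to re-iterate.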