feat: expose resetting run boundaries #112
base: main
IIUC this PR doesn't fix this issue? The datasets that are still mid-ingest are filtered out. I'd like to minimize non-idempotent mutations as first-class functionality in the lib.
b875393 to 9bde0f6
@alkasm in a single-file world, sure, but that is a pretty rare edge case in the long run. Consider instead the case where the customer has 50000 files that compose one of the datasets: now "mid ingest" can still mean that the dataset shows up as "ingested" in product. Or perhaps new files get added later, after the run is created.
adcab95 to 2aac0a0
9bde0f6 to 35830ab
When creating a run, a user is required to enter the start / end timestamps before:
As a result, customers can end up with run start / end bounds that are no longer accurate as data continues to be ingested into the platform. Having a simple way to "reset" the bounds turns out to be powerful.
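
For illustration only, here is a minimal sketch of what "resetting" could mean: recompute the bounds as the earliest start and latest end across the run's datasets, skipping any dataset that does not report bounds yet. `DatasetBounds` and `reset_run_bounds` are hypothetical names made up for this example, not the API added in this PR, and the nanosecond integer timestamps are an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass(frozen=True)
class DatasetBounds:
    """Hypothetical stand-in for a dataset's ingested time range."""

    name: str
    start_ns: Optional[int]  # None while the dataset has no known bounds yet
    end_ns: Optional[int]


def reset_run_bounds(datasets: Iterable[DatasetBounds]) -> Optional[Tuple[int, int]]:
    """Return (earliest start, latest end) across datasets with known bounds.

    Datasets without bounds are skipped. Returns None when no dataset
    reports bounds, so the caller can leave the run unchanged.
    """
    known = [
        (d.start_ns, d.end_ns)
        for d in datasets
        if d.start_ns is not None and d.end_ns is not None
    ]
    if not known:
        return None
    return min(start for start, _ in known), max(end for _, end in known)


if __name__ == "__main__":
    datasets = [
        DatasetBounds("telemetry", start_ns=1_000, end_ns=5_000),
        DatasetBounds("video", start_ns=500, end_ns=4_000),
        DatasetBounds("still-ingesting", start_ns=None, end_ns=None),
    ]
    print(reset_run_bounds(datasets))
```

The computation itself is a pure function of the current dataset bounds, so repeating it without new data gives the same answer. It will change once more files land, which is the case the reset is meant to handle.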