Add a mode for initial ingestion of event data #350

Open
yruslan opened this issue Feb 2, 2024 · 0 comments
Labels: DE, enhancement, Pramen-Scala

yruslan commented Feb 2, 2024

Background

When an event table is ingested for the first time, its history can span a long period.
Running the pipeline for each event date one by one adds significant overhead and issues many queries against the source database.

Feature

Add a mode for initial ingestion of event data.

Example

TBD

Proposed Solution

We can load the full table in one go into a temporary directory (cache), and then run the pipeline tasks for each event date found in the database.
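
A minimal sketch of this idea, written in plain Python for brevity rather than against Pramen's actual API. All names here (`initial_ingest`, `load_full_table`, `run_task_for_date`, the `(event_date, payload)` row shape) are hypothetical, and an in-memory dictionary stands in for the temporary cache directory. It only illustrates the intended flow: one read of the source, followed by one task run per event date found.

```python
from collections import defaultdict
from datetime import date
from typing import Callable, Dict, Iterable, List, Tuple

# Hypothetical row shape: (event_date, payload).
Row = Tuple[date, str]


def initial_ingest(
    load_full_table: Callable[[], Iterable[Row]],
    run_task_for_date: Callable[[date, List[Row]], None],
) -> List[date]:
    """Read the source once, then run one task per event date found."""
    # Single read of the source. The dict stands in for the temporary
    # cache directory, with rows partitioned by event date.
    cache: Dict[date, List[Row]] = defaultdict(list)
    for row in load_full_table():
        cache[row[0]].append(row)

    # Run the per-date task from the cache, in chronological order,
    # without going back to the source.
    processed: List[date] = []
    for event_date in sorted(cache):
        run_task_for_date(event_date, cache[event_date])
        processed.append(event_date)
    return processed


# Usage with a fake source that counts how many times it is queried.
rows: List[Row] = [
    (date(2024, 1, 2), "b"),
    (date(2024, 1, 1), "a"),
    (date(2024, 1, 2), "c"),
]
query_count = 0
results: Dict[date, List[str]] = {}


def load_full_table() -> List[Row]:
    global query_count
    query_count += 1
    return list(rows)


def run_task_for_date(event_date: date, day_rows: List[Row]) -> None:
    results[event_date] = [payload for _, payload in day_rows]


processed = initial_ingest(load_full_table, run_task_for_date)
```

In this sketch the source is queried once regardless of how many event dates the history contains, while the per-date tasks still run separately for each date.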

yruslan added the enhancement, Pramen-Scala, and DE labels on Feb 2, 2024