CDC log stream state (cdc$time) persisted via connect topic connect-offsets
#16
Comments
In a single query, the connector queries all the streams that belong to a given vnode. That's why the offset is tracked by `vnode_id`. Does that answer your question, @hartmut-co-uk?
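To make the per-vnode tracking concrete, here is a minimal sketch, not the connector's actual code. The stream tuples, the helper names `group_streams_by_vnode` and `advance_window`, and the millisecond offsets are all hypothetical; the only idea taken from the comment above is that streams are grouped by vnode and one offset is kept per group.

```python
from collections import defaultdict


def group_streams_by_vnode(streams):
    """Group hypothetical (vnode_id, stream_id) pairs by their vnode_id."""
    groups = defaultdict(list)
    for vnode_id, stream_id in streams:
        groups[vnode_id].append(stream_id)
    return dict(groups)


def advance_window(offsets, vnode_id, window_end_ms):
    """Remember the end of the last fully read time window for one vnode."""
    offsets[vnode_id] = max(offsets.get(vnode_id, 0), window_end_ms)
    return offsets


# Hypothetical streams: two belong to vnode 0, one to vnode 1.
streams = [(0, "s-a"), (0, "s-b"), (1, "s-c")]
groups = group_streams_by_vnode(streams)

offsets = {}
for vnode_id, stream_ids in groups.items():
    # One query would cover every stream in stream_ids for this window,
    # so a single offset per vnode is enough to resume later.
    advance_window(offsets, vnode_id, window_end_ms=2000)
```

Under these assumptions there is one offset entry per vnode rather than one per stream, which matches the keys seen in the offsets topic.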
Thanks! How do `generation_start` and streams relate?
@avelanarius Could you please answer with details here?
I've been playing with the Are there plans to add similar functionality to either this repo or scylla-cdc-java?
Hi, when looking at the data published to the `connect-offsets` table I noticed the latest window state is tracked by `vnode_id`.
Why is this at the `vnode_id` level and where does this information come from?
When querying the table the `vnode_id` is not used as a query condition, right?
Further implication (maybe?):
The topic `connect-offsets` is created by Kafka Connect (not the Scylla connector) and is not a compacted topic.
A simple test (`scylla.query.time.window.size: 2000`) with 1 connector, 1 task, and 1 table resulted in ~1M messages on the `docker-connect-offsets` topic.
topic.@pkgonan may I ask if you've got numbers to confirm this for a more comprehensive setup?
@haaawk how is this topic consumed upon connector (re)start / task/consumer rebalancing? From beginning?
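For context on the restart question, a minimal sketch of "replay from the beginning, latest record per key wins". This assumes Kafka Connect rebuilds its offset state by reading the offsets topic from the start; that is my understanding, not a confirmed answer from the maintainers. The key shape `(connector, vnode_id)` and the integer values are hypothetical.

```python
def replay_offsets(log):
    """Return the latest value per key from an ordered list of (key, value) records."""
    latest = {}
    for key, value in log:
        # Later records overwrite earlier ones for the same key.
        latest[key] = value
    return latest


# Hypothetical offset records, oldest first.
log = [
    (("connector-a", 0), 2000),
    (("connector-a", 1), 2000),
    (("connector-a", 0), 4000),
]
state = replay_offsets(log)
```

Only the last record per key survives the replay, so older records in an uncompacted topic add read time on startup without changing the restored state.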
Update 2021-12-15:
ℹ️ For reference: the part on `connect-offsets` has already been well described and addressed in a section of the repo README: scylla-cdc-source-connector/README.md, lines 601 to 605 in ecbeb1d