Xmin rep #747

Merged
merged 17 commits into from
Dec 5, 2023

serprex
Contributor

serprex commented Dec 4, 2023

QRep-based xmin replication has a fault: xmin isn't monotonic and is subject to wraparound.

Wraparound workaround: grab all records with 0 < age(xmin) <= age(last snapshot xmin). This will include some records from before the wraparound, but so it goes.

Logic is copied from qrep code
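
To make the predicate in the description concrete, here is a minimal Go sketch of how an age-based window behaves when the transaction counter wraps. It assumes PostgreSQL's `age(xid)` for normal transaction IDs amounts to a 32-bit modular difference read as a signed value; `xidAge` and `inScanWindow` are hypothetical names for illustration and are not taken from this PR.

```go
package main

import "fmt"

// xidAge models age(xid) for normal transaction IDs: the distance from xid
// to the reference xid, computed modulo 2^32 and read as a signed 32-bit
// value. Special xids (below 3) are ignored in this sketch.
func xidAge(current, xid uint32) int32 {
	return int32(current - xid)
}

// inScanWindow reports whether a row with the given xmin satisfies
// 0 < age(xmin) <= age(lastSnapshotXmin).
func inScanWindow(current, lastSnapshotXmin, rowXmin uint32) bool {
	age := xidAge(current, rowXmin)
	return age > 0 && age <= xidAge(current, lastSnapshotXmin)
}

func main() {
	// The counter has wrapped: the reference xid is small while the last
	// snapshot xmin sits just below 2^32.
	current := uint32(100)
	last := uint32(4294967200)

	fmt.Println(xidAge(current, last))                   // 196
	fmt.Println(inScanWindow(current, last, 50))         // true: written after the wrap
	fmt.Println(inScanWindow(current, last, 4294967250)) // true: written before the wrap, after the snapshot
	fmt.Println(inScanWindow(current, last, 4294967000)) // false: older than the last snapshot
}
```

Because the comparison is on ages rather than raw xmin values, the window stays contiguous across the wrap. The model cannot tell epochs apart, which is why some records from before the wraparound can still fall inside the window.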

@@ -185,6 +185,7 @@ func (s *SnapshotFlowExecution) cloneTable(
},
}

// TODO handle xmin? if yes, maybe move xmin redirection to QRepFlowWorkflow
Contributor Author

TODO I don't know what snapshot_flow is for

Contributor

This is the initial load for CDC. xmin shouldn't be relevant here.

bufferSize := shared.FetchAndChannelSize
var wg sync.WaitGroup

var goroutineErr error = nil
Contributor

Contributor Author

The code in question seems like it shouldn't progress when an error is caught earlier on

@iskakaushik
Contributor

> 0 <= age(xmin) < lastbatchtxid

@serprex shouldn't this be 0 <= age(xmin) < age(lastbatchtxid)?

@serprex
Contributor Author

serprex commented Dec 4, 2023

> > 0 <= age(xmin) < lastbatchtxid
>
> @serprex shouldn't this be 0 <= age(xmin) < age(lastbatchtxid)?

Yes, which is what's in the code. Fixed description

@serprex
Contributor Author

serprex commented Dec 5, 2023

Potential followups:

  1. include a configurable wait time between scans for small tables
  2. allow most scans to be since the last current txid (like the original implementation) and only do a snapshot scan every few scans
  3. cost-sensitive customers could want the check to be age(cur snapshot xmin) <= age(xmin) <= age(prev snapshot xmin), avoiding redundancy at the cost of latency

@@ -239,7 +238,10 @@ func XminFlowWorkflow(
return fmt.Errorf("xmin replication failed: %w", err)
}

- state.LastPartition = &protos.QRepPartition{PartitionId: strconv.FormatInt(lastPartition&0xffffffff, 10)}
+ state.LastPartition = &protos.QRepPartition{
+ PartitionId: uuid.New().String(),
Contributor

Generating a UUID directly in a workflow is not repeatable on replay and causes state issues; you have to do it as a side effect. See https://github.com/PeerDB-io/peerdb/blob/main/flow/workflows/cdc_flow.go#L126 for an example.
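
To illustrate why a side effect helps, here is a toy model of replay in plain Go. It is not the workflow SDK's API: `history`, `sideEffect`, `replay`, and `randomID` are hypothetical names, and the "history" is just a slice. The idea it sketches is that a non-deterministic value is computed once, recorded, and read back from the record whenever the workflow code runs again.

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
)

// history is a toy stand-in for a workflow's recorded event history.
type history struct {
	recorded []string
	next     int
}

// sideEffect runs f the first time it is reached and records the result.
// When the same step is reached again during replay, the recorded value is
// returned and f is not called.
func (h *history) sideEffect(f func() string) string {
	if h.next < len(h.recorded) {
		v := h.recorded[h.next]
		h.next++
		return v
	}
	v := f()
	h.recorded = append(h.recorded, v)
	h.next++
	return v
}

// replay rewinds the cursor, as if the workflow code were re-executed
// from the start against the same history.
func (h *history) replay() { h.next = 0 }

// randomID returns a random 128-bit identifier as hex (a stand-in for a UUID).
func randomID() string {
	b := make([]byte, 16)
	if _, err := rand.Read(b); err != nil {
		panic(err)
	}
	return hex.EncodeToString(b)
}

func main() {
	h := &history{}
	first := h.sideEffect(randomID)
	h.replay()
	second := h.sideEffect(randomID)
	fmt.Println(first == second) // true: replay sees the recorded value
}
```

Calling `randomID()` directly in the workflow body would produce a different value on each replay, which is the state issue the comment describes.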

Contributor Author

Went with reusing q.runUUID which is generated a bit earlier in this code

Contributor

lol, i just fixed it. feel free to force overwrite.

return totalRecordsFetched, err
}

func (qe *QRepQueryExecutor) ExecuteAndProcessQueryStreamGettingCurrentTxid(
Contributor

nit: consider renaming the method

@iskakaushik iskakaushik merged commit 53235aa into main Dec 5, 2023
@serprex serprex deleted the xmin-rep branch December 19, 2023 16:36
3 participants