Only trim multi-node clusters when Archived LSN is reported #2814
base: main
Conversation
This change makes a snapshot repository a requirement for auto-trimming of logs in Restate deployments with more than one node. Previously we would have trimmed logs based on the reported persisted LSN, which isn't always safe given that additional nodes may be added at a later time.
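For illustration, here is a minimal sketch of the selection logic this description implies. Only TrimMode, snapshots_repository_configured, and the node-count check appear in the diffs below; the select helper, its signature, and the variant names are assumptions made for this example.

    /// Sketch only: choosing a trim strategy along the lines this PR describes.
    enum TrimMode {
        /// Trim by the reported persisted LSN; only safe for single-node deployments.
        PersistedLsn,
        /// Trim by the archived LSN reported once snapshots are published.
        ArchivedLsn,
    }

    impl TrimMode {
        // `node_count` stands in for `cluster_state.nodes.len()` from the diff further down.
        fn select(snapshots_repository_configured: bool, node_count: usize) -> TrimMode {
            if snapshots_repository_configured || node_count > 1 {
                // Multi-node clusters (or any cluster with a snapshot repository)
                // only trim up to the archived LSN.
                TrimMode::ArchivedLsn
            } else {
                // A single node without snapshots may still trim by the persisted LSN.
                TrimMode::PersistedLsn
            }
        }
    }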
Thanks for hardening the log trim strategies @pcholakov. It's a bit of a pity that we can't fully guard against the case where trimming happens on a single node based on the persisted LSN and then the user wants to expand their cluster :-( Apart from this, the changes make sense to me.
@@ -1106,7 +1160,7 @@ mod tests {
        let applied_lsn = Arc::new(AtomicU64::new(0));
        let persisted_lsn = Arc::new(AtomicU64::new(0));

-       let (_node_env, bifrost, _) = create_test_env(config, |builder| {
+       let (_node_env, bifrost, _) = create_test_env_with_nodes(config, 2, |builder| {
The test name is a bit confusing. Shouldn't we always stop trimming when running with multiple nodes and no archived LSN?
    config: Configuration,
    n: u8,
Maybe we can spare a few more letters to give it a more expressive name.
@@ -367,24 +353,19 @@ fn create_log_trim_check_interval(options: &AdminOptions) -> Option<Interval> {
}

enum TrimMode {
    /// Trim logs by the reported persisted LSN. This strategy is only appropriate for
    /// single-node deplyments.
-    /// single-node deplyments.
+    /// single-node deployments.
-        if snapshots_repository_configured
-            || Self::any_processors_report_archived_lsn(cluster_state)
-        {
+        if snapshots_repository_configured || cluster_state.nodes.len() > 1 {
It would be great if we could make users aware of the fact that trimming won't work when running without a snapshot repository in a multi-node deployment.
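One way such a hint could be surfaced (a sketch under assumptions, not the PR's actual implementation; the function name and message text are invented for illustration):

    use tracing::warn;

    // Sketch: emit a hint when auto-trimming is effectively disabled because the
    // cluster has multiple nodes but no snapshot repository is configured.
    fn maybe_warn_trim_disabled(snapshots_repository_configured: bool, node_count: usize) {
        if node_count > 1 && !snapshots_repository_configured {
            warn!(
                node_count,
                "Automatic log trimming is disabled: the cluster has multiple nodes \
                 but no snapshot repository is configured. Configure a snapshot \
                 repository to allow logs to be trimmed safely."
            );
        }
    }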
        // if let Some(blocking_nodes) = self.get_trim_blockers() {
        //     return Err(TrimPointsUnavailable::UnknownPartitionStatus(
        //         blocking_nodes.clone(),
        //     ));
        // }
This can probably be removed.
@@ -905,7 +907,7 @@ impl PartitionProcessorManager {
    ///
    /// This allows multiple partition processors to be started concurrently without holding
    /// and exclusive lock to [`Self`]
-   fn start_partition_processor_task(
+   fn crate_start_partition_processor_task(
-    fn crate_start_partition_processor_task(
+    fn create_start_partition_processor_task(
@@ -294,18 +294,32 @@ async fn create_or_recreate_store(
        (maybe_snapshot, Some(fast_forward_lsn)) => {
            // Play it safe and keep the partition store intact; we can't do much else at this
            // point. We'll likely halt again as soon as the processor starts up.
            let recovery_guide_msg =
                "The partition's log is trimmed to a point from which this processor can not resume.
                Visit https://docs.restate.dev/operate/troubleshooting#handling-missing-snapshots
Depending on where the troubleshooting guide will be placed, this probably needs updating.
            if let Some(snapshot) = maybe_snapshot {
                warn!(
                    %partition_id,
                    %snapshot.min_applied_lsn,
                    %fast_forward_lsn,
-                   "Latest available snapshot is older than the the fast-forward target LSN!",
+                   "The latest available snapshot is from an LSN before the target LSN! {}",
                    recovery_guide_msg,
                );
            } else if snapshot_repository.is_none() {
                warn!(
                    %partition_id,
                    %fast_forward_lsn,
                    "A log trim gap was encountered, but no snapshot repository is configured! {}",
                    recovery_guide_msg,
                );
            } else {
                warn!(
                    %partition_id,
                    %fast_forward_lsn,
-                   "A fast-forward target LSN is set, but no snapshot available for partition!",
+                   "A log trim gap was encountered, but no snapshot is available for this partition! {}",
                    recovery_guide_msg,
                );
That's helpful for our users :-)
Related docs PR: restatedev/documentation#556
Related: #2808