Trimming journal #2890

Open
slinkydeveloper opened this issue Mar 12, 2025 · 5 comments

@slinkydeveloper
Contributor

The idea is to let users trim the journal at a given command, removing all the commands that happened afterwards, plus their completions.

Mark and copy up to the trim point (excluded).

@slinkydeveloper slinkydeveloper self-assigned this Mar 12, 2025
@slinkydeveloper slinkydeveloper added the ops-ex Restate operational experience label Mar 12, 2025
@slinkydeveloper
Contributor Author

slinkydeveloper commented Mar 12, 2025

The challenge of this feature is the restart mechanism. This is roughly how I'm gonna implement it:

  • Add a new field, restarts, to the invocation status. We start writing this in 1.3, and when not present, its value defaults to 0 (this makes backward/forward compatibility easy).
  • When sending the Invoke command to the invoker, we send this restarts field. The Invoke command sent to the invoker is a transient message, not written to storage, so all good for versioning.
  • When the invoker reads the journal, it reads the invocation status, and when doing so it also reads this restarts field. If the restarts count doesn't match the one in the Invoke command, boom, this command is invalid and discarded.
  • When the PP sends an Abort command, it means the state machine either transitioned the invocation to End or incremented the restarts count, thus making sure that the invoker will fence off old streams. The same field is also used internally, in the invoker <-> invocation task communication, to fence off old messages.
  • When the invoker sends InvokerEffect to the PP, it attaches the restarts count. The restarts count is not written when 0, thus making sure back-compat is easy.
  • If the invoker gets an Invoke with a higher restarts count from the state machine, it aborts the previous one. Essentially, either Abort with restarts or Invoke with restarts + 1 wins.
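
The fencing rules in the list above can be sketched roughly as follows. This is a minimal illustration and not Restate's actual code: `Invoker`, `Outcome`, `on_invoke`, `on_abort` and `accept_task_message` are hypothetical names, the stored restarts count is modelled as an `Option<u32>` that defaults to 0 when absent, and treating a repeated Invoke with the same count as stale is an assumption.

```rust
use std::collections::HashMap;

// Hypothetical identifier type, for illustration only.
type InvocationId = u64;

#[derive(Debug, PartialEq)]
enum Outcome {
    Started,
    RestartedAbortingPrevious,
    DiscardedStale,
}

#[derive(Default)]
struct Invoker {
    // Restarts count of the task currently running for each invocation.
    running: HashMap<InvocationId, u32>,
}

impl Invoker {
    /// Handle an Invoke command. `stored_restarts` stands for the value read
    /// from the invocation status; a missing field is treated as 0.
    fn on_invoke(
        &mut self,
        id: InvocationId,
        cmd_restarts: u32,
        stored_restarts: Option<u32>,
    ) -> Outcome {
        // The command must match what is in storage, otherwise it is stale.
        if cmd_restarts != stored_restarts.unwrap_or(0) {
            return Outcome::DiscardedStale;
        }
        match self.running.get(&id).copied() {
            // Assumption: an Invoke that is not newer than the running task is ignored.
            Some(current) if cmd_restarts <= current => Outcome::DiscardedStale,
            // A higher restarts count wins and replaces the previous task.
            Some(_) => {
                self.running.insert(id, cmd_restarts);
                Outcome::RestartedAbortingPrevious
            }
            None => {
                self.running.insert(id, cmd_restarts);
                Outcome::Started
            }
        }
    }

    /// Abort removes the running task only if the restarts count matches.
    fn on_abort(&mut self, id: InvocationId, restarts: u32) -> bool {
        if self.running.get(&id) == Some(&restarts) {
            self.running.remove(&id);
            true
        } else {
            false
        }
    }

    /// Messages from an invocation task pass only if its restarts count is current.
    fn accept_task_message(&self, id: InvocationId, task_restarts: u32) -> bool {
        self.running.get(&id) == Some(&task_restarts)
    }
}
```

In this sketch, once an Invoke with restarts 1 replaces a task started with restarts 0, any message still arriving from the old task fails the `accept_task_message` check and is dropped.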

I'm also gonna proceed to remove the Killed state, as it's not needed anymore.

slinkydeveloper added a commit to slinkydeveloper/restate that referenced this issue Mar 12, 2025
@tillrohrmann
Contributor

When the PP sends InvokerEffect it attaches the restarts count. The restarts count is not written when 0, thus making sure back-compat is easy.

Did you mean the invoker instead of the PP?

@tillrohrmann
Contributor

tillrohrmann commented Mar 12, 2025

Introducing something like an invocation_epoch sounds like a good idea to me. Off the top of my head, it should solve the problem that the Killed status tried to solve before, in a nicer way.

@tillrohrmann
Contributor

fyi @AhmedSoliman

@slinkydeveloper
Contributor Author

slinkydeveloper commented Mar 12, 2025

Updating this with new findings. Fencing off invoker effects is not enough: we also need to fence off completions coming from other PPs that belong to old invocation epochs. This is how I plan to do that:

ServiceInvocationResponseSink and friends need to carry around the invocation epoch of the caller invocation.

Then we need to store the following data structure in the caller invocation status:

max_epoch_per_comp_range: map<numeric range of completion_id, maximum inclusive epoch allowed>

This data structure is updated accordingly every time we trim. Its invariant is that ranges MUST NOT overlap. It seems to fit https://docs.rs/rangemap/latest/rangemap/inclusive_map/struct.RangeInclusiveMap.html

And then the algorithm when I get a journal entry (which can be either a command, a completion or a signal) is as follows:

on entry:
  if no entry.epoch or entry is signal: accept
  if entry.epoch equal: accept
  if entry.epoch different:
    if entry is command: discard // This is the case of invoker sending commands for old epochs
    if entry is completion:
      if max_epoch_per_comp_range[completion.id] <= entry.epoch: accept
  all the other cases: discard
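
The pseudocode above could look roughly like this in Rust. It is only a sketch under stated assumptions: `Entry`, `EntryKind`, `EpochRanges` and `accept` are hypothetical names, the range map is hand-rolled on top of `std::collections::BTreeMap` instead of using the rangemap crate so that the example stays dependency-free, the comparison follows the pseudocode as written, and a completion id not covered by any range is assumed to fall under "all the other cases".

```rust
use std::collections::BTreeMap;

// Hypothetical types, for illustration only.
type Epoch = u32;
type CompletionId = u32;

enum EntryKind {
    Command,
    Completion(CompletionId),
    Signal,
}

struct Entry {
    kind: EntryKind,
    // None models an entry that carries no epoch.
    epoch: Option<Epoch>,
}

/// Non-overlapping inclusive ranges, keyed by range start: start -> (end, epoch).
/// The caller is responsible for keeping ranges non-overlapping.
struct EpochRanges(BTreeMap<CompletionId, (CompletionId, Epoch)>);

impl EpochRanges {
    /// Epoch stored for the range containing `id`, if any.
    fn get(&self, id: CompletionId) -> Option<Epoch> {
        self.0
            .range(..=id)
            .next_back()
            .filter(|(_, (end, _))| id <= *end)
            .map(|(_, (_, epoch))| *epoch)
    }
}

/// Decide whether a journal entry is accepted, given the current invocation epoch.
fn accept(entry: &Entry, current: Epoch, ranges: &EpochRanges) -> bool {
    // No epoch, or a signal: accept.
    let epoch = match (&entry.kind, entry.epoch) {
        (EntryKind::Signal, _) | (_, None) => return true,
        (_, Some(e)) => e,
    };
    // Same epoch: accept.
    if epoch == current {
        return true;
    }
    match &entry.kind {
        // Completion from a different epoch: check the stored bound for its id.
        EntryKind::Completion(id) => ranges.get(*id).map_or(false, |bound| bound <= epoch),
        // Commands from a different epoch (an invoker on an old epoch): discard.
        _ => false,
    }
}
```

For instance, with ranges `0..=4 -> 0` and `5..=u32::MAX -> 1` and the current epoch at 1, a completion with id 3 from epoch 0 is accepted, while a completion with id 7 from epoch 0 is discarded.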
