Add retry mechanic to worker-specific-task-queue sample #383

Merged
merged 4 commits into temporalio:main from worker-specific-task-queue-retries
Jan 28, 2025

yuandrew
Contributor

@yuandrew yuandrew commented Jan 24, 2025

What was changed

Added an example of a retry mechanic to worker-specific-task-queue

Why?

Common scenario we want an example for.

Checklist

  1. Closes [Feature Request] Demonstrate retries in worker-specific-task-queue sample #376

  2. How was this tested:

Tested locally by running a worker and the starter code. If the worker goes down 5 times in a row during an activity and the ScheduleToClose window passes, the worker prints "Workflow failed after multiple retries" and the starter code errors out.

Added unit tests demonstrating the retry mechanic.

  3. Any docs updates needed?

Member

@cretz cretz left a comment

Nothing blocking, but would like @Quinn-With-Two-Ns to look

@@ -11,6 +11,18 @@ import (
// FileProcessingWorkflow is a workflow that uses Worker-specific Task Queues to run multiple Activities on a consistent
// host.
func FileProcessingWorkflow(ctx workflow.Context) (err error) {
	for attempt := 1; attempt <= 5; attempt++ {
Member

Don't have to, but may be worth a little comment here on why there is a loop around this whole "process".

@@ -11,6 +11,18 @@ import (
// FileProcessingWorkflow is a workflow that uses Worker-specific Task Queues to run multiple Activities on a consistent
// host.
func FileProcessingWorkflow(ctx workflow.Context) (err error) {
	for attempt := 1; attempt <= 5; attempt++ {
		if err = processFile(ctx); err == nil {
Member

Arguably one might only retry on schedule to close timeout, but this is probably fine too.
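A minimal sketch of the alternative raised here: retry only when the failure is a schedule-to-close timeout and return any other error immediately. It uses only the standard library. `retryOnTimeout`, `errScheduleToClose`, and `errBadInput` are hypothetical names for illustration and are not part of the sample or the Temporal SDK; in the real Workflow the check would be made against whatever timeout error the SDK surfaces.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical stand-ins for illustration only; these are not Temporal types.
var (
	errScheduleToClose = errors.New("schedule-to-close timeout")
	errBadInput        = errors.New("bad input")
)

// retryOnTimeout runs process up to maxAttempts times. It retries only when
// the failure wraps errScheduleToClose; success or any other error is
// returned immediately along with the number of attempts made.
func retryOnTimeout(maxAttempts int, process func() error) (attempts int, err error) {
	for attempts = 1; attempts <= maxAttempts; attempts++ {
		err = process()
		if err == nil || !errors.Is(err, errScheduleToClose) {
			return attempts, err
		}
	}
	// Every attempt timed out; report the last timeout error.
	return maxAttempts, err
}

func main() {
	// Two timeouts followed by a success: three attempts in total.
	calls := 0
	attempts, err := retryOnTimeout(5, func() error {
		calls++
		if calls < 3 {
			return fmt.Errorf("attempt %d: %w", calls, errScheduleToClose)
		}
		return nil
	})
	fmt.Println(attempts, err)

	// A non-timeout failure is not retried.
	attempts, err = retryOnTimeout(5, func() error { return errBadInput })
	fmt.Println(attempts, err)
}
```

The trade-off is that a failure unrelated to the Worker going away (bad input, for example) fails fast instead of consuming all five attempts.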

### Things to try
You can try to intentionally crash Workers while they are doing work to see what happens when work gets "stuck" in a unique queue: currently the Workflow will `scheduleToCloseTimeout` without a Worker, and retry when a Worker comes back online.

After the 5th attempt, it logs `Workflow failed after multiple session retries.` and exits. But you may wish to implement compensatory logic, including notifying you.
Contributor

It doesn't log `Workflow failed after multiple session retries.`; it logs `Workflow failed after multiple retries.`, no?

@@ -11,6 +11,18 @@ import (
// FileProcessingWorkflow is a workflow that uses Worker-specific Task Queues to run multiple Activities on a consistent
// host.
func FileProcessingWorkflow(ctx workflow.Context) (err error) {
	for attempt := 1; attempt <= 5; attempt++ {
Contributor

nit: could use range over int here

@@ -28,3 +28,8 @@ Start the Workflow Execution:
```bash
go run worker-specific-task-queues/starter/main.go
```

### Things to try
Contributor

Do you think we can add a unit test to show the retry working? Not blocking but would be nice to show in a unit test

Contributor Author

yep, added!

@yuandrew yuandrew merged commit ea702ad into temporalio:main Jan 28, 2025
3 checks passed
@yuandrew yuandrew deleted the worker-specific-task-queue-retries branch January 28, 2025 17:26
Development

Successfully merging this pull request may close these issues.

[Feature Request] Demonstrate retries in worker-specific-task-queue sample
3 participants