Don't return error when attempt of sending delta fails to allow next attempt #213

Merged 1 commit into main on Jan 20, 2025

kasia-kujawa
Contributor

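func ExponentialBackoffWithContext(ctx context.Context, backoff Backoff, condition ConditionWithContextFunc) error repeats a condition check with exponential backoff. It returns immediately if the condition returns an error, if the context is cancelled or hits its deadline, or if the maximum number of attempts defined in backoff is exceeded (ErrWaitTimeout).

Returning the error from a failed attempt to send a delta therefore ended the retries after the first attempt. With this change, a failed attempt no longer returns an error from the condition, so the next attempt can run.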
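
The effect of the condition's return value on the retry loop can be sketched as follows. This is a minimal, stdlib-only stand-in, not the agent's code or the library implementation: retryWithBackoff, sendDelta, countAttempts, and errWaitTimeout are hypothetical names, and the step count of 3 and the millisecond delays are assumptions chosen to mirror the logs.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errWaitTimeout stands in for the ErrWaitTimeout described above (hypothetical name).
var errWaitTimeout = errors.New("timed out waiting for the condition")

// conditionFunc has the same shape as a condition with context:
// done=true stops with success, a non-nil error stops the loop immediately.
type conditionFunc func(ctx context.Context) (done bool, err error)

// retryWithBackoff is a simplified stand-in for an exponential backoff loop.
// It is an assumption for illustration, not the real library code.
func retryWithBackoff(ctx context.Context, steps int, initial time.Duration, factor float64, cond conditionFunc) error {
	delay := initial
	for i := 0; i < steps; i++ {
		if err := ctx.Err(); err != nil {
			return err // context cancelled or deadline exceeded
		}
		done, err := cond(ctx)
		if err != nil {
			return err // an error from the condition ends the loop at once
		}
		if done {
			return nil
		}
		if i == steps-1 {
			break // no point in sleeping after the last attempt
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-time.After(delay):
		}
		delay = time.Duration(float64(delay) * factor)
	}
	return errWaitTimeout
}

// sendDelta is a hypothetical sender that always fails, like the timeouts in the logs.
func sendDelta() error { return errors.New("request canceled") }

// countAttempts runs the loop and reports how many attempts were made.
// returnSendErr=true models the old behaviour, false models the new one.
func countAttempts(returnSendErr bool) (int, error) {
	attempts := 0
	err := retryWithBackoff(context.Background(), 3, time.Millisecond, 2, func(ctx context.Context) (bool, error) {
		attempts++
		if sendErr := sendDelta(); sendErr != nil {
			if returnSendErr {
				return false, sendErr // stops the loop after this attempt
			}
			return false, nil // not done, no error: the next attempt is allowed
		}
		return true, nil
	})
	return attempts, err
}

func main() {
	attempts, err := countAttempts(true)
	fmt.Println(attempts, err) // prints "1 request canceled"
	attempts, err = countAttempts(false)
	fmt.Println(attempts, err) // prints "3 timed out waiting for the condition"
}
```

In this sketch, returning (false, nil) on a failed send lets all three attempts run, and the loop then ends with the timeout error, which lines up with the final error line in the second log excerpt.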

Agent logs without this change (only one attempt):

time="2025-01-16T15:58:27Z" level=warning msg="failed sending delta request: Post \"https://api--mac-2.local.cast.ai/v1/kubernetes/clusters/36b93485-f902-41a7-a8bc-cb7494d1c56b/agent-deltas\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)" attempts=1 cluster_id=36b93485-f902-41a7-a8bc-cb7494d1c56b component_node_name=gke-kasiak-01-16-default-pool-1c35ddc0-nnn5 component_pod_name=castai-agent-6b6d699ff5-d85w4 full_snapshot=true items=408 provider=gke version=v0.77.0
time="2025-01-16T15:58:27Z" level=error msg="failed sending delta: sending delta request: Post \"https://api--mac-2.local.cast.ai/v1/kubernetes/clusters/36b93485-f902-41a7-a8bc-cb7494d1c56b/agent-deltas\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)" cluster_id=36b93485-f902-41a7-a8bc-cb7494d1c56b component_node_name=gke-kasiak-01-16-default-pool-1c35ddc0-nnn5 component_pod_name=castai-agent-6b6d699ff5-d85w4 controller_id=ee9b4c4c-690d-4a9e-b2b7-1c7887ec7a9b k8s_version=1.30.8-gke.1051000 provider=gke version=v0.77.

Agent logs with this change (3 attempts):

time="2025-01-16T16:06:12Z" level=warning msg="failed sending delta request: Post \"https://api--mac-2.local.cast.ai/v1/kubernetes/clusters/36b93485-f902-41a7-a8bc-cb7494d1c56b/agent-deltas\": net/http: request canceled (Client.Timeout exceeded while awaiting headers)" attempts=1 cluster_id=36b93485-f902-41a7-a8bc-cb7494d1c56b component_node_name=gke-kasiak-01-16-default-pool-1c35ddc0-nnn5 component_pod_name=castai-agent-879b4c747-w6mqx full_snapshot=true items=405 provider=gke version=local
time="2025-01-16T16:06:12Z" level=warning msg="failed sending delta request: Post \"https://api--mac-2.local.cast.ai/v1/kubernetes/clusters/36b93485-f902-41a7-a8bc-cb7494d1c56b/agent-deltas\": io: read/write on closed pipe" attempts=2 cluster_id=36b93485-f902-41a7-a8bc-cb7494d1c56b component_node_name=gke-kasiak-01-16-default-pool-1c35ddc0-nnn5 component_pod_name=castai-agent-879b4c747-w6mqx full_snapshot=true items=405 provider=gke version=local
time="2025-01-16T16:06:12Z" level=warning msg="failed sending delta request: Post \"https://api--mac-2.local.cast.ai/v1/kubernetes/clusters/36b93485-f902-41a7-a8bc-cb7494d1c56b/agent-deltas\": io: read/write on closed pipe" attempts=3 cluster_id=36b93485-f902-41a7-a8bc-cb7494d1c56b component_node_name=gke-kasiak-01-16-default-pool-1c35ddc0-nnn5 component_pod_name=castai-agent-879b4c747-w6mqx full_snapshot=true items=405 provider=gke version=local
time="2025-01-16T16:06:12Z" level=error msg="failed sending delta: timed out waiting for the condition" cluster_id=36b93485-f902-41a7-a8bc-cb7494d1c56b component_node_name=gke-kasiak-01-16-default-pool-1c35ddc0-nnn5 component_pod_name=castai-agent-879b4c747-w6mqx controller_id=9a9adc26-6e57-45b5-a85e-9c10c376f5df k8s_version=1.30.8-gke.1051000 provider=gke version=local

kasia-kujawa merged commit 7af30a5 into main on Jan 20, 2025
2 checks passed