Retry endpoint after the initial connection was refused #230

Open. skoef wants to merge 1 commit into main.

skoef (Contributor) commented May 20, 2023

In Kubernetes environments, when the nats-exporter runs as a sidecar, it might probe the NATS server before the server is running. When that happens, the collectors for those endpoints are never created.

This PR fixes this by retrying the connection to the NATS server once after the initial attempt fails. This should resolve issue #133.

skoef (Contributor, Author) commented May 20, 2023

@wallyqs perhaps you can take a look?

skoef (Contributor, Author) commented Oct 26, 2023

@piotrpio perhaps you could take a look at this PR?

@@ -302,6 +302,13 @@ func (nc *NATSCollector) initMetricsFromServers(namespace string) {
}
}

if response == nil && retryTime > 0 {
Contributor

This retries only once, which may solve the problem in some cases, but in general I think we should retry in a loop until we get a response.

It may be good to add an option such as --max-retries to be able to bail out, but that can be done in a separate PR, since it would also apply to the other places where opts.RetryInterval is used.
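To make the suggestion concrete, here is a minimal sketch of a bounded retry loop. It is not the exporter's actual code: fetchWithRetry, errRefused, and the maxRetries parameter are hypothetical names for illustration, the --max-retries flag does not exist yet, and treating a negative maxRetries as "retry forever" is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errRefused stands in for a "connection refused" error in this sketch.
var errRefused = errors.New("connection refused")

// fetchWithRetry calls fetch until it succeeds. After each failure it waits
// for interval and tries again, allowing up to maxRetries extra attempts.
// A negative maxRetries means "retry forever" (an assumption of this sketch).
func fetchWithRetry(fetch func() ([]byte, error), interval time.Duration, maxRetries int) ([]byte, error) {
	var lastErr error
	for attempt := 0; maxRetries < 0 || attempt <= maxRetries; attempt++ {
		if attempt > 0 {
			// Only sleep between attempts, never before the first one.
			time.Sleep(interval)
		}
		body, err := fetch()
		if err == nil {
			return body, nil
		}
		lastErr = err
	}
	return nil, fmt.Errorf("giving up after %d retries: %w", maxRetries, lastErr)
}

func main() {
	calls := 0
	// Simulated endpoint that refuses the first two connections.
	fetch := func() ([]byte, error) {
		calls++
		if calls < 3 {
			return nil, errRefused
		}
		return []byte(`{"server_id":"example"}`), nil
	}
	body, err := fetchWithRetry(fetch, 10*time.Millisecond, 5)
	fmt.Println(string(body), err, calls)
}
```

In the exporter, the interval would presumably come from opts.RetryInterval, and the fetch callback would wrap whatever call currently produces the response that the diff checks for nil.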
