Include TLS information in diagnostics bundle #4880

Closed
1 task
ycombinator opened this issue Jun 6, 2024 · 4 comments · Fixed by #4946
Labels
enhancement New feature or request Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team

Comments

@ycombinator
Contributor

Describe the enhancement:

Following up on the work @michel-laterman did in elastic/fleet-server#3587 to include information about TLS connections made by Fleet Server, we should similarly include information about TLS connections made by Agent in the diagnostics bundle.

Describe a specific use case for the enhancement or feature:

To be able to debug TLS-related connectivity issues.

What is the definition of done?

ycombinator added the enhancement and Team:Elastic-Agent-Control-Plane labels on Jun 6, 2024
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@cmacknz
Member

cmacknz commented Jun 7, 2024

My comment in #4881 (comment) applies here too. There are three network locations agent can be configured to reach and we need to test all of them.

I think it may be better if we implement a test command that can hit each of the three network locations agent needs to function. These are:

  • elastic-agent test output that connects to the output in the policy and prints detailed information about what happened.
  • elastic-agent test fleet [options] that contacts Fleet Server with the options provided, or if already enrolled makes a test request using the options persisted in the agent encrypted store.
  • elastic-agent test download that contacts the binary download source and prints detailed information about what happened.

We could have diagnostics attempt each of these by default, with a configurable timeout, and an option to skip these checks.

@michel-laterman
Contributor

I think if we want to include connectivity tests by default, we should adjust the elastic-agent diagnostics command, as well as the actions the fleet-ui generates, to include a new parameter so that the hook can be registered with client.RegisterOptionalDiagnosticHook. Using a skip flag/option would then simply not send the parameter.

Or we can skip executing the connectivity checks by default, and explicitly add the new parameter if we want to gather this data (similar to how CPU metrics are requested).

I don't know if we currently have any way to configure a timeout for diagnostics actions, and the current elastic-agent-client spec/implementation does not have a way to specify an action timeout.

@cmacknz
Member

cmacknz commented Jun 10, 2024

I think we probably want connectivity checks by default, with an option to skip them. Connectivity problems are a frequent source of support cases, so we may as well include them by default.

We will 100% need timeouts; diagnostics can't block forever waiting for a reply from an unresponsive server.


4 participants