Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PMM-11603 pmm-agent reload. #2410

Merged
merged 41 commits into from
Sep 18, 2023
Merged

PMM-11603 pmm-agent reload. #2410

merged 41 commits into from
Sep 18, 2023

Conversation

JiriCtvrtka
Copy link
Contributor

@JiriCtvrtka JiriCtvrtka commented Aug 14, 2023

PMM-11603

Link to the Feature Build: Percona-Lab/pmm-submodules#3382

Fix issues:

  1. pmm-agent loaded config twice during boot.
  2. pmm-agent went down when using the pmm-admin config subcommand.
  3. pmm-agent can do soft reload.
  4. Nil pointer issue when you cancel agent/process.

@codecov
Copy link

codecov bot commented Aug 14, 2023

Codecov Report

Merging #2410 (64aa322) into main (576142a) will decrease coverage by 0.06%.
The diff coverage is 0.00%.

@@            Coverage Diff             @@
##             main    #2410      +/-   ##
==========================================
- Coverage   42.86%   42.81%   -0.06%     
==========================================
  Files         384      382       -2     
  Lines       48245    47918     -327     
==========================================
- Hits        20681    20516     -165     
+ Misses      25620    25473     -147     
+ Partials     1944     1929      -15     
Flag Coverage Δ
agent 53.15% <0.00%> (-0.13%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
agent/agentlocal/agent_local.go 22.90% <0.00%> (-0.17%) ⬇️
agent/client/client.go 42.19% <0.00%> (-0.49%) ⬇️
agent/commands/run.go 0.00% <0.00%> (ø)
agent/config/logger.go 0.00% <ø> (ø)

... and 22 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@JiriCtvrtka JiriCtvrtka marked this pull request as ready for review August 16, 2023 08:59
@JiriCtvrtka JiriCtvrtka requested a review from a team as a code owner August 16, 2023 08:59
@JiriCtvrtka JiriCtvrtka requested review from idoqo, artemgavrilov, ademidoff and BupycHuk and removed request for a team August 16, 2023 08:59
@JiriCtvrtka JiriCtvrtka requested a review from BupycHuk August 18, 2023 08:53
Comment on lines 365 to 367
if req == nil {
continue
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will protect only if channel is done, but in case of nil value and opened channel it will fail on nil pointer there: https://github.com/percona/pmm/pull/2410/files#diff-4aa388772db1a19eca352e250851b7785c15d428909601614ef5cc4afb325f22R370

Should I remove it anyway?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when we might have nil value for non-closed channel?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could send it there, but seems there is no place like that. Removed.

Comment on lines -59 to -62
if cfg != nil {
cleanupTmp(cfg.Paths.TempDir, l)
}
cancel()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need to clean up Tmp directory

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.


for {
_, err = configStorage.Reload(l)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to reload the configs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reload is called when you use pmm-admin config subcommand. So it should not be necessary do it here, or we want to cover also case that someone will modify config manually during its running? Or there another case how you could change config?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we want to cover that as well. We have API endpoint for that reason.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and actually the main reason to have /reload endpoint is to be able to reload configs without restarting pmm-agent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Am I wrong or reload for reload endpoint should be done by https://github.com/percona/pmm/pull/2410/files#diff-9bb22a7f420e8dfd90097d05536858550d6f7c00dfedada00e0606305d667db1R190 anyway? Thank you.

@JiriCtvrtka JiriCtvrtka requested a review from BupycHuk September 4, 2023 02:27
Comment on lines 115 to 123
select {
case <-reloadCh:
return true
default:
if ctx.Err() != nil {
l.Debug(ctx.Err())
return false
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
select {
case <-reloadCh:
return true
default:
if ctx.Err() != nil {
l.Debug(ctx.Err())
return false
}
}
select {
case <-reloadCh:
return
case <- ctx.Done():
return
default:
}

Comment on lines 101 to 103
if !isReload {
return
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
if !isReload {
return
}
select {
case <- ctx.Done():
return
default:
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With current suggested changes pmm-agent will quit in case of reload. Context is done in case of SIGINT but also in case of reload by config subcommand.

Your suggestion:
select {
case <- ctx.Done():
return
default:
}

With both suggestion above we cannot determine which case it is and it will quit main loop in both cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed with rootCtx its OK.


for {
_, err = configStorage.Reload(l)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and actually the main reason to have /reload endpoint is to be able to reload configs without restarting pmm-agent

Comment on lines +129 to +132
configFilepath, err := configStorage.Reload(l)
if err != nil {
l.Fatalf("Failed to load configuration: %s.", err)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to reload right after storage created?

Copy link
Contributor Author

@JiriCtvrtka JiriCtvrtka Sep 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Its inside last func before loop https://github.com/percona/pmm/pull/2410/files#diff-ae8858561a752ed8fb97d610805944a2f0123026d91e30fdbcf8c34b3ae80f5bR62

In loop config is already used so it seems its late as possible. If you have an idea feel free to suggest it.

@JiriCtvrtka JiriCtvrtka merged commit 7d74fbe into main Sep 18, 2023
@JiriCtvrtka JiriCtvrtka deleted the PMM-11603-agent-reload branch September 18, 2023 07:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants