Added skip audit/unenroll flag to uninstall command #6206

Open
wants to merge 17 commits into main

Conversation


@Rohit-code14 Rohit-code14 commented Dec 4, 2024

  • Enhancement

What does this PR do?

Added a flag to skip fleet audit/unenroll while uninstalling elastic-agent.

Why is it important?

While uninstalling elastic-agent, the agent tries to notify the fleet server about the uninstall. In some cases, however, users already know that the fleet server is unreachable, and the notification attempts continuously log failures. Having an option to skip the notification improves the end-user experience.
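
As context for reviewers, here is a minimal sketch of how such a flag could be wired into a cobra-based uninstall command. It is illustrative only: the constructor name, flag description, and print statement are assumptions, not the PR's actual code.

    package main

    import (
        "fmt"

        "github.com/spf13/cobra"
    )

    // newUninstallCmd sketches an uninstall command with a boolean flag
    // that suppresses the fleet audit/unenroll notification.
    func newUninstallCmd() *cobra.Command {
        cmd := &cobra.Command{
            Use:   "uninstall",
            Short: "Uninstall the agent from this system",
            RunE: func(cmd *cobra.Command, _ []string) error {
                skipFleetAudit, err := cmd.Flags().GetBool("skip-fleet-audit")
                if err != nil {
                    return err
                }
                if skipFleetAudit {
                    fmt.Println("skipping fleet audit/unenroll notification")
                }
                // ... the actual uninstall work would happen here ...
                return nil
            },
        }
        cmd.Flags().Bool("skip-fleet-audit", false, "skip notifying fleet-server about audit/unenroll on uninstall")
        return cmd
    }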

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • [ ] I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • [ ] I have added an integration test or an E2E test

Disruptive User Impact

No disruptive user impact

How to test this PR locally

  • Set up ES and Kibana.
  • Install the Fleet Server.
  • Build the agent locally by running SNAPSHOT=true PLATFORMS=linux/arm64 PACKAGES=tar.gz mage -v package
  • Extract the agent tar.
  • Go to Kibana -> Fleet -> Add Agent to get the installation command, and use it to install elastic-agent.
  • After successful installation, check the Fleet page for the installed agent.
  • Kill the fleet server process.
  • Try uninstalling elastic-agent using sudo elastic-agent uninstall
  • Now, as the fleet server is unreachable, the notify calls will fail and you will get error logs like the one below
    [== ] notify Fleet: network error: fail to notify audit/unenroll on fleet-server: all hosts failed: requester 0/1 to host https://192.168.1.10:8220/ errored: Post "https://192.168.1.10:8220/api/fleet/agents/a4888371-ef7b-4cc6-8df7-cc92b6c25990/audit/unenroll?": dial tcp 192.168.1.10:8220: connect:
  • Restart the fleet server and install the elastic agent again.
  • After successful installation, kill the fleet server.
  • Now try uninstalling elastic-agent using sudo elastic-agent uninstall --skip-fleet-audit. The audit/unenroll call will be skipped and the uninstall will complete without any errors.

Related issues

Contributor

mergify bot commented Dec 4, 2024

This pull request does not have a backport label. Could you fix it @Rohit-code14? 🙏
To fix up this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-./d./d is the label to automatically backport to the 8./d branch; /d is the digit

Contributor

mergify bot commented Dec 4, 2024

backport-v8.x has been added to help with the transition to the new branch 8.x.
If you don't need it, please use the backport-skip label and remove the backport-8.x label.

@mergify mergify bot added the backport-8.x Automated backport to the 8.x branch with mergify label Dec 4, 2024
@Rohit-code14 Rohit-code14 marked this pull request as ready for review December 4, 2024 03:59
@Rohit-code14 Rohit-code14 requested a review from a team as a code owner December 4, 2024 03:59
@pkoutsovasilis
Contributor

/test

Contributor

@kaanyalti kaanyalti left a comment


@Rohit-code14 Could you please add tests relevant to what you are doing here? One suggestion is to inject the notifyFleetAuditUninstall function: you can create a function type for it and generate a mock with mockery, which you can then use in your test.
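
To make that suggestion concrete, the injection pattern could look roughly like the sketch below. The types are deliberately simplified stand-ins (the real notifier takes agent-specific parameters such as the logger and configuration), so treat this as an illustration of the pattern rather than the PR's code.

    package uninstall

    import "context"

    // NotifyFunc abstracts the fleet audit/unenroll notification so a
    // test can substitute a fake; mockery can generate a mock for a
    // named function type like this one.
    type NotifyFunc func(ctx context.Context) error

    // Uninstall receives the notifier as a parameter instead of calling
    // a package-level function directly, which is what makes it mockable.
    func Uninstall(ctx context.Context, skipFleetAudit bool, notify NotifyFunc) error {
        if !skipFleetAudit {
            // The error is ignored: at uninstall time there is nothing to act on.
            _ = notify(ctx)
        }
        // ... remaining uninstall steps ...
        return nil
    }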

@Rohit-code14
Author

@Rohit-code14 Could you please add tests relevant to what you are doing here? One suggestion is to inject the notifyFleetAuditUninstall function: you can create a function type for it and generate a mock with mockery, which you can then use in your test.

Hi @kaanyalti, even if we inject notifyFleetAuditUninstall into Uninstall and mock it, testing it would require invoking the Uninstall function directly and passing the mock notify function during the test run, which is not hard to do. I think we can refactor the pre-validations that are done before invoking the notifyFleetAuditUninstall function into a separate function, and add a test case that mocks the refactored function.
Something like this:

func notifyFleetIfNeeded(ctx context.Context, log *logp.Logger, pt *progressbar.ProgressBar, cfg *configuration.Configuration, ai *info.AgentInfo, skipFleetAudit bool, notifyFleetAuditUninstall NotifyFleetAuditUninstall) {
    // Once the root cause is identified, this can be re-enabled on Windows.
    if ai != nil && cfg != nil && !skipFleetAudit && runtime.GOOS != "windows" {
        notifyFleetAuditUninstall(ctx, log, pt, cfg, ai) //nolint:errcheck // ignore the error as we can't act on it
    }
}

Please share your opinion on this.
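
A test for the refactored guard could then take the shape below. This reuses the simplified NotifyFunc sketch from earlier in the thread and only illustrates the approach; the PR's actual test exercises notifyFleetIfNeeded with its real types.

    package uninstall

    import (
        "context"
        "testing"
    )

    // TestUninstallSkipsFleetAudit verifies the notifier is never invoked
    // when the skip flag is set.
    func TestUninstallSkipsFleetAudit(t *testing.T) {
        called := false
        fake := func(ctx context.Context) error {
            called = true
            return nil
        }

        if err := Uninstall(context.Background(), true, fake); err != nil {
            t.Fatalf("unexpected error: %v", err)
        }
        if called {
            t.Fatal("notifier must not be called when --skip-fleet-audit is set")
        }
    }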

@kaanyalti
Contributor

@Rohit-code14 I am having a hard time visualizing your suggestion based on the example you provided. Can you either implement your suggestion or provide a more concrete example so it is clearer?

@Rohit-code14
Author

@Rohit-code14 I am having a hard time visualizing your suggestion based on the example you provided. Can you either implement your suggestion or provide a more concrete example so it is clearer?

@kaanyalti I have added the changes and a test case; kindly review.

Contributor

@kaanyalti kaanyalti left a comment


Tested this on macOS and Linux; works well. Changes look good to me.

@pkoutsovasilis
Contributor

/test

@Rohit-code14
Author

Hi @kaanyalti, I cannot see the reasons for these failures. Can you please help me with this?

@pkoutsovasilis
Contributor

Hi @kaanyalti, I cannot see the reasons for these failures. Can you please help me with this?

Hey @Rohit-code14, this is the failure 🙂

log_level_test.go:140:
    Error Trace:    /opt/buildkite-agent/builds/bk-agent-prod-gcp-1734443215713220624/elastic/elastic-agent-extended-testing-bk/testing/integration/log_level_test.go:140
                    /opt/buildkite-agent/builds/bk-agent-prod-gcp-1734443215713220624/elastic/elastic-agent-extended-testing-bk/testing/integration/log_level_test.go:76
    Error:          Condition never satisfied
    Test:           TestSetLogLevelFleetManaged
    Messages:       agent never communicated agent-specific log level "debug" to Fleet

@Rohit-code14
Author

Hi @kaanyalti, I cannot see the reasons for these failures. Can you please help me with this?

Hey @Rohit-code14, this is the failure 🙂

log_level_test.go:140:
    Error Trace:    /opt/buildkite-agent/builds/bk-agent-prod-gcp-1734443215713220624/elastic/elastic-agent-extended-testing-bk/testing/integration/log_level_test.go:140
                    /opt/buildkite-agent/builds/bk-agent-prod-gcp-1734443215713220624/elastic/elastic-agent-extended-testing-bk/testing/integration/log_level_test.go:76
    Error:          Condition never satisfied
    Test:           TestSetLogLevelFleetManaged
    Messages:       agent never communicated agent-specific log level "debug" to Fleet

Hey, I don't think it's due to my changes. What should I do now? 🤔

@pkoutsovasilis
Contributor

Hey, I don't think it's due to my changes. What should I do now? 🤔

I don’t believe this is caused by your changes either, but let me double-check with the team and get back to you. 🙂

@Rohit-code14
Author

Hey, I don't think it's due to my changes. What should I do now? 🤔

I don’t believe this is caused by your changes either, but let me double-check with the team and get back to you. 🙂

Thanks!
One small query: should we pull changes from the main branch after raising a PR? Pulling triggers the workflow actions every time. I would usually pull in the main branch changes, but I am not sure what is followed here. Sorry if this is a basic question; I am new to open-source contribution.

@ycombinator
Contributor

One small query: should we pull changes from the main branch after raising a PR?

It's acceptable to merge / rebase main into PRs, particularly when CI is failing due to reasons unrelated to the changes in your PR. There's a chance the failure is also happening on main and, if it has been fixed there, it would make sense to pull that fix into your PR's branch.

@kaanyalti
Contributor

@Rohit-code14 Could you please resolve the merge conflicts?

@Rohit-code14
Author

@Rohit-code14 Could you please resolve the merge conflicts?

Done!

@pkoutsovasilis
Contributor

/test


Quality Gate failed

Failed conditions
12.5% Coverage on New Code (required ≥ 40%)

See analysis details on SonarQube

@Rohit-code14
Author

Quality Gate failed

Failed conditions: 12.5% Coverage on New Code (required ≥ 40%)

See analysis details on SonarQube

Should anything be done from my side for this?

@kaanyalti
Contributor

@Rohit-code14 I'm going to look into why SonarQube is complaining. You may have to increase code coverage, although right now what you have implemented looks good to me.

@Rohit-code14
Author

@Rohit-code14 I'm going to look into why SonarQube is complaining. You may have to increase code coverage, although right now what you have implemented looks good to me.

Okay, let me know once you have checked it.

Labels
backport-8.x Automated backport to the 8.x branch with mergify
Development

Successfully merging this pull request may close these issues.

Add skip audit/unenroll flag to uninstall command
5 participants