Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Target allocator Support for the Agent #245

Merged
merged 4 commits into from
Nov 4, 2024
Merged

Conversation

okankoAMZ
Copy link

@okankoAMZ okankoAMZ commented Oct 25, 2024

Title: Merge Collector Improvements and Bug Fixes

Description:
This pull request merges several improvements and bug fixes related to the CloudWatch Agent Collector and Target Allocator components. The changes include:

  1. Split Target Allocator to an internal package for better code organization and maintainability.
  2. Added collectorId injection to Target Allocator Manager to support multiple collectors.
  3. Fixed an issue where a missing marshaler in the Prometheus receiver caused failures during data translation when using YAML tags for Prometheus configuration.
  4. Improved the resilience of the Target Allocator Manager by removing the "return failure" on Start of the thread. This ensures that the Target Allocator Manager can keep trying to ping the Target Allocator pod, preventing the loss of Prometheus metrics in case of startup timing issues.

Testing:

  • Tested the changes with the corresponding Amazon CloudWatch Agent changes (PR #1390) and verified that metrics are being published correctly.
  • Manually tested the Target Allocator Manager changes and confirmed that it keeps retrying when the initial ping fails, ensuring no loss of metrics.

Related PRs:

okankoAMZ and others added 3 commits October 14, 2024 14:49
* Split target allocator into an internal package (open-telemetry#33223)

Fixes
open-telemetry#33146

* basic working copy

* Added k8 test back

---------

Co-authored-by: David Ashpole <[email protected]>
@okankoAMZ okankoAMZ marked this pull request as ready for review October 25, 2024 17:29
@okankoAMZ okankoAMZ requested a review from sky333999 October 25, 2024 17:30
sky333999
sky333999 previously approved these changes Oct 30, 2024
Copy link

@jefchien jefchien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's failing the linter. Please fix those changes. We want to keep it so most tests are passing.

Comment on lines +32 to +34
func getPodName() string {
return os.Getenv("POD_NAME")
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Seems like an unnecessary function. It's only ever called in one place and doesn't make things clearer.

if len(cfgMap) == 0 {
return nil
}
cfgMap["url"] = "http://placeholder" // we have to set it as else marshaling will fail
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should this be reset after unmarshalling?

@jefchien jefchien merged commit 2091941 into aws-cwa-dev Nov 4, 2024
141 of 146 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants