[otel] add loadbalancing exporter component #6315

Merged

rogercoll
Contributor

What does this PR do?

Adds the Otel loadbalancing exporter to the elastic-agent.

Why is it important?

It makes it possible to route logs, metrics, or traces to different OTLP endpoints depending on the configured routing_key. This is helpful when distributing the processing load among different collectors. For example, APM OTLP data is currently processed directly with the lsminterval, elastictrace and signaltometrics components, but all of those components could be moved from the collection/edge collectors to a set of collectors that only handle data transformations. One of the main benefits of the latter approach is the separation of concerns: the processing collectors would be part of another resource entity that would be managed independently.

Sample configuration that routes traces based on their service.name key:

      exporters:
        loadbalancing/traces:
          routing_key: "service"
          protocol:
            otlp:
          resolver:
            # use k8s service resolver, if collector runs in kubernetes environment
            k8s:
              service: lb-svc.kube-public

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in ./changelog/fragments using the changelog tool
  • I have added an integration test or an E2E test

Disruptive User Impact

How to test this PR locally

Related issues

Questions to ask yourself

  • How are we going to support this in production?
  • How are we going to measure its adoption?
  • How are we going to debug this?
  • What are the metrics I should take care of?
  • ...

@rogercoll rogercoll requested a review from a team as a code owner December 12, 2024 17:35
Contributor

mergify bot commented Dec 12, 2024

This pull request does not have a backport label. Could you fix it @rogercoll? 🙏
To fixup this pull request, you need to add the backport labels for the needed
branches, such as:

  • backport-./d./d is the label to automatically backport to the 8./d branch. /d is the digit

Contributor

mergify bot commented Dec 12, 2024

backport-v8.x has been added to help with the transition to the new branch 8.x.
If you don't need it please use backport-skip label and remove the backport-8.x label.

@mergify mergify bot added the backport-8.x Automated backport to the 8.x branch with mergify label Dec 12, 2024
Contributor

@swiatekm swiatekm left a comment

👍

Just a clarification, you say:

It makes it possible to route logs, metrics, or traces to different OTLP endpoints depending on the configured routing_key. This is helpful when distributing the processing load among different collectors. For example, APM OTLP data is currently processed directly with the lsminterval, elastictrace and signaltometrics components, but all of those components could be moved from the collection/edge collectors to a set of collectors that only handle data transformations. One of the main benefits of the latter approach is the separation of concerns: the processing collectors would be part of another resource entity that would be managed independently.

For routing data to different OTLP endpoints, the routing connector is sufficient and much simpler. What you really want the load balancing exporter for is ensuring that data for a given partition (based on an attribute, or context) is always routed to the same instance. That's required for tail sampling traces and for global metrics aggregations, but not for data routing in general.

Is that correct, or am I misunderstanding what we need this component for?

@rogercoll
Contributor Author

What you really want the load balancing exporter for is ensuring that data for a given partition (based on an attribute, or context) is always routed to the same instance.

Correct, thanks for the clarification! At the moment, we are running some tests that use the "service.name" resource attribute as the routing key, so that each service's metrics are aggregated in the same backend collector.

This is the configuration:

        loadbalancing/other:
          routing_key: "service"
          protocol:
            otlp:
              # all options from the OTLP exporter are supported
              # except the endpoint
              timeout: 1s
          resolver:
            # use k8s service resolver, if collector runs in kubernetes environment
            k8s:
              service: opentelemetry-kube-stack-lb-collector-headless
              ports:
                - 4317

With this loadbalancing configuration, we expect each service's data to be forwarded to one specific collector instance behind the headless K8s service.
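
To illustrate that expectation, here is a minimal Go sketch of key-based sticky routing. It is not the exporter's actual implementation: the function pickBackend, the service names, and the backend addresses are all hypothetical, and a plain FNV hash modulo the backend count is assumed for simplicity. A real load balancer would typically use a scheme that moves fewer keys when the backend list changes.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// pickBackend maps a routing key (here, a service name) to one backend.
// Hypothetical helper for illustration only: the same key always yields
// the same backend as long as the backend list does not change.
func pickBackend(key string, backends []string) string {
	if len(backends) == 0 {
		return "" // nothing to route to
	}
	h := fnv.New32a()
	h.Write([]byte(key)) // Write on a hash.Hash never returns an error
	return backends[h.Sum32()%uint32(len(backends))]
}

func main() {
	// Assumed addresses, standing in for the pods behind a headless service.
	backends := []string{"10.0.0.1:4317", "10.0.0.2:4317", "10.0.0.3:4317"}

	// Repeated service names are included on purpose to show stickiness.
	for _, svc := range []string{"checkout", "cart", "checkout", "payments", "cart"} {
		fmt.Printf("%-10s -> %s\n", svc, pickBackend(svc, backends))
	}
}
```

Running it shows repeated keys such as checkout landing on the same address every time, which is the property needed for per-service aggregation.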

@swiatekm
Contributor

@rogercoll could you add a changelog fragment and rebase? We've had some issues with the CI today, and the current failure should clear if you do that.

@rogercoll rogercoll force-pushed the add_loadbalancer_otel_exporter branch from bada6c6 to 23e7e58 Compare December 16, 2024 16:24
@swiatekm swiatekm added the Team:Elastic-Agent-Control-Plane Label for the Agent Control Plane team label Dec 31, 2024
@elasticmachine
Contributor

Pinging @elastic/elastic-agent-control-plane (Team:Elastic-Agent-Control-Plane)

@michalpristas michalpristas merged commit dbfd447 into elastic:main Jan 2, 2025
14 checks passed
@michalpristas michalpristas added the backport-8.17 Automated backport with mergify label Jan 2, 2025
mergify bot pushed a commit that referenced this pull request Jan 2, 2025
* feat: add loadbalancing Otel exporter

* chore: add fragments changelog file

---------

Co-authored-by: Andrzej Stencel
(cherry picked from commit dbfd447)
mergify bot pushed a commit that referenced this pull request Jan 2, 2025
* feat: add loadbalancing Otel exporter

* chore: add fragments changelog file

---------

Co-authored-by: Andrzej Stencel
(cherry picked from commit dbfd447)

# Conflicts:
#	go.mod
#	internal/pkg/otel/components.go
@rogercoll rogercoll deleted the add_loadbalancer_otel_exporter branch January 2, 2025 12:08
michalpristas pushed a commit that referenced this pull request Jan 2, 2025
* feat: add loadbalancing Otel exporter

* chore: add fragments changelog file

---------

Co-authored-by: Andrzej Stencel
(cherry picked from commit dbfd447)

Co-authored-by: Roger Coll
michalpristas added a commit that referenced this pull request Jan 2, 2025
…6468)

* [otel] add loadbalancing exporter component (#6315)

* feat: add loadbalancing Otel exporter

* chore: add fragments changelog file

---------

Co-authored-by: Andrzej Stencel
(cherry picked from commit dbfd447)

# Conflicts:
#	go.mod
#	internal/pkg/otel/components.go

* conflicts

---------

Co-authored-by: Roger Coll
Co-authored-by: Michal Pristas
Labels
  • backport-8.x: Automated backport to the 8.x branch with mergify
  • backport-8.17: Automated backport with mergify
  • Team:Elastic-Agent-Control-Plane: Label for the Agent Control Plane team
5 participants