Adding integ test for otel yaml merging #424

Merged on Oct 24, 2024 (11 commits)

@Paramadon Paramadon commented Oct 15, 2024

Description of the issue

This PR adds an integration test for OTel YAML merging. The goal is to catch any regression in which YAML merging in the agent stops working. Passing test run: https://github.com/aws/amazon-cloudwatch-agent/actions/runs/11404848694/job/31735108673

Passing test

[Screenshot of the passing test run, 2024-10-15]

Amazon-cloudwatch-agent repo pr: aws/amazon-cloudwatch-agent#1391

Description of changes

Test workflow

  1. Starts the agent with the config.json shown below, using the StartAgent() function.
  2. Calls sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a append-config -s -m ec2 -c file:./resources/otel.yaml
  3. Sends a metrics payload with curl -X POST -H "Content-Type: application/json" -d @metrics.json -i localhost:4318/v1/metrics
    The OTLP receiver listens on this endpoint and sends the metrics to CloudWatch through the awsemf exporter.
  4. Confirms that the metric is in the console.
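
For reference, step 3 can be sketched in Go as a rough equivalent of the curl call. This is only an illustration, not the code in this PR: `buildMetricsRequest` and `samplePayload` are hypothetical names, and the payload is a made-up minimal OTLP/JSON body standing in for metrics.json, whose real contents are not shown here.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"time"
)

// samplePayload is a hypothetical OTLP/JSON metrics body (assumption: the
// real metrics.json used by the test may look different).
const samplePayload = `{"resourceMetrics":[{"scopeMetrics":[{"metrics":[` +
	`{"name":"otel_merging_test_metric","gauge":{"dataPoints":[{"asDouble":1.0}]}}]}]}]}`

// buildMetricsRequest mirrors:
//   curl -X POST -H "Content-Type: application/json" -d @metrics.json <endpoint>
func buildMetricsRequest(endpoint string, payload []byte) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodPost, endpoint, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// 4318 is the OTLP HTTP port that appears in the merged configuration.
	req, err := buildMetricsRequest("http://localhost:4318/v1/metrics", []byte(samplePayload))
	if err != nil {
		fmt.Println("could not build request:", err)
		return
	}
	client := &http.Client{Timeout: 5 * time.Second}
	resp, err := client.Do(req)
	if err != nil {
		// Expected when no agent is listening on localhost:4318.
		fmt.Println("request failed (is the agent running?):", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("status:", resp.Status)
}
```

Building the request in a separate function keeps the curl-equivalent part checkable without a running agent.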

Agent json configuration

{
  "metrics": {
    "metrics_collected": {
      "cpu": {
        "measurement": [
          "cpu_usage_idle"
        ]
      }
    }
  }
}

Agent Yaml:

exporters:
    awscloudwatch:
        force_flush_interval: 1m0s
        max_datums_per_call: 1000
        max_values_per_datum: 150
        middleware: agenthealth/metrics
        namespace: CWAgent
        region: us-west-2
        resource_to_telemetry_conversion:
            enabled: true
extensions:
    agenthealth/metrics:
        is_usage_data_enabled: true
        stats:
            operations:
                - PutMetricData
            usage_flags:
                mode: EC2
                region_type: EC2M
receivers:
    telegraf_cpu:
        collection_interval: 1m0s
        initial_delay: 1s
        timeout: 0s
service:
    extensions:
        - agenthealth/metrics
    pipelines:
        metrics/host:
            exporters:
                - awscloudwatch
            processors: []
            receivers:
                - telegraf_cpu
    telemetry:
        logs:
            development: false
            disable_caller: false
            disable_stacktrace: false
            encoding: console
            level: info
            output_paths:
                - /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log
            sampling:
                enabled: true
                initial: 2
                thereafter: 500
                tick: 10s
        metrics:
            address: ""
            level: None
        traces: {}

Otel Yaml

receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  awsemf/otel-merging:
    namespace: "CWAgent-testing-otel"
    log_group_name: "CWA"
    dimension_rollup_option: "NoDimensionRollup"
    log_stream_name: "Testing-otel"
    resource_to_telemetry_conversion:
      enabled: true
    version: "0"

extensions:
  health_check:

service:
  extensions:
    - health_check
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [awsemf/otel-merging]

Merged OTEL configuration:

exporters:
    awscloudwatch:
        force_flush_interval: 1m0s
        max_datums_per_call: 1000
        max_values_per_datum: 150
        middleware: agenthealth/metrics
        namespace: CWAgent
        region: us-west-2
        resource_to_telemetry_conversion:
            enabled: true
    awsemf/otel-merging:
        certificate_file_path: ""
        detailed_metrics: false
        dimension_rollup_option: NoDimensionRollup
        disable_metric_extraction: false
        eks_fargate_container_insights_enabled: false
        endpoint: ""
        enhanced_container_insights: false
        imds_retries: 0
        local_mode: false
        log_group_name: CWA
        log_retention: 0
        log_stream_name: Testing-otel
        max_retries: 2
        namespace: CWAgent-testing-otel
        no_verify_ssl: false
        num_workers: 8
        output_destination: cloudwatch
        profile: ""
        proxy_address: ""
        region: ""
        request_timeout_seconds: 30
        resource_arn: ""
        resource_to_telemetry_conversion:
            enabled: true
        retain_initial_value_of_delta_metric: false
        role_arn: ""
        version: "0"
extensions:
    agenthealth/metrics:
        is_usage_data_enabled: true
        stats:
            operations:
                - PutMetricData
            usage_flags:
                mode: EC2
                region_type: EC2M
    health_check:
        check_collector_pipeline:
            enabled: false
            exporter_failure_threshold: 5
            interval: 5m
        endpoint: 0.0.0.0:13133
        include_metadata: false
        max_request_body_size: 0
        path: /
receivers:
    otlp:
        protocols:
            grpc:
                dialer:
                    timeout: 0s
                endpoint: 0.0.0.0:4317
                include_metadata: false
                max_concurrent_streams: 0
                max_recv_msg_size_mib: 0
                read_buffer_size: 524288
                transport: tcp
                write_buffer_size: 0
            http:
                endpoint: 0.0.0.0:4318
                include_metadata: false
                logs_url_path: /v1/logs
                max_request_body_size: 0
                metrics_url_path: /v1/metrics
                traces_url_path: /v1/traces
    telegraf_cpu:
        collection_interval: 1m0s
        initial_delay: 1s
        timeout: 0s
service:
    extensions:
        - agenthealth/metrics
        - health_check
    pipelines:
        metrics:
            exporters:
                - awsemf/otel-merging
            receivers:
                - otlp
        metrics/host:
            exporters:
                - awscloudwatch
            processors: []
            receivers:
                - telegraf_cpu
    telemetry:
        logs:
            development: false
            disable_caller: false
            disable_stacktrace: false
            encoding: console
            error_output_paths:
                - stderr
            level: info
            output_paths:
                - /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log
            sampling:
                enabled: true
                initial: 2
                thereafter: 500
                tick: 10s
        metrics:
            address: ""
            level: None
        traces: {}
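
To make the shape of the merged result easier to reason about, here is a minimal Go sketch of a deep merge that reproduces what is visible above: maps are merged key by key, and lists such as service.extensions are concatenated. This is not the agent's merge implementation; `mergeConfig` is a hypothetical helper, and the list-concatenation rule is an assumption inferred only from the merged output shown in this PR.

```go
package main

import "fmt"

// mergeConfig returns a new map with overlay merged into base.
// Assumed rules (inferred from the merged output, not from the agent code):
// nested maps merge recursively, lists are concatenated, and any other
// value in overlay replaces the value in base.
func mergeConfig(base, overlay map[string]any) map[string]any {
	out := make(map[string]any, len(base))
	for k, v := range base {
		out[k] = v
	}
	for k, ov := range overlay {
		bv, exists := out[k]
		if !exists {
			out[k] = ov
			continue
		}
		switch b := bv.(type) {
		case map[string]any:
			if o, ok := ov.(map[string]any); ok {
				out[k] = mergeConfig(b, o)
				continue
			}
		case []any:
			if o, ok := ov.([]any); ok {
				merged := append([]any{}, b...)
				out[k] = append(merged, o...)
				continue
			}
		}
		out[k] = ov // scalar or mismatched types: overlay wins
	}
	return out
}

func main() {
	agent := map[string]any{"service": map[string]any{
		"extensions": []any{"agenthealth/metrics"},
		"pipelines":  map[string]any{"metrics/host": map[string]any{}},
	}}
	otel := map[string]any{"service": map[string]any{
		"extensions": []any{"health_check"},
		"pipelines":  map[string]any{"metrics": map[string]any{}},
	}}
	svc := mergeConfig(agent, otel)["service"].(map[string]any)
	fmt.Println(svc["extensions"])                       // [agenthealth/metrics health_check]
	fmt.Println(len(svc["pipelines"].(map[string]any))) // 2
}
```

The sketch leaves both inputs unmodified, which makes it convenient to compare the agent-only and merged views side by side.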


License

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Tests

Manually tested this on an EC2 instance by running these commands with the above configurations:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a stop
sudo rm /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.yaml 
sudo rm /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log 
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -s -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a append-config -s -m ec2 -c file:/home/ec2-user/test.yaml
sleep 5
cat /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log 
go run goScript.go # sends a POST (like the curl above) to the OTLP receiver to add metrics

GitHub Actions passing test run: https://github.com/aws/amazon-cloudwatch-agent/actions/runs/11353738985/job/31579557585


@zhihonl zhihonl left a comment

Can you link the successful integration test run?

test/agent_otel_merging/agent_otel_merging_test.go: two review threads (outdated, resolved)

zhihonl commented Oct 18, 2024

nit: Based on the test run (https://github.com/aws/amazon-cloudwatch-agent/actions/runs/11404848694/job/31735108673), I see this test is by itself under the EC2OtelMerging section. Do we anticipate more tests being added to this section? If not, should we just move this into EC2Linux?

@Paramadon Paramadon merged commit 29be5f4 into main Oct 24, 2024
2 checks passed
@Paramadon Paramadon deleted the agentOtelYamlMerging branch October 24, 2024 13:41

4 participants