Adding integ test for agent otel config merging #423

Closed

@Paramadon Paramadon commented Oct 15, 2024

Description of the issue

This PR adds an integration test for OTEL YAML config merging. The goal is to catch any regression in which the agent's YAML merging stops working. Passing test run: https://github.com/aws/amazon-cloudwatch-agent/actions/runs/11353738985/job/31579557585

Passing test

[Screenshot of the passing test run, 2024-10-15 5:43 PM]

Related amazon-cloudwatch-agent PR: aws/amazon-cloudwatch-agent#1391

Description of changes

Test workflow

  1. Starts the agent with the config.json shown below, using the StartAgent() function.
  2. Calls sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a append-config -s -m ec2 -c file:./resources/otel.yaml
  3. Sends a metric with curl -X POST -H "Content-Type: application/json" -d @metrics.json -i localhost:4318/v1/metrics
    The OTLP receiver listens on this endpoint and forwards the metric to CloudWatch through the awsemf exporter.
  4. Confirms that the metric appears in the console.

Agent configuration (one metric is collected so that the agent generates a YAML file to merge into):

{
  "metrics": {
    "metrics_collected": {
      "cpu": {
        "measurement": [
          "cpu_usage_idle"
        ]
      }
    }
  }
}

Agent YAML:

exporters:
    awscloudwatch:
        force_flush_interval: 1m0s
        max_datums_per_call: 1000
        max_values_per_datum: 150
        middleware: agenthealth/metrics
        namespace: CWAgent
        region: us-west-2
        resource_to_telemetry_conversion:
            enabled: true
extensions:
    agenthealth/metrics:
        is_usage_data_enabled: true
        stats:
            operations:
                - PutMetricData
            usage_flags:
                mode: EC2
                region_type: EC2M
receivers:
    telegraf_cpu:
        collection_interval: 1m0s
        initial_delay: 1s
        timeout: 0s
service:
    extensions:
        - agenthealth/metrics
    pipelines:
        metrics/host:
            exporters:
                - awscloudwatch
            processors: []
            receivers:
                - telegraf_cpu
    telemetry:
        logs:
            development: false
            disable_caller: false
            disable_stacktrace: false
            encoding: console
            level: info
            output_paths:
                - /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log
            sampling:
                enabled: true
                initial: 2
                thereafter: 500
                tick: 10s
        metrics:
            address: ""
            level: None
        traces: {}

OTEL YAML:

receivers:
  otlp:
    protocols:
      grpc:
      http:

exporters:
  awsemf/otel-merging:
    namespace: "CWAgent-testing-otel"
    log_group_name: "CWA"
    dimension_rollup_option: "NoDimensionRollup"
    log_stream_name: "Testing-otel"
    resource_to_telemetry_conversion:
      enabled: true
    version: "0"

service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [awsemf/otel-merging]

Merged OTEL configuration:

exporters:
    awscloudwatch:
        force_flush_interval: 1m0s
        max_datums_per_call: 1000
        max_values_per_datum: 150
        middleware: agenthealth/metrics
        namespace: CWAgent
        region: us-west-2
        resource_to_telemetry_conversion:
            enabled: true
    awsemf/otel-merging:
        certificate_file_path: ""
        detailed_metrics: false
        dimension_rollup_option: NoDimensionRollup
        disable_metric_extraction: false
        eks_fargate_container_insights_enabled: false
        endpoint: ""
        enhanced_container_insights: false
        imds_retries: 0
        local_mode: false
        log_group_name: CWA
        log_retention: 0
        log_stream_name: Testing-otel
        max_retries: 2
        namespace: CWAgent-testing-otel
        no_verify_ssl: false
        num_workers: 8
        output_destination: cloudwatch
        profile: ""
        proxy_address: ""
        region: ""
        request_timeout_seconds: 30
        resource_arn: ""
        resource_to_telemetry_conversion:
            enabled: true
        retain_initial_value_of_delta_metric: false
        role_arn: ""
        version: "0"
extensions:
    agenthealth/metrics:
        is_usage_data_enabled: true
        stats:
            operations:
                - PutMetricData
            usage_flags:
                mode: EC2
                region_type: EC2M
receivers:
    otlp:
        protocols:
            grpc:
                dialer:
                    timeout: 0s
                endpoint: 0.0.0.0:4317
                include_metadata: false
                max_concurrent_streams: 0
                max_recv_msg_size_mib: 0
                read_buffer_size: 524288
                transport: tcp
                write_buffer_size: 0
            http:
                endpoint: 0.0.0.0:4318
                include_metadata: false
                logs_url_path: /v1/logs
                max_request_body_size: 0
                metrics_url_path: /v1/metrics
                traces_url_path: /v1/traces
    telegraf_cpu:
        collection_interval: 1m0s
        initial_delay: 1s
        timeout: 0s
service:
    extensions:
        - agenthealth/metrics
    pipelines:
        metrics:
            exporters:
                - awsemf/otel-merging
            receivers:
                - otlp
        metrics/host:
            exporters:
                - awscloudwatch
            processors: []
            receivers:
                - telegraf_cpu
    telemetry:
        logs:
            development: false
            disable_caller: false
            disable_stacktrace: false
            encoding: console
            error_output_paths:
                - stderr
            level: info
            output_paths:
                - /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log
            sampling:
                enabled: true
                initial: 2
                thereafter: 500
                tick: 10s
        metrics:
            address: ""
            level: None
        traces: {}
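
The merged output above keeps both the metrics/host pipeline from the agent YAML and the metrics pipeline from the OTEL YAML. The sketch below illustrates that behaviour with a generic recursive map merge. It is not the agent's actual merge implementation: mergeMaps is a hypothetical function, and it makes the simplifying assumption that any non-map value (including lists) from the second config overwrites the first.

```go
package main

import (
	"fmt"
	"sort"
)

// mergeMaps recursively copies src into dst. Nested maps are merged key by
// key; any other value in src overwrites the value in dst.
// (Illustrative only; the agent's real merge logic may differ.)
func mergeMaps(dst, src map[string]any) map[string]any {
	for key, srcVal := range src {
		srcMap, srcIsMap := srcVal.(map[string]any)
		dstMap, dstIsMap := dst[key].(map[string]any)
		if srcIsMap && dstIsMap {
			dst[key] = mergeMaps(dstMap, srcMap)
			continue
		}
		dst[key] = srcVal
	}
	return dst
}

func main() {
	// Trimmed-down version of the agent YAML above.
	agent := map[string]any{
		"service": map[string]any{
			"pipelines": map[string]any{
				"metrics/host": map[string]any{
					"receivers": []string{"telegraf_cpu"},
					"exporters": []string{"awscloudwatch"},
				},
			},
		},
	}
	// Trimmed-down version of the OTEL YAML above.
	otel := map[string]any{
		"service": map[string]any{
			"pipelines": map[string]any{
				"metrics": map[string]any{
					"receivers": []string{"otlp"},
					"exporters": []string{"awsemf/otel-merging"},
				},
			},
		},
	}

	merged := mergeMaps(agent, otel)
	pipelines := merged["service"].(map[string]any)["pipelines"].(map[string]any)

	// Sort the names so the printed order is deterministic.
	names := make([]string, 0, len(pipelines))
	for name := range pipelines {
		names = append(names, name)
	}
	sort.Strings(names)
	fmt.Println(names)
}
```

Both pipeline names are printed, which matches the shape of the merged configuration: the appended config adds to the existing one instead of replacing it.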

License

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Tests

Manually tested on an EC2 instance by running these commands with the above configurations:

sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -m ec2 -a stop
sudo rm /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.yaml 
sudo rm /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log 
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a fetch-config -s -m ec2 -c file:/opt/aws/amazon-cloudwatch-agent/bin/config.json
sudo /opt/aws/amazon-cloudwatch-agent/bin/amazon-cloudwatch-agent-ctl -a append-config -s -m ec2 -c file:/home/ec2-user/test.yaml
sleep 5
cat /opt/aws/amazon-cloudwatch-agent/logs/amazon-cloudwatch-agent.log 
go run goScript.go # sends a POST request to the OTLP receiver to add metrics
