
ec2tagger: Unable to retrieve InstanceId. #367

Closed
rdonadono opened this issue Feb 19, 2022 · 8 comments
Labels
aws/eks Amazon Elastic Kubernetes Service Stale

Comments

@rdonadono

Hi team,

I'm trying to migrate my EKS metrics and logs from Prometheus to CloudWatch using this agent, but I'm running into a problem.

I followed this doc and, first of all, attached the CloudWatchAgentServerPolicy policy to my node groups' IAM role.
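For reference, this is the AWS CLI equivalent of what I did to attach the managed policy (the role name below is just a placeholder for my actual node group role):

# Attach the managed CloudWatchAgentServerPolicy to the node group's IAM role.
# <node-instance-role-name> is a placeholder.
aws iam attach-role-policy \
  --role-name <node-instance-role-name> \
  --policy-arn arn:aws:iam::aws:policy/CloudWatchAgentServerPolicy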

Then I ran this command as indicated in the doc.

ClusterName="<...>"
RegionName="<...>"
FluentBitHttpPort='2020'
FluentBitReadFromHead='Off'
[[ ${FluentBitReadFromHead} = 'On' ]] && FluentBitReadFromTail='Off'|| FluentBitReadFromTail='On'
[[ -z ${FluentBitHttpPort} ]] && FluentBitHttpServer='Off' || FluentBitHttpServer='On'
curl https://raw.githubusercontent.com/aws-samples/amazon-cloudwatch-container-insights/latest/k8s-deployment-manifest-templates/deployment-mode/daemonset/container-insights-monitoring/quickstart/cwagent-fluent-bit-quickstart.yaml | sed 's/{{cluster_name}}/'${ClusterName}'/;s/{{region_name}}/'${RegionName}'/;s/{{http_server_toggle}}/"'${FluentBitHttpServer}'"/;s/{{http_server_port}}/"'${FluentBitHttpPort}'"/;s/{{read_from_head}}/"'${FluentBitReadFromHead}'"/;s/{{read_from_tail}}/"'${FluentBitReadFromTail}'"/' | kubectl apply -f - 

After this step I checked whether the DaemonSet pods were running properly on my EKS cluster, but I see them restarting in a loop.
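In case it helps, this is roughly how I'm checking them (assuming the quickstart's default amazon-cloudwatch namespace; the pod name below is just an example):

# List the Container Insights pods deployed by the quickstart manifest.
kubectl get pods -n amazon-cloudwatch

# Inspect the logs of a restarting cloudwatch-agent pod (example pod name).
kubectl logs -n amazon-cloudwatch cloudwatch-agent-xxxxx --previous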

The logs of the cloudwatch-agent pods all show the same error:

2022/02/19 15:26:45 I! 2022/02/19 15:26:42 E! ec2metadata is not available
2022/02/19 15:26:42 I! attempt to access ECS task metadata to determine whether I'm running in ECS.
2022/02/19 15:26:43 W! retry [0/3], unable to get http response from http://169.254.170.2/v2/metadata, error: unable to get response from http://169.254.170.2/v2/metadata, error: Get "http://169.254.170.2/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2022/02/19 15:26:44 W! retry [1/3], unable to get http response from http://169.254.170.2/v2/metadata, error: unable to get response from http://169.254.170.2/v2/metadata, error: Get "http://169.254.170.2/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2022/02/19 15:26:45 W! retry [2/3], unable to get http response from http://169.254.170.2/v2/metadata, error: unable to get response from http://169.254.170.2/v2/metadata, error: Get "http://169.254.170.2/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
2022/02/19 15:26:45 I! access ECS task metadata fail with response unable to get response from http://169.254.170.2/v2/metadata, error: Get "http://169.254.170.2/v2/metadata": context deadline exceeded (Client.Timeout exceeded while awaiting headers), assuming I'm not running in ECS.
I! Detected the instance is OnPrem
2022/02/19 15:26:45 Reading json config file path: /opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json ...
/opt/aws/amazon-cloudwatch-agent/bin/default_linux_config.json does not exist or cannot read. Skipping it.
2022/02/19 15:26:45 Reading json config file path: /etc/cwagentconfig/..2022_02_19_15_26_37.192607950/cwagentconfig.json ...
2022/02/19 15:26:45 Find symbolic link /etc/cwagentconfig/..data 
2022/02/19 15:26:45 Find symbolic link /etc/cwagentconfig/cwagentconfig.json 
2022/02/19 15:26:45 Reading json config file path: /etc/cwagentconfig/cwagentconfig.json ...
Valid Json input schema.
Got Home directory: /root
No csm configuration found.
No metric configuration found.
Configuration validation first phase succeeded
 
2022/02/19 15:26:45 I! Config has been translated into TOML /opt/aws/amazon-cloudwatch-agent/etc/amazon-cloudwatch-agent.toml 
2022-02-19T15:26:45Z I! Starting AmazonCloudWatchAgent 1.247348.0
2022-02-19T15:26:45Z I! Loaded inputs: cadvisor k8sapiserver
2022-02-19T15:26:45Z I! Loaded aggregators: 
2022-02-19T15:26:45Z I! Loaded processors: ec2tagger k8sdecorator
2022-02-19T15:26:45Z I! Loaded outputs: cloudwatchlogs
2022-02-19T15:26:45Z I! Tags enabled: 
2022-02-19T15:26:45Z I! [agent] Config: Interval:1m0s, Quiet:false, Hostname:"ip-192-168-78-202.eu-central-1.compute.internal", Flush Interval:1s
2022-02-19T15:26:45Z I! [logagent] starting
2022-02-19T15:26:45Z I! [logagent] found plugin cloudwatchlogs is a log backend
2022-02-19T15:30:46Z E! [processors.ec2tagger] ec2tagger: Unable to retrieve InstanceId. This plugin must only be used on an EC2 instance
2022-02-19T15:30:46Z E! [telegraf] Error running agent: could not initialize processor ec2tagger: ec2tagger: Unable to retrieve InstanceId. This plugin must only be used on an EC2 instance

I have already checked whether this is a network problem: with this command, run from inside each node, I can retrieve the EC2 instance metadata correctly:

TOKEN=`curl -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600"` && curl -H "X-aws-ec2-metadata-token: $TOKEN" -v http://169.254.169.254/latest/meta-data/instance-id
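For completeness, the same check can also be run from inside a pod rather than on the node itself (the pod name and image below are arbitrary examples), since pod traffic may traverse an extra network hop on its way to IMDS compared to the node:

# Run the IMDSv2 token request from inside a throwaway pod.
kubectl run imds-test --rm -it --restart=Never --image=curlimages/curl -- sh -c '
  TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" \
    -H "X-aws-ec2-metadata-token-ttl-seconds: 21600") &&
  curl -s -H "X-aws-ec2-metadata-token: $TOKEN" \
    http://169.254.169.254/latest/meta-data/instance-id'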

The same kind of error shows up in the fluent-bit pods too.

AWS for Fluent Bit Container Image Version 2.10.0
Fluent Bit v1.6.8
* Copyright (C) 2019-2020 The Fluent Bit Authors
* Copyright (C) 2015-2018 Treasure Data
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2022/02/19 15:26:40] [ info] [engine] started (pid=1)
[2022/02/19 15:26:40] [ info] [storage] version=1.0.6, initializing...
[2022/02/19 15:26:40] [ info] [storage] root path '/var/fluent-bit/state/flb-storage/'
[2022/02/19 15:26:40] [ info] [storage] normal synchronization mode, checksum disabled, max_chunks_up=128
[2022/02/19 15:26:40] [ info] [storage] backlog input plugin: storage_backlog.8
[2022/02/19 15:26:40] [ info] [input:systemd:systemd.3] seek_cursor=s=82a20e741bc74377ba38eb0d776ad4dd;i=cb7... OK
[2022/02/19 15:26:40] [ info] [input:storage_backlog:storage_backlog.8] queue memory limit: 4.8M
[2022/02/19 15:26:40] [ info] [input:storage_backlog:storage_backlog.8] register tail.0/1-1645284109.541128008.flb
[2022/02/19 15:26:40] [ info] [input:storage_backlog:storage_backlog.8] register tail.0/1-1645284111.242931767.flb
[2022/02/19 15:26:40] [ info] [input:storage_backlog:storage_backlog.8] register tail.0/1-1645284111.243140450.flb
[2022/02/19 15:26:40] [ info] [input:storage_backlog:storage_backlog.8] register tail.0/1-1645284111.922505134.flb
[2022/02/19 15:26:40] [ info] [input:storage_backlog:storage_backlog.8] register tail.0/1-1645284112.23870133.flb
[2022/02/19 15:26:40] [ info] [input:storage_backlog:storage_backlog.8] register tail.0/1-1645284112.284614847.flb
[2022/02/19 15:26:40] [ info] [input:storage_backlog:storage_backlog.8] register tail.0/1-1645284116.263762778.flb
[2022/02/19 15:26:40] [ info] [input:storage_backlog:storage_backlog.8] register tail.0/1-1645284116.263971979.flb
[2022/02/19 15:26:40] [ info] [input:storage_backlog:storage_backlog.8] register tail.0/1-1645284116.284565147.flb
[2022/02/19 15:26:40] [ info] [input:storage_backlog:storage_backlog.8] register tail.0/1-1645284116.646807894.flb
[2022/02/19 15:26:40] [ info] [input:storage_backlog:storage_backlog.8] register tail.0/1-1645284116.647037408.flb
[2022/02/19 15:26:40] [ info] [input:storage_backlog:storage_backlog.8] register tail.0/1-1645284116.647200365.flb
[2022/02/19 15:26:40] [ info] [filter:kubernetes:kubernetes.0] https=1 host=kubernetes.default.svc port=443
[2022/02/19 15:26:40] [ info] [filter:kubernetes:kubernetes.0] local POD info OK
[2022/02/19 15:26:40] [ info] [filter:kubernetes:kubernetes.0] testing connectivity with API server...
[2022/02/19 15:26:45] [ info] [filter:kubernetes:kubernetes.0] API server connectivity OK
[2022/02/19 15:26:45] [ info] [http_server] listen iface=0.0.0.0 tcp_port=2020
[2022/02/19 15:26:45] [ info] [sp] stream processor started
[2022/02/19 15:26:45] [ info] [input:tail:tail.0] inotify_fs_add(): inode=108010284 watch_fd=1 name=/var/log/containers/aws-load-balancer-controller-859586cf74-rt9ls_kube-system_aws-load-balancer-controller-562ab1af9253b5ca83a3c8acef612683698b7f7ce6ac89da42c1d1277c181f00.log
[...]
[2022/02/19 15:26:46] [error] [filter:aws:aws.2] Could not retrieve ec2 metadata from IMDS
[2022/02/19 15:26:46] [error] [filter:aws:aws.2] Could not retrieve ec2 metadata from IMDS
[2022/02/19 15:26:46] [error] [filter:aws:aws.2] Could not retrieve ec2 metadata from IMDS
[2022/02/19 15:26:46] [error] [filter:aws:aws.2] Could not retrieve ec2 metadata from IMDS
[2022/02/19 15:26:46] [error] [filter:aws:aws.2] Could not retrieve ec2 metadata from IMDS
[2022/02/19 15:26:46] [error] [filter:aws:aws.2] Could not retrieve ec2 metadata from IMDS
[2022/02/19 15:26:46] [error] [filter:aws:aws.2] Could not retrieve ec2 metadata from IMDS
[2022/02/19 15:26:46] [error] [filter:aws:aws.2] Could not retrieve ec2 metadata from IMDS
[2022/02/19 15:26:46] [error] [filter:aws:aws.2] Could not retrieve ec2 metadata from IMDS
[2022/02/19 15:26:46] [ info] [input:tail:tail.4] inotify_fs_add(): inode=52448822 watch_fd=1 name=/var/log/containers/aws-node-mzs4r_kube-system_aws-node-b2ae85e13ca72e02a42ffd3d1832a691a037355de0770945feec31894f27ef3a.log
[...]
[2022/02/19 15:26:46] [error] [filter:aws:aws.3] Could not retrieve ec2 metadata from IMDS
[2022/02/19 15:26:46] [ info] [input:storage_backlog:storage_backlog.8] queueing tail.0:1-1645284109.541128008.flb
[2022/02/19 15:26:46] [ info] [input:storage_backlog:storage_backlog.8] queueing tail.0:1-1645284111.242931767.flb
[2022/02/19 15:26:46] [ info] [input:storage_backlog:storage_backlog.8] queueing tail.0:1-1645284111.243140450.flb
[2022/02/19 15:26:46] [ info] [input:storage_backlog:storage_backlog.8] queueing tail.0:1-1645284111.922505134.flb
[2022/02/19 15:26:46] [ info] [input:storage_backlog:storage_backlog.8] queueing tail.0:1-1645284112.23870133.flb
[2022/02/19 15:26:46] [ info] [input:storage_backlog:storage_backlog.8] queueing tail.0:1-1645284112.284614847.flb
[2022/02/19 15:26:46] [ info] [input:storage_backlog:storage_backlog.8] queueing tail.0:1-1645284116.263762778.flb
[2022/02/19 15:26:46] [ info] [input:storage_backlog:storage_backlog.8] queueing tail.0:1-1645284116.263971979.flb
[2022/02/19 15:26:46] [ info] [input:storage_backlog:storage_backlog.8] queueing tail.0:1-1645284116.284565147.flb
[2022/02/19 15:26:46] [ info] [input:storage_backlog:storage_backlog.8] queueing tail.0:1-1645284116.646807894.flb
[2022/02/19 15:26:46] [ info] [input:storage_backlog:storage_backlog.8] queueing tail.0:1-1645284116.647037408.flb
[2022/02/19 15:26:46] [ info] [input:storage_backlog:storage_backlog.8] queueing tail.0:1-1645284116.647200365.flb
[2022/02/19 15:26:47] [error] [filter:aws:aws.2] Could not retrieve ec2 metadata from IMDS
[2022/02/19 15:26:47] [error] [filter:aws:aws.3] Could not retrieve ec2 metadata from IMDS
[2022/02/19 15:26:50] [ info] [output:cloudwatch_logs:cloudwatch_logs.0] Creating log group /aws/containerinsights/<...>/application
[2022/02/19 15:26:50] [error] [aws_credentials] Could not read shared credentials file /root/.aws/credentials
[2022/02/19 15:26:50] [error] [aws_credentials] Failed to retrieve credentials for AWS Profile default
[2022/02/19 15:26:50] [ warn] [aws_credentials] No cached credentials are available and a credential refresh is already in progress. The current co-routine will retry.
[2022/02/19 15:26:50] [error] [signv4] Provider returned no credentials, service=logs
[2022/02/19 15:26:50] [error] [aws_client] could not sign request

I have an EKS v1.20 cluster created by eksctl, with 2 node groups: one On-Demand and one Spot, both with the same configuration.

What can I do to understand the problem?

Thanks!

@rdonadono
Author

I believe I have found the source of the problem in this issue.

@github-actions
Contributor

This issue was marked stale due to lack of activity.

@github-actions github-actions bot added the Stale label May 21, 2022
@SaxyPandaBear SaxyPandaBear added aws/eks Amazon Elastic Kubernetes Service and removed Stale labels May 21, 2022
@github-actions
Contributor

This issue was marked stale due to lack of activity.

@dtna7

dtna7 commented Nov 1, 2022

I have the exact same issue: new Launch Templates recently started disabling IMDSv1 and only enabling IMDSv2. Is there an option we can set to disable or bypass the reliance on IMDS?
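The workaround I'm looking at in the meantime (a standard EC2 setting, not something from this agent) is raising the IMDS hop limit on the instances or in the launch template so pods can still reach IMDSv2, e.g.:

# Raise the IMDSv2 hop limit so requests coming from pods (one hop further
# away than the node) still get a response. The instance ID is a placeholder.
aws ec2 modify-instance-metadata-options \
  --instance-id <instance-id> \
  --http-tokens required \
  --http-put-response-hop-limit 2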

@Tomer20

Tomer20 commented Dec 4, 2022

I'm experiencing the same issue. Any progress on this one?

@wolviecb

wolviecb commented Dec 5, 2022

I get the same issue when the launch template only enables IMDSv2.

@github-actions
Contributor

github-actions bot commented Mar 6, 2023

This issue was marked stale due to lack of activity.

@github-actions github-actions bot added the Stale label Mar 6, 2023
@github-actions
Contributor

Closing this because it has stalled. Feel free to reopen if this issue is still relevant, or to ping the collaborator who labeled it stalled if you have any questions.

@github-actions github-actions bot closed this as not planned Apr 17, 2023