aws: add support for EKS Pod Identities #9206

Open: wants to merge 3 commits into master

PettitWesley
Contributor


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change
  • Debug log output from testing the change
  • Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • Run local packaging test showing all targets (including any new ones) build.
  • Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • Documentation required for this feature

Backporting

  • Backport to latest stable release.

Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

    This change brings the http credential provider
    in line with the latest spec and adds support for:
    - EKS Pod Identity
      - validate/support EKS credential link local IP 169.254.170.23
    - Latest HTTP Provider spec:
      - AWS_CONTAINER_CREDENTIALS_RELATIVE_URI
      - AWS_CONTAINER_CREDENTIALS_FULL_URI
      - AWS_CONTAINER_AUTHORIZATION_TOKEN
      - AWS_CONTAINER_AUTHORIZATION_TOKEN_FILE

Signed-off-by: Wesley Pettit <[email protected]>
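
For readers unfamiliar with the host validation mentioned in the commit message, here is a minimal sketch of an allow-list check for plain-HTTP credential endpoints. It is not the code from this PR. The function name `host_is_allowed` is hypothetical, and the allow-list itself (loopback, the ECS address 169.254.170.2, and the EKS IPv6 address fd00:ec2::23 alongside 169.254.170.23) is an assumption based on AWS's published container credential provider behavior. Only 169.254.170.23 is named in this PR.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper, not Fluent Bit's API. Addresses other than
 * 169.254.170.23 are assumptions taken from AWS documentation. */
#define ECS_HOST_V4    0xA9FEAA02u     /* 169.254.170.2  */
#define EKS_HOST_V4    0xA9FEAA17u     /* 169.254.170.23 */
#define EKS_HOST_V6    "fd00:ec2::23"

/* Return 1 if `host` is an IP literal on the allow-list, 0 otherwise.
 * Host names (including "localhost") are rejected here because this
 * sketch performs no DNS resolution. */
static int host_is_allowed(const char *host)
{
    struct in_addr v4;
    struct in6_addr v6;
    struct in6_addr eks_v6;

    if (host == NULL) {
        return 0;
    }

    /* IPv4 literal: compare numerically rather than by string */
    if (inet_pton(AF_INET, host, &v4) == 1) {
        uint32_t addr = ntohl(v4.s_addr);

        if ((addr >> 24) == 127) {      /* loopback, 127.0.0.0/8 */
            return 1;
        }
        return addr == ECS_HOST_V4 || addr == EKS_HOST_V4;
    }

    /* IPv6 literal: loopback or the assumed EKS address */
    if (inet_pton(AF_INET6, host, &v6) == 1) {
        if (IN6_IS_ADDR_LOOPBACK(&v6)) {
            return 1;
        }
        if (inet_pton(AF_INET6, EKS_HOST_V6, &eks_v6) == 1 &&
            memcmp(&v6, &eks_v6, sizeof(v6)) == 0) {
            return 1;
        }
    }

    return 0;
}
```

Parsing with `inet_pton` before comparing avoids accepting look-alike host names such as `127.evil.com`, which a simple string-prefix check would let through.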
@PettitWesley
Contributor Author

@edsiper I tested these changes thoroughly on a new EKS cluster back in May. My change has unit tests, which pass. I just performed a simple rebase with no conflicts, so it should be safe to merge. Unfortunately, I am unable to test this again right now.

Please see my comments on the alternate (mostly the same) implementation: #9013 (review)

@edsiper
Member

edsiper commented Aug 27, 2024

@PettitWesley @iandrewt

In the branch eks-pod-identity3.0 I pushed some commits on top of this branch/PR to fix the leaks found. The patches, in order, are:

Remaining issues found with Valgrind:

valgrind --leak-check=full bin/flb-it-aws_credentials_http
Test test_http_validator_invalid_host...        [ FAILED ]
  aws_credentials_http.c:728: Check provider == NULL... failed
==964174== Warning: invalid file descriptor -1 in syscall close()
Test test_http_validator_invalid_port...        [ FAILED ]
  aws_credentials_http.c:757: Check provider == NULL... failed
==964174== Warning: invalid file descriptor -1 in syscall close()
FAILED: 2 of 11 unit tests have failed.
==964174==
==964174== HEAP SUMMARY:
==964174==     in use at exit: 19,024 bytes in 9 blocks
==964174==   total heap usage: 17,168 allocs, 17,159 frees, 1,808,409 bytes allocated
==964174==
==964174== 212 (96 direct, 116 indirect) bytes in 1 blocks are definitely lost in loss record 7 of 9
==964174==    at 0x484D953: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==964174==    by 0x16D064: flb_calloc (include/fluent-bit/flb_mem.h:95)
==964174==    by 0x16D47F: flb_endpoint_provider_create (src/aws/flb_aws_credentials_http.c:265)
==964174==    by 0x16DA26: flb_http_provider_create (src/aws/flb_aws_credentials_http.c:394)
==964174==    by 0x162EDD: test_http_validator_invalid_host (tests/internal/aws_credentials_http.c:727)
==964174==    by 0x1609B7: acutest_do_run_ (tests/internal/../lib/acutest/acutest.h:1034)
==964174==    by 0x15F68D: acutest_run_ (tests/internal/../lib/acutest/acutest.h:1205)
==964174==    by 0x15E329: main (tests/internal/../lib/acutest/acutest.h:1769)
==964174==
==964174== 212 (96 direct, 116 indirect) bytes in 1 blocks are definitely lost in loss record 8 of 9
==964174==    at 0x484D953: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==964174==    by 0x16D064: flb_calloc (include/fluent-bit/flb_mem.h:95)
==964174==    by 0x16D47F: flb_endpoint_provider_create (src/aws/flb_aws_credentials_http.c:265)
==964174==    by 0x16DA26: flb_http_provider_create (src/aws/flb_aws_credentials_http.c:394)
==964174==    by 0x163053: test_http_validator_invalid_port (tests/internal/aws_credentials_http.c:756)
==964174==    by 0x1609B7: acutest_do_run_ (tests/internal/../lib/acutest/acutest.h:1034)
==964174==    by 0x15F68D: acutest_run_ (tests/internal/../lib/acutest/acutest.h:1205)
==964174==    by 0x15E329: main (tests/internal/../lib/acutest/acutest.h:1769)
==964174==
==964174== 18,600 bytes in 1 blocks are definitely lost in loss record 9 of 9
==964174==    at 0x484D953: calloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==964174==    by 0x15F8C4: flb_calloc (include/fluent-bit/flb_mem.h:95)
==964174==    by 0x162E6F: test_http_validator_invalid_host (tests/internal/aws_credentials_http.c:722)
==964174==    by 0x1609B7: acutest_do_run_ (tests/internal/../lib/acutest/acutest.h:1034)
==964174==    by 0x15F68D: acutest_run_ (tests/internal/../lib/acutest/acutest.h:1205)
==964174==    by 0x15E329: main (tests/internal/../lib/acutest/acutest.h:1769)
==964174==
==964174== LEAK SUMMARY:
==964174==    definitely lost: 18,792 bytes in 3 blocks
==964174==    indirectly lost: 232 bytes in 6 blocks
==964174==      possibly lost: 0 bytes in 0 blocks
==964174==    still reachable: 0 bytes in 0 blocks
==964174==         suppressed: 0 bytes in 0 blocks
==964174==
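
The trace above suggests that the block allocated in flb_endpoint_provider_create is never released in the invalid-host and invalid-port tests, and that close() is reached with a descriptor of -1. One generic way to avoid both is sketched below. This is not Fluent Bit's code: `struct demo_provider`, `demo_provider_create`, and `demo_provider_destroy` are hypothetical names, and the validation rule is a placeholder.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for a credential provider context. */
struct demo_provider {
    char *host;
    int fd;                 /* -1 means no descriptor is open */
};

/* Release everything the provider owns; safe to call with NULL. */
static void demo_provider_destroy(struct demo_provider *p)
{
    if (p == NULL) {
        return;
    }
    if (p->fd >= 0) {       /* guard so close(-1) is never issued */
        close(p->fd);
    }
    free(p->host);
    free(p);
}

/* Create a provider, returning NULL (with nothing leaked) on any failure. */
static struct demo_provider *demo_provider_create(const char *host, int port)
{
    struct demo_provider *p;
    size_t len;

    if (host == NULL) {
        return NULL;
    }

    p = calloc(1, sizeof(*p));
    if (p == NULL) {
        return NULL;
    }
    p->fd = -1;

    len = strlen(host);
    p->host = malloc(len + 1);
    if (p->host == NULL) {
        demo_provider_destroy(p);
        return NULL;
    }
    memcpy(p->host, host, len + 1);

    /* Placeholder validation: it runs after allocation, so the
     * failure path must free what was allocated before returning. */
    if (len == 0 || port <= 0 || port > 65535) {
        demo_provider_destroy(p);
        return NULL;
    }

    return p;
}
```

Routing every error path through a single destroy function keeps the cleanup in one place, so a newly added field only needs to be released once.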

Just trying to speed things up: can you please review the commits and cherry-pick them?

@edsiper
Member

edsiper commented Sep 2, 2024

Moving this to 3.2. We need someone to incorporate the changes.

@zhihonl

zhihonl commented Dec 6, 2024

Hi @edsiper and @PettitWesley, I created a new PR merging both your changes and resolved the master branch merge conflicts in this PR: #9696. I tested the changes in EKS and verified that they work. Could you guys take a look?

If we prefer to keep the contributions in this PR, let me know. I'm unsure whether I would need to be granted access to make changes to this PR if we go that route.

@iandrewt
Contributor

iandrewt commented Dec 6, 2024

Honestly, kinda forgot to follow this one up. Things are slow in December at work, so I'll have some time to test this out on Monday Australia time.

@iandrewt
Contributor

I deployed @zhihonl's branch to a non-production cluster this morning, and there have been no issues so far! S3 uploads are working fine. I will check again on Monday to see if anything pops up over the weekend.
