Gremlin lambda can't DNS resolve the Neptune endpoint #554

Open
mmigliari opened this issue Oct 10, 2024 · 3 comments
Labels
bug Something isn't working

Comments

@mmigliari

Describe the bug
The Gremlin discovery Lambda is unable to resolve the Neptune DNS endpoint and fails with a getaddrinfo EAI_AGAIN <endpoint_address> error.

To Reproduce
Launch the stacks as per the documentation and wait for the ECS scheduled task to invoke the Lambda. The Lambda's CloudWatch logs then show a timeout together with the getaddrinfo EAI_AGAIN <endpoint_address> error.
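
To confirm that the failure is DNS resolution and not Neptune itself, a check along these lines can be run from inside the same subnets and security group. This is only a minimal Python sketch: `check_dns` is a hypothetical helper name, and port 8182 is assumed to be the Neptune port in use.

```python
import socket


def check_dns(hostname, port=8182, resolver=socket.getaddrinfo):
    """Try to resolve hostname and report whether DNS resolution worked.

    Returns a tuple (ok, detail). The resolver argument exists only so the
    lookup can be replaced in tests; by default the system resolver is used.
    Port 8182 is an assumption (the usual Neptune port).
    """
    try:
        results = resolver(hostname, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as err:
        if err.errno == socket.EAI_AGAIN:
            # Temporary failure: typically the DNS server could not be reached.
            return False, "temporary DNS failure (EAI_AGAIN)"
        return False, f"DNS failure: {err}"
    # Each result is (family, type, proto, canonname, sockaddr); keep the IPs.
    addresses = sorted({info[4][0] for info in results})
    return True, ", ".join(addresses)
```

If this reports EAI_AGAIN while the same hostname resolves from elsewhere in the VPC, the security group's outbound rules are a likely suspect.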

Expected behavior
The Lambda, which is inside the VPC, should be able to resolve the endpoint using the DNS servers defined in the VPC DHCP options set.

Additional context
The fix described under Solution may be necessary in VPC setups with non-standard DNS settings.

Solution
For DHCP options sets whose DNS servers are hosted in the VPC, allow outbound UDP port 53 (DNS resolution) from the Lambda's security group to the VPC CIDR range.
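
As a rough illustration of that rule, the Python sketch below builds the permission in the shape accepted by the EC2 `authorize_security_group_egress` call. The helper name `dns_egress_permission`, the example CIDR, and the security group ID in the commented usage are all hypothetical, and the sketch assumes an IPv4 VPC CIDR.

```python
import ipaddress


def dns_egress_permission(vpc_cidr, description="DNS to in-VPC resolvers"):
    """Build one egress permission allowing UDP 53 to the given IPv4 CIDR."""
    # Validate the CIDR first; IPv4Network raises ValueError on bad input.
    network = ipaddress.IPv4Network(vpc_cidr)
    return {
        "IpProtocol": "udp",
        "FromPort": 53,
        "ToPort": 53,
        "IpRanges": [{"CidrIp": str(network), "Description": description}],
    }


# Applying it needs boto3 and AWS credentials, so it is left as a comment:
# import boto3
# boto3.client("ec2").authorize_security_group_egress(
#     GroupId="sg-0123456789abcdef0",  # hypothetical Gremlin Lambda security group
#     IpPermissions=[dns_egress_permission("10.0.0.0/16")],  # hypothetical VPC CIDR
# )
```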

@svozza
Contributor

svozza commented Oct 10, 2024

Thanks for raising this so we can track it! At the very least we should document this in the troubleshooting guide.

@mmigliari
Author

I made a PR that adds an outbound rule to the Gremlin Lambda security group to allow UDP port 53 access to the VPC CIDR range. This assumes any DNS servers in the VPC DHCP options set are within the VPC CIDR range.

One alternative would be to ask, during the CloudFormation template launch, for the DNS servers to be used if they are not standard. If any are provided, add outbound UDP port 53 access to just those servers.
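
To make that alternative concrete, here is a minimal Python sketch of the rule generation. `dns_server_egress_permissions` is a hypothetical helper, and it assumes the DNS servers arrive as a list of IPv4 addresses; an empty list stands for the standard DNS case, where no extra rule is added.

```python
import ipaddress


def dns_server_egress_permissions(dns_servers):
    """Build egress permissions allowing UDP 53 to each listed DNS server only."""
    ranges = []
    for server in dns_servers:
        # IPv4Address raises ValueError if the entry is not a valid address.
        address = ipaddress.IPv4Address(server.strip())
        ranges.append({"CidrIp": f"{address}/32", "Description": "Custom DNS server"})
    if not ranges:
        # No custom DNS servers supplied: nothing to add.
        return []
    return [{"IpProtocol": "udp", "FromPort": 53, "ToPort": 53, "IpRanges": ranges}]
```

The result has the same shape as the `IpPermissions` argument of the EC2 `authorize_security_group_egress` call, so each server gets a /32 entry instead of the whole VPC CIDR being opened.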

@svozza
Contributor

svozza commented Nov 22, 2024

I've been thinking about this and adding more CFN parameters for this scenario is something I want to avoid. With the release of v2.2.0 on Wednesday, the solution now has 33 CFN parameters and I'd rather not add more for an edge case like this. What I will do though is add a new section to the troubleshooting guide (as I mentioned earlier) to explain how to fix this error: https://docs.aws.amazon.com/solutions/latest/workload-discovery-on-aws/troubleshooting.html.


2 participants