Gremlin lambda can't DNS resolve the Neptune endpoint #554

mmigliari · 2024-10-10T19:28:07Z

Describe the bug
The gremlin discovery lambda is unable to resolve the Neptune DNS endpoint, failing with a getaddrinfo EAI_AGAIN <endpoint_address> error.

To Reproduce
Launch the stacks as per the documentation and wait for the ECS scheduled task to invoke the lambda. The lambda's CloudWatch logs show a timeout along with the getaddrinfo EAI_AGAIN <endpoint_address> error.
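To tell this failure apart from other resolution errors (for example a hostname that does not exist), a small check like the one below can help. It is only a sketch in Python, not code from the solution: the function name `check_dns` and the injectable `resolver` argument are made up for illustration, and 8182 is assumed to be the Neptune port in use.

```python
import socket


def check_dns(hostname, port=8182, resolver=socket.getaddrinfo):
    """Try to resolve hostname and report how the lookup ended.

    Returns "resolved", "temporary_failure" (EAI_AGAIN) or "other_failure".
    The resolver argument is hypothetical and only exists so the check can
    be exercised without a network.
    """
    try:
        resolver(hostname, port)
    except socket.gaierror as err:
        # EAI_AGAIN means the resolver got no usable answer from the DNS
        # server, which is what happens when the query cannot reach it.
        if err.errno == socket.EAI_AGAIN:
            return "temporary_failure"
        return "other_failure"
    return "resolved"
```

Run from inside the same subnets and security group as the lambda, a "temporary_failure" result for the Neptune endpoint would point at DNS traffic being blocked rather than at a wrong endpoint name.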

Expected behavior
The lambda, which runs inside the VPC, should be able to resolve the Neptune endpoint using the DNS servers defined in the VPC DHCP options set.

Additional context
The fix below may be necessary in VPC setups with non-standard DNS settings.

Solution
Allow outbound UDP port 53 (DNS resolution) from the lambda to the VPC CIDR range. This covers DHCP options sets whose DNS servers are hosted in the VPC.
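For anyone who wants to apply this by hand before a fix lands, here is a rough sketch using boto3's `authorize_security_group_egress`. The helper names, the security group ID and the CIDR are placeholders I made up; this is not how the CloudFormation templates do it.

```python
import ipaddress


def build_dns_egress_permission(vpc_cidr):
    """Build an IpPermissions entry allowing outbound UDP 53 to vpc_cidr."""
    # ip_network raises ValueError on a malformed CIDR, so a typo fails
    # here instead of in the EC2 API call.
    network = ipaddress.ip_network(vpc_cidr)
    return {
        "IpProtocol": "udp",
        "FromPort": 53,
        "ToPort": 53,
        "IpRanges": [
            {"CidrIp": str(network), "Description": "DNS resolution inside the VPC"}
        ],
    }


def allow_dns_egress(ec2_client, security_group_id, vpc_cidr):
    """Add the egress rule to the given security group.

    ec2_client is expected to be a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    return ec2_client.authorize_security_group_egress(
        GroupId=security_group_id,
        IpPermissions=[build_dns_egress_permission(vpc_cidr)],
    )
```

Usage would look like `allow_dns_egress(boto3.client("ec2"), "sg-0123456789abcdef0", "10.0.0.0/16")`, with the group ID and CIDR replaced by the gremlin lambda's security group and the VPC's actual range.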


svozza · 2024-10-10T21:05:41Z

Thanks for raising this so we can track it! At the very least we should document this in the troubleshooting guide.

mmigliari · 2024-10-11T00:00:51Z

I made a PR that adds an outbound rule to the gremlin lambda security group, allowing UDP port 53 access to the VPC CIDR range. This assumes that any DNS servers in the VPC DHCP options set are within the VPC CIDR range.

One alternative would be to ask for the DNS servers in use, if they are non-standard, as a parameter at CloudFormation template launch. If any are provided, outbound UDP port 53 access would be added for those servers only.
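To make the difference between the two options concrete, here is a minimal sketch of how the egress targets could be chosen. The function name `dns_egress_cidrs` and its parameters are hypothetical, and it assumes IPv4 DNS servers.

```python
import ipaddress


def dns_egress_cidrs(vpc_cidr, dns_servers=None):
    """Return the CIDR ranges that outbound UDP 53 should be opened to.

    If custom DNS servers are supplied, open access to each one as a /32.
    Otherwise fall back to the whole VPC CIDR range.
    """
    if dns_servers:
        # IPv4Address raises ValueError for anything that is not a valid
        # IPv4 address, so bad input is rejected early.
        return [f"{ipaddress.IPv4Address(server)}/32" for server in dns_servers]
    return [str(ipaddress.ip_network(vpc_cidr))]
```

With no servers given, `dns_egress_cidrs("10.0.0.0/16")` yields the VPC range as in the PR. With servers given, the rule is narrower and also works when the DNS servers sit outside the VPC CIDR, at the cost of an extra template parameter.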

svozza · 2024-11-22T15:20:28Z

I've been thinking about this and adding more CFN parameters for this scenario is something I want to avoid. With the release of v2.2.0 on Wednesday, the solution now has 33 CFN parameters and I'd rather not add more for an edge case like this. What I will do though is add a new section to the troubleshooting guide (as I mentioned earlier) to explain how to fix this error: https://docs.aws.amazon.com/solutions/latest/workload-discovery-on-aws/troubleshooting.html.

mmigliari added the bug label Oct 10, 2024

mmigliari mentioned this issue Oct 10, 2024

fix: open gremlin lambda to outbound DNS resolution access in vpc cid… #555

Open
