DnssdServer ResponseTimeout #2593

Open
ajhemphill91 opened this issue Nov 11, 2024 · 3 comments

@ajhemphill91

Sometime after commit 697cb48, Thread FTDs can no longer resolve a hostname on the adjacent Ethernet network via DNS. To be clear, 697cb48 works fine, but the latest commit does not.

I have a Raspberry Pi 5 with RCP connected via SPI running a docker compose stack consisting of an otbr container and a mosquitto broker container, which are networked appropriately. The FTDs are commissioned via the otbr container and attempt to use otDnsClientResolveIp4Address to resolve the mosquitto service name so they can connect to MQTT. This has been working fine for a while.

After running a clean build of the otbr container recently, the FTDs can no longer find mosquitto:
ot-br-posix_main.log

The FTDs return errno 28 (ETIMEOUT) when trying to resolve DNS. The same happens with ot dns resolve4 hostname.

I jumped back a couple months on ot-br-posix to the aforementioned commit in a new clean docker build and it works again as expected. This was the only change made.
ot-br-posix_697cb48.log

As expected, the FTDs can resolve the hostname.

Additional details:
Raspberry Pi 5, nrf52840DK RCP connected via SPI
FTDs are nrf52840DK running proprietary software, though the openthread CLI is sufficient to check the problem with ot dns resolve4 hostname

Network is:
FTD <---> Pi 5 [ OTBR Docker <--Docker VLAN--> Mosquitto Broker ]

@superwhd
Contributor

superwhd commented Nov 19, 2024

This is probably caused by my previous PR: openthread/openthread#10864.

If you build with OPENTHREAD_POSIX_CONFIG_UPSTREAM_DNS_BIND_TO_INFRA_NETIF set to 0 then it should have exactly the same behavior as the older code.

The PR breaks your case because previously the DNS query was sent through the first network interface that has a route to the DNS server, but now it is sent through the infra network interface specified by the -B launch argument of otbr-agent. May I know which interfaces are available in your OTBR Docker container? Have you specified the -B option correctly?
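To answer the question about available interfaces, something like the following could be run inside the container. This is only a sketch: it assumes the iproute2 `ip` tool is installed in the otbr image, and the helper name `summarise_ipv4_addrs` is made up for illustration.

```shell
# Reduce the one-line-per-address output of "ip -o -4 addr show"
# to "<interface> <address/prefix>" pairs, one per line.
summarise_ipv4_addrs() {
    awk '$3 == "inet" { print $2, $4 }'
}

# Inside the otbr container:
#   ip -o -4 addr show | summarise_ipv4_addrs
```

The interface passed to -B should appear in that list with an address on the network that the DNS server is expected to be reached through.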

@ajhemphill91
Author

ajhemphill91 commented Nov 19, 2024

Thanks for the response!

I tried setting OPENTHREAD_POSIX_CONFIG_UPSTREAM_DNS_BIND_TO_INFRA_NETIF to 0 by modifying my Dockerfile (copied from ot-br-posix and modified for SPI) to have:

ENV OTBR_OPTIONS=${OTBR_OPTIONS:-"-DOT_POSIX_RCP_SPI_BUS=ON -DOPENTHREAD_POSIX_CONFIG_UPSTREAM_DNS_BIND_TO_INFRA_NETIF=0"}

Building the latest ot-br-posix with that seems to have no effect: I still get the DNS query timeouts. I'm not 100% sure I'm setting that flag correctly, though.

The log is much the same as before:
ot-br-posix_issue2593_1.log

As I understand it, the Dockerfile should also be setting the -B flag by way of INFRA_IF_NAME, correct? I do have the following entries in all 3 of the logs provided so far (working and not working):

Nov 19 22:19:07 f6d3e1397017 otbr-agent: [NOTE]-AGENT---: Backbone interface: eth0
Nov 19 22:19:07 f6d3e1397017 otbr-agent[156]: [NOTE]-AGENT---: Running 0.3.0-thread-reference-20230710-471-gb4cfa2ffa25
Nov 19 22:19:07 f6d3e1397017 otbr-agent[156]: [NOTE]-AGENT---: Thread version: 1.4.0
Nov 19 22:19:07 f6d3e1397017 otbr-agent[156]: [NOTE]-AGENT---: Thread interface: wpan0
Nov 19 22:19:07 f6d3e1397017 otbr-agent[156]: [NOTE]-AGENT---: Radio URL: spinel+spi:///dev/spidev0.1?gpio-int-device=/dev/gpiochip0&gpio-int-line=20&gpio-reset-device=/dev/gpiochip0&gpio-reset-line=21&spi-speed=1000000
Nov 19 22:19:07 f6d3e1397017 otbr-agent[156]: [NOTE]-ILS-----: Infra link selected: eth0

A little more context: the Docker container is attached to two Docker networks in my case. This gives me lo, wpan0, eth0, and eth1 inside the container. eth0 should be the correct backbone interface to reach the MQTT container I'm running DNS queries for on my FTDs. The other network is my reverse proxy / ingress for the docker compose stack just so I can reach otbr web and the otbr REST API.

@superwhd
Contributor

superwhd commented Nov 21, 2024

I think adding -DOPENTHREAD_POSIX_CONFIG_UPSTREAM_DNS_BIND_TO_INFRA_NETIF=0 to OTBR_OPTIONS may not work as you expect, because OTBR_OPTIONS expects CMake options, while OPENTHREAD_POSIX_CONFIG_UPSTREAM_DNS_BIND_TO_INFRA_NETIF is a C macro that is not visible to CMake.

Maybe you can try just changing this line to define it as zero.
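One way to locate where a macro like this gets its default value is to search the source tree for its #define. The sketch below assumes a POSIX shell with grep; the helper name `find_macro_defines` is hypothetical, and the search directory in the example comment is an assumption about how the ot-br-posix checkout is laid out, so adjust it to your tree.

```shell
# Print "file:line:text" for every "#define <macro>" found under a directory.
find_macro_defines() {
    # $1: macro name; $2: directory to search (defaults to the current directory)
    grep -rnE "^[[:space:]]*#[[:space:]]*define[[:space:]]+$1([[:space:]]|\$)" "${2:-.}"
}

# Example, run from the root of the ot-br-posix checkout
# (the directory below is an assumed location of the OpenThread submodule):
#   find_macro_defines OPENTHREAD_POSIX_CONFIG_UPSTREAM_DNS_BIND_TO_INFRA_NETIF third_party/openthread/repo
```

Editing the reported line so the macro is defined as 0 and then rebuilding the image is the change suggested above.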

> This gives me lo, wpan0, eth0, and eth1 inside the container. eth0 should be the correct backbone interface to reach the MQTT container I'm running DNS queries for on my FTDs. The other network is my reverse proxy / ingress for the docker compose stack just so I can reach otbr web and the otbr REST API.

Given your topology, specifying -B eth0 looks correct, and otbr-agent is configured correctly.

The issue is that when otbr-agent's UDP socket is bound to eth0, the DNS query fails to reach the DNS server. When the socket is not bound, the query can reach the DNS server.

Suggestion:

  • Check the /etc/resolv.conf file for the IPv4 DNS server addresses and see whether they are reachable through the eth0 network interface, e.g. with ip route get <DNS server address> oif eth0. One possibility is that your container uses 127.0.0.11 as the DNS server, which is not accessible via eth0.
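The check suggested above can be scripted roughly as follows. This is a minimal sketch, assuming a POSIX shell with awk and the iproute2 `ip` tool inside the container; the function names are made up for illustration, and the output of ip route get is printed as-is rather than interpreted.

```shell
# Print the IPv4 nameserver addresses listed in a resolv.conf-style file.
list_ipv4_nameservers() {
    # $1: path to the file (defaults to /etc/resolv.conf)
    awk '$1 == "nameserver" && $2 ~ /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { print $2 }' \
        "${1:-/etc/resolv.conf}"
}

# For each nameserver, show what the kernel answers when asked for a route
# through the given interface.
show_dns_routes_via() {
    # $1: interface name, e.g. eth0; $2: optional resolv.conf path
    for addr in $(list_ipv4_nameservers "$2"); do
        echo "== $addr via $1 =="
        ip route get "$addr" oif "$1" || echo "(ip route get failed for $addr)"
    done
}

# Inside the otbr container:
#   show_dns_routes_via eth0
```

If the only nameserver listed is 127.0.0.11, that matches the possibility described above, and the device reported by ip route get is the thing to inspect.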
