DNS-over-TLS error "EOF" #276

Open
aleho opened this issue Dec 13, 2024 · 4 comments

@aleho

aleho commented Dec 13, 2024

I'm trying to get DNS-over-TLS (DoT) running with the l4 module with the following config.

After some tinkering I'm stuck at TLS termination and getting a response from upstream, unless it's "pure" DNS. What am I missing?

(I hope it's OK to ask here.)

{
	layer4 {
		:853 {
			@dot tls sni dns.localhost
			route @dot {
				tls {
					connection_policy {
						alpn dot
					}
				}
				proxy tcp/10.0.0.2:53
			}
		}
		udp/:53 {
			route {
				proxy udp/10.0.0.2:53
			}
		}
	}

	log {
		level DEBUG
	}
}

https://dns.localhost {
}
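
For context on why terminating TLS and then proxying to tcp/...:53 can work at all: RFC 7858 reuses the DNS-over-TCP framing, so each message inside the TLS stream is preceded by a two-octet big-endian length. The following is a minimal sketch of that framing in Python; the function names `build_query` and `frame` and the fixed query ID are illustrative assumptions, not part of Caddy or caddy-l4.

```python
import struct


def build_query(name: str, qtype: int = 1, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query with one question and recursion desired."""
    # Header: ID, flags (RD bit set), QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: every label is prefixed by its length; a zero byte ends the name
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE (1 = A) followed by QCLASS (1 = IN)
    return header + qname + struct.pack("!HH", qtype, 1)


def frame(message: bytes) -> bytes:
    """Prefix a DNS message with its 2-byte length, as used on TCP and DoT."""
    return struct.pack("!H", len(message)) + message


query = build_query("caddyserver.com")
framed = frame(query)
print(len(query), framed[:2].hex())
```

Because the bytes inside the TLS session are already in the form a TCP DNS server expects, the proxy does not need to rewrite them after decryption.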

HTTPS is good

curl -i --cacert data/caddy/pki/authorities/local/intermediate.crt https://dns.localhost

HTTP/2 200 
alt-svc: h3=":443"; ma=2592000
server: Caddy
content-length: 0
date: Fri, 13 Dec 2024 22:25:57 GMT

DNS works

dig @127.0.0.1 -p1053 caddyserver.com

; <<>> DiG 9.20.2-1-Debian <<>> @127.0.0.1 -p1053 caddyserver.com
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 33767
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 1232
;; QUESTION SECTION:
;caddyserver.com.		IN	A

;; ANSWER SECTION:
caddyserver.com.	253	IN	A	165.227.20.207

;; Query time: 16 msec
;; SERVER: 127.0.0.1#1053(127.0.0.1) (UDP)
;; WHEN: Sun Dec 15 19:59:33 CET 2024
;; MSG SIZE  rcvd: 60
2024/12/15 19:00:42.522	DEBUG	layer4.handlers.proxy	dial upstream	{"remote": "172.17.0.1:47889", "upstream": "10.0.1.3:53"}

DNS-over-TLS returns an error

dig fails (without output)

dig -d @127.0.0.1 +tls +tls-ca=data/caddy/pki/authorities/local/root.crt +tls-hostname=dns.localhost caddyserver.com

setup_libs()
setup_system()
create_search_list()
ndots is 1.
timeout is 0.
retries is 3.
get_server_list()
make_server(10.0.1.3)
make_server(fdc1:4729:dc06::3)
dig_query_setup
parse_args()
making new lookup
make_empty_lookup()
make_empty_lookup() = 0x7fb6aa232000->references = 1
digrc (open)
main parsing -d
main parsing @127.0.0.1
make_server(127.0.0.1)
main parsing +tls
main parsing +tls-ca=data/caddy/pki/authorities/local/root.crt
main parsing +tls-hostname=dns.localhost
main parsing caddyserver.com
clone_lookup()
make_empty_lookup()
make_empty_lookup() = 0x7fb6aa233800->references = 1
clone_server_list()
make_server(127.0.0.1)
looking up caddyserver.com
dig_startup()
start_lookup()
setup_lookup(0x7fb6aa233800)
resetting lookup counter.
idn_textname: caddyserver.com
using root origin
recursive query
AD query
add_question()
starting to render the message
add_opt()
done rendering
create query 0x7fb6aaa60280 linked to lookup 0x7fb6aa233800
dighost.c:2146:lookup_attach(0x7fb6aa233800) = 2
dighost.c:2657:new_query(0x7fb6aaa60280) = 1
do_lookup()
start_tcp(0x7fb6aaa60280)
dighost.c:2937:query_attach(0x7fb6aaa60280) = 2
query->servname = 127.0.0.1

dighost.c:3041:query_attach(0x7fb6aaa60280) = 3
tcp_connected()
tcp_connected(0x7fb6aa21dd00, operation canceled, 0x7fb6aaa60280)
dighost.c:3575:lookup_attach(0x7fb6aa233800) = 3
in cancel handler
dighost.c:3595:_cancel_lookup()
canceling pending query 0x7fb6aaa60280, belonging to 0x7fb6aa233800
dighost.c:1734:query_detach(0x7fb6aaa60280) = 2
dighost.c:2780:query_detach(0x7fb6aaa60280) = 1
check_if_done()
list empty
dighost.c:3598:query_detach(0x7fb6aaa60280) = 0
dighost.c:3598:destroy_query(0x7fb6aaa60280) = 0
dighost.c:1692:lookup_detach(0x7fb6aa233800) = 2
dighost.c:3599:lookup_detach(0x7fb6aa233800) = 1
clear_current_lookup()
lookup cleared
dighost.c:1825:lookup_detach(0x7fb6aa233800) = 0
destroy_lookup
freeing server 0x7fb6aaa6ee00 belonging to 0x7fb6aa233800
start_lookup()
check_if_done()
list empty
shutting down
shutdown
destroy_lookup
freeing server 0x7fb6aaa6d000 belonging to 0x7fb6aa232000
cancel_all()
destroy_libs()
flush_server_list()
destroy DST lib
Removing log context
Destroy memory
2024/12/15 19:06:50.420	DEBUG	layer4	matching	{"remote": "172.17.0.1:55140", "error": "consumed all prefetched bytes", "matcher": "layer4.matchers.tls", "matched": false}
2024/12/15 19:06:50.421	DEBUG	layer4	prefetched	{"remote": "172.17.0.1:55140", "bytes": 318}
2024/12/15 19:06:50.421	DEBUG	layer4	matching	{"remote": "172.17.0.1:55140", "matcher": "layer4.matchers.tls", "matched": false}
2024/12/15 19:06:50.421	DEBUG	layer4	connection stats	{"remote": "172.17.0.1:55140", "read": 318, "written": 0, "duration": 0.00118827}

kdig provides more info and causes EOF

kdig -d @127.0.0.1:853 +tls-ca=data/caddy/pki/authorities/local/root.crt +tls-hostname=dns.localhost caddyserver.com

;; DEBUG: Querying for owner(caddyserver.com.), class(1), type(1), server(127.0.0.1), port(853), protocol(TCP)
;; DEBUG: TLS, imported 1 certificates from 'data/caddy/pki/authorities/local/root.crt'
;; DEBUG: TLS, received certificate hierarchy:
;; WARNING: TLS, handshake failed (The requested data were not available.)
;; ERROR: failed to query server 127.0.0.1@853(TCP)
2024/12/15 19:02:57.290	DEBUG	layer4	matching	{"remote": "172.17.0.1:47358", "error": "consumed all prefetched bytes", "matcher": "layer4.matchers.tls", "matched": false}
2024/12/15 19:02:57.290	DEBUG	layer4	prefetched	{"remote": "172.17.0.1:47358", "bytes": 337}
2024/12/15 19:02:57.290	DEBUG	layer4.matchers.tls	matched	{"remote": "172.17.0.1:47358", "server_name": "dns.localhost"}
2024/12/15 19:02:57.290	DEBUG	layer4	matching	{"remote": "172.17.0.1:47358", "matcher": "layer4.matchers.tls", "matched": true}
2024/12/15 19:02:57.291	DEBUG	events	event	{"name": "tls_get_certificate", "id": "924f79b7-0b8e-4d83-9a69-f16d3ad42b68", "origin": "tls", "data": {"client_hello":{"CipherSuites":[4866,4867,4865,4868],"ServerName":"dns.localhost","SupportedCurves":[29,23,24,25],"SupportedPoints":"AA==","SignatureSchemes":[1025,2057,2052,1027,2055,1281,2058,2053,1283,2056,1537,2059,2054,1539,513,515],"SupportedProtos":["dot"],"SupportedVersions":[772],"RemoteAddr":{"IP":"172.17.0.1","Port":47358,"Zone":""},"LocalAddr":{"IP":"172.17.0.2","Port":853,"Zone":""}}}}
2024/12/15 19:02:57.291	DEBUG	tls.handshake	choosing certificate	{"identifier": "dns.localhost", "num_choices": 1}
2024/12/15 19:02:57.291	DEBUG	tls.handshake	default certificate selection results	{"identifier": "dns.localhost", "subjects": ["dns.localhost"], "managed": true, "issuer_key": "local", "hash": "21e3098439dca80ec98015ff690979e6d7bfdad1583e6a185f62aed8a2cb8515"}
2024/12/15 19:02:57.291	DEBUG	tls.handshake	matched certificate in cache	{"remote_ip": "172.17.0.1", "remote_port": "47358", "subjects": ["dns.localhost"], "managed": true, "expiration": "2024/12/15 23:43:50.000", "hash": "21e3098439dca80ec98015ff690979e6d7bfdad1583e6a185f62aed8a2cb8515"}
2024/12/15 19:02:57.294	ERROR	layer4	handling connection	{"remote": "172.17.0.1:47358", "error": "EOF"}
2024/12/15 19:02:57.294	DEBUG	layer4	connection stats	{"remote": "172.17.0.1:47358", "read": 343, "written": 1428, "duration": 0.004031283}

Comparing to nginx

A comparable nginx setup with ngx_stream_core_module works. I had to add the root certificate to dns.localhost.crt, though; with only the intermediate it would not work.

stream {
    log_format main 'STREAM $remote_addr [$time_local] '
                 '$protocol $status $bytes_sent $bytes_received '
                 '$session_time';

    access_log  /var/log/nginx/access.log  main;

    upstream dns-servers {
        server 10.0.1.3:53;
    }

    server {
        listen 0.0.0.0:853 ssl;
        server_name dns.localhost;

        ssl_certificate         /data/caddy/certificates/local/dns.localhost/dns.localhost.full.crt;
        ssl_certificate_key     /data/caddy/certificates/local/dns.localhost/dns.localhost.key;

        ssl_protocols             TLSv1.2 TLSv1.3;
        ssl_ciphers               HIGH:!aNULL:!MD5;
        ssl_prefer_server_ciphers on;
        ssl_session_cache         shared:DNS:60m;
        ssl_session_timeout       4h;
        ssl_handshake_timeout     10s;

        proxy_pass dns-servers;
    }
}

dig @127.0.0.1 +tls +tls-ca=data/caddy/pki/authorities/local/root.crt +tls-hostname=dns.localhost caddyserver.com

2024/12/15 19:09:04 [info] 32#32: *5 client 172.17.0.1:58386 connected to 0.0.0.0:853
2024/12/15 19:09:04 [info] 32#32: *5 proxy 172.17.0.2:36916 connected to 10.0.1.3:53
2024/12/15 19:09:04 [info] 32#32: *5 client disconnected, bytes from/to client:58/62, bytes from/to upstream:62/58
STREAM 172.17.0.1 [15/Dec/2024:19:09:04 +0000] TCP 200 62 58 0.019
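
The chain file used for nginx above can be assembled by concatenating PEM files, leaf certificate first. This is only a sketch under assumptions: the helper names `count_certs` and `build_chain` are hypothetical, and the file names in the usage comment are placeholders for wherever the leaf, intermediate, and root certificates actually live.

```python
from pathlib import Path

PEM_BEGIN = "-----BEGIN CERTIFICATE-----"


def count_certs(pem_text: str) -> int:
    """Count the certificates contained in a PEM bundle."""
    return pem_text.count(PEM_BEGIN)


def build_chain(parts: list[Path], out: Path) -> int:
    """Concatenate PEM files in the given order and return how many
    certificates ended up in the output file."""
    # Normalise trailing whitespace so every block ends with one newline
    chunks = [p.read_text().strip() + "\n" for p in parts]
    out.write_text("".join(chunks))
    return count_certs(out.read_text())


# Hypothetical usage (placeholder file names):
# build_chain(
#     [Path("leaf.crt"), Path("intermediate.crt"), Path("root.crt")],
#     Path("full.crt"),
# )
```

Counting the certificates in the file each server actually serves is a quick way to compare what nginx and Caddy are configured to send.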

To make it easier to reproduce, here's how I'm running Caddy.

Dockerfile

FROM caddy:2-builder AS builder

RUN xcaddy build \
    --with github.com/mholt/caddy-l4


FROM caddy:2

COPY --from=builder /usr/bin/caddy /usr/bin/caddy

EXPOSE 53 853 443

Build and run

docker build . -t catls &&
docker run \
    --rm \
    -it \
    --name=catls \
    --volume=./Caddyfile:/etc/caddy/Caddyfile \
    --volume=./data:/data \
    --publish=127.0.0.1:853:853 \
    --publish=127.0.0.1:1053:53/udp \
    --publish=127.0.0.1:443:443 \
    catls
@vnxme
Collaborator

vnxme commented Dec 16, 2024

Hi,

I copied your config with minor changes in terms of SNI and upstreams:

{
	layer4 {
		:853 {
			@dot tls sni localhost
			route @dot {
				tls {
					# also works without the following 3 lines
					connection_policy {
						alpn dot
					}
				}
				proxy tcp/1.1.1.1:53
			}
		}
		udp/:53 {
			route {
				proxy udp/1.1.1.1:53
			}
		}
	}

	log {
		level DEBUG
	}
}

https://localhost {
}

DoT works perfectly with the q client on Windows as follows:

>q.exe /i @tls://localhost:853 A caddyserver.com

DoT also works fine with the dig client on Linux as follows:

$ dig @[CADDY_IP_ADDR] +tls +tls-hostname=localhost +notls-ca caddyserver.com

I don't know if there is any difference in DoT handling by dig, kdig and q. But it seems +notls-ca is the option that makes the dig client accept self-signed certificates.

@aleho
Author

aleho commented Dec 17, 2024

That's really strange. I get exactly the same behavior when specifying "+notls-ca" in both.

What does dig -v report? For me it's DiG 9.18.30.

@vnxme
Collaborator

vnxme commented Dec 17, 2024

$ dig -v
DiG 9.18.28-1~deb12u2-Debian

What if you try the q client on Linux?

@aleho
Author

aleho commented Dec 18, 2024

I also tried dnslookup and q. The latter works, but only if I disable certificate validation completely or import the CA certificates into my local store.

I'm not sure whether it's a problem with caddy-l4 or with the various tools not implementing the TLS handshake (or whatever) correctly. Nevertheless, since it works with nginx and dig when providing or importing the CA cert, I suspect Caddy is not providing the correct data here. Unfortunately I know almost nothing about TLS, so where do I start debugging?

Something I noticed with the CA certificates trusted: apparently dig never sends the hostname in SNI.

dig @dns.localhost +tls caddyserver.com:

2024/12/18 20:44:07.716	DEBUG	layer4	matching	{"remote": "172.17.0.1:58192", "matcher": "layer4.matchers.tls", "matched": false}

./q -v @tls://dns.localhost caddyserver.com:

2024/12/18 20:44:23.875	DEBUG	layer4.matchers.tls	matched	{"remote": "172.17.0.1:48484", "server_name": "dns.localhost"}
2024/12/18 20:44:23.875	DEBUG	layer4	matching	{"remote": "172.17.0.1:48484", "matcher": "layer4.matchers.tls", "matched": true}

Without SNI matching the dig command above results in:

2024/12/18 20:52:56.524	DEBUG	events	event	{"name": "tls_get_certificate", "id": "aa15e2ec-b489-4e67-a3c1-c8e0cab60b16", "origin": "tls", "data": {"client_hello":{"CipherSuites":[4866,4867,4865,4868,49196,49200,52393,52392,49325,49195,49199,49324,49187,49191,49162,49172,49161,49171,157,49309,156,49308,61,60,53,47,159,52394,49311,158,49310,107,103,57,51,255],"ServerName":"","SupportedCurves":[29,23,30,25,24,256,257,258,259,260],"SupportedPoints":"AAEC","SignatureSchemes":[1027,1283,1539,2055,2056,2057,2058,2059,2052,2053,2054,1025,1281,1537,771,769],"SupportedProtos":["dot"],"SupportedVersions":[772,771],"RemoteAddr":{"IP":"172.17.0.1","Port":36912,"Zone":""},"LocalAddr":{"IP":"172.17.0.2","Port":853,"Zone":""}}}}
2024/12/18 20:52:56.524	DEBUG	tls.handshake	no matching certificates and no custom selection logic	{"identifier": "172.17.0.2"}
2024/12/18 20:52:56.524	DEBUG	tls.handshake	no certificate matching TLS ClientHello	{"remote_ip": "172.17.0.1", "remote_port": "36912", "server_name": "", "remote": "172.17.0.1:36912", "identifier": "172.17.0.2", "cipher_suites": [4866, 4867, 4865, 4868, 49196, 49200, 52393, 52392, 49325, 49195, 49199, 49324, 49187, 49191, 49162, 49172, 49161, 49171, 157, 49309, 156, 49308, 61, 60, 53, 47, 159, 52394, 49311, 158, 49310, 107, 103, 57, 51, 255], "cert_cache_fill": 0.0001, "load_or_obtain_if_necessary": true, "on_demand": false}
2024/12/18 20:52:56.524	ERROR	layer4	handling connection	{"remote": "172.17.0.1:36912", "error": "no certificate available for '172.17.0.2'"}
2024/12/18 20:52:56.524	DEBUG	layer4	connection stats	{"remote": "172.17.0.1:36912", "read": 312, "written": 7, "duration": 0.001190452}

So apparently dig and kdig don't send any hostname?

I'll have a look at https://datatracker.ietf.org/doc/html/rfc7858 and try to understand whether DoT is supposed to support SNI.
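
One way to test the SNI question independently of dig, kdig, and q is a small handshake probe that sends the server name explicitly or omits it on purpose. The sketch below uses only Python's standard `ssl` module; the names `make_context` and `probe` are made up for illustration, and the address, port, and CA path in the usage comment are taken from the commands earlier in this thread.

```python
import socket
import ssl
from typing import Optional


def make_context(cafile: Optional[str] = None) -> ssl.SSLContext:
    """Client context that verifies the server and offers the "dot" ALPN."""
    # With cafile set, only that file is trusted; otherwise the system store
    ctx = ssl.create_default_context(cafile=cafile)
    ctx.set_alpn_protocols(["dot"])
    return ctx


def probe(host: str, port: int, sni: Optional[str],
          cafile: Optional[str] = None, timeout: float = 5.0) -> dict:
    """Perform a TLS handshake and report what was negotiated."""
    ctx = make_context(cafile)
    if sni is None:
        # No server name means no SNI extension is sent. Hostname checking
        # has to be switched off in that case, or wrap_socket raises
        # ValueError; the certificate chain is still verified.
        ctx.check_hostname = False
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=sni) as tls:
            return {
                "version": tls.version(),
                "alpn": tls.selected_alpn_protocol(),
                "subject": tls.getpeercert().get("subject"),
            }


# Hypothetical usage against the setup described above:
# ca = "data/caddy/pki/authorities/local/root.crt"
# probe("127.0.0.1", 853, sni="dns.localhost", cafile=ca)
# probe("127.0.0.1", 853, sni=None, cafile=ca)
```

If the call with `sni="dns.localhost"` completes while the one with `sni=None` fails, that would point at the missing server name rather than at the certificate data; a verification error in both cases would point at the chain instead.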
