change prefetch buffer size from 1024 -> 2048 bytes #248

elee1766 · 2024-10-01T22:02:09Z

No description provided.

mholt

This fixes a weird case with Hybrid Kyber KEM where the ClientHello is 1.7 KB, and the 1.0 KB buffer size was apparently causing the TLS matcher to not match, and thus causing the TLS connection to fail.

Thanks for the excellent debugging work, @elee1766 !

ydylla · 2024-10-03T13:57:36Z

This does not seam like the correct fix for failing tls matching. Matching should load more bytes if necessary.
Its more likely that the cause is the removal of loading as much client bytes as possible in prefetch that happened by afa78d7 and thus going multiple times through the matching loop. Which also matches the comments from #250 (the diff there basically restores the old behavior).
I am still using my version from #229 and have not noticed any issues with the large client hellos.
Increasing the chunk size here also doubled the minimum memory usage per connection for everything. Since it is also the minimum allocation size of the buffer. Not sure if we want this. But if now the majority of client hellos is bigger than 1KB it may also not be a bad thing. Tls matching is after all a very popular use case.

Monviech · 2024-10-03T16:51:33Z

@ydylla

I've tried it out and built it with the hash before afa78d7:
e491c44

And then also with your branch:
905fa81

None of these two versions fixed the behavior I had with caddy l4 in freebsd.

Though strangely even just 1 Millisecond sleep time exactly in this spot fixes it for me.
Adding this before "read once" fixes all of my problems with the tls matcher not matching on freebsd.

time.Sleep(1 * time.Millisecond)

caddy-l4/layer4/connection.go

Lines 140 to 141 in ea27408

    
           // read once 
        
           if len(cx.buf) < MaxMatchingBytes {

So, my issues might be unrelated and of a different nature. The gpt solution might have only given it some arbitrary delay too in that spot due to more processing.

I don't know the implications, but if adding 1 or 2ms there fixes it for most of the userbase of the plugin I downstream then I might just patch it in myself, since it seems kinda specific to one operating system.

elee1766 · 2024-10-03T18:05:59Z

This does not seam like the correct fix for failing tls matching. Matching should load more bytes if necessary.
Its more likely that the cause is the removal of loading as much client bytes as possible in prefetch that happened by afa78d7 and thus going multiple times through the matching loop. Which also matches the comments from #250 (the diff there basically restores the old behavior).
I am still using my version from #229 and have not noticed any issues with the large client hellos.
Increasing the chunk size here also doubled the minimum memory usage per connection for everything. Since it is also the minimum allocation size of the buffer. Not sure if we want this. But if now the majority of client hellos is bigger than 1KB it may also not be a bad thing. Tls matching is after all a very popular use case.

yes, I think prefetching beyond the initial buffer is broken somehow in this case, but I don't exactly know why.

this solution isn't correct, I agree. if the other bug is fixed we can consider reverting (although as you state, majority of client hellos will probably be this large, at least for serving browser users)

mholt · 2024-10-03T22:06:02Z

Though strangely even just 1 Millisecond sleep time exactly in this spot fixes it for me.
Adding this before "read once" fixes all of my problems with the tls matcher not matching on freebsd.

time.Sleep(1 * time.Millisecond)

That sounds like a race condition to me... not necessarily a "data race" in the strict sense of what the race detetor would find, but more like a logical race. Either that or something funky in the kernel that depends on time or order of things that aren't concurrent. But if something concurrent/parallel is going on, this is where I've seen Sleeps make a spooky difference.

But anyway, we should probably try to find what is going on and fix it better 🤔

ydylla · 2024-10-07T18:31:32Z

None of these two versions fixed the behavior I had with caddy l4 in freebsd.

I think there are two different issues at play here. First the FreeBSD one.
The first Read on FreeBSD is probably a 0 length read. We should see that in the logs. @Monviech You never posted debug logs of the unmodified broken version (before ea27408). Can you do that? We should see a "prefetched","bytes":0 line.
The 0 length read causes all versions of prefetch to exit without data. Which in turn causes the matching loop to be called a second time. Here is the second bug. If multiple iterations are needed for matching it does not work. I can also reproduce that on Windows at 4f012d4 (before the chunk size increase). This

config.json

{
  "admin": {
    "disabled": true
  },
  "logging": {
    "logs": {
      "default": {"level":"DEBUG", "encoder": {"format":"console"}}
    }
  },
  "apps": {
    "tls": {
      "certificates": {
        "automate": ["localhost"]
      },
      "automation": {
        "policies": [{
          "subjects": ["localhost"],
          "issuers": [{
            "module": "internal"
          }]
        }]
      }
    },
    "layer4": {
      "servers": {
        "https": {
          "listen": ["0.0.0.0:10443"],
          "routes": [
            {
              "match": [
                {"tls": {"sni": ["localhost"]}}
              ],
              "handle": [
                {"handler": "tls"}
              ]
            },
            {
              "handle": [
                {
                  "handler": "proxy",
                  "upstreams": [{"dial": ["127.0.0.1:10080"]}]
                }
              ]
            }
          ]
        }
      }
    },
    "http": {
      "servers": {
        "backend": {
          "protocols": ["h1","h2","h2c"],
          "listen": ["127.0.0.1:10080"],
          "routes": [
            {
              "handle": [{
                "handler": "static_response",
                "status_code": "200",
                "body": "Hello World\n",
                "headers": {
                  "Content-Type": ["text/plain"]
                }
              }]
            }
          ]
        }
      }
    }
  }
}

for example does not work with Chrome. It produces the following logs:

2024/10/07 18:11:22.983 DEBUG   layer4  matching        {"remote": "[::1]:11261", "error": "consumed all prefetched bytes", "matcher": "layer4.matchers.tls", "matched": false}
2024/10/07 18:11:22.983 DEBUG   layer4  prefetched      {"remote": "[::1]:11261", "bytes": 1024}
2024/10/07 18:11:22.983 DEBUG   layer4  matching        {"remote": "[::1]:11261", "error": "consumed all prefetched bytes", "matcher": "layer4.matchers.tls", "matched": false}
2024/10/07 18:11:22.984 DEBUG   layer4.handlers.proxy   dial upstream   {"remote": "[::1]:11261", "upstream": "127.0.0.1:10080"}
2024/10/07 18:11:22.984 DEBUG   layer4  connection stats        {"remote": "[::1]:11261", "read": 1726, "written": 103, "duration": 0.0005173}
2024/10/07 18:11:22.984 DEBUG   layer4  matching        {"remote": "[::1]:11263", "error": "consumed all prefetched bytes", "matcher": "layer4.matchers.tls", "matched": false}
2024/10/07 18:11:22.984 DEBUG   layer4  prefetched      {"remote": "[::1]:11263", "bytes": 1024}
2024/10/07 18:11:22.985 DEBUG   layer4  matching        {"remote": "[::1]:11263", "error": "consumed all prefetched bytes", "matcher": "layer4.matchers.tls", "matched": false}
2024/10/07 18:11:22.985 DEBUG   layer4.handlers.proxy   dial upstream   {"remote": "[::1]:11263", "upstream": "127.0.0.1:10080"}
2024/10/07 18:11:22.985 DEBUG   layer4  connection stats        {"remote": "[::1]:11263", "read": 1822, "written": 103, "duration": 0.0010873}

With Firefox it works. It also works with Chrome on the lazy-prefetch branch.

For the FreeBSD issue ignoring 0 length Read's and just trying to Read again is probably the better fix. In case my theory about that is actually true 😅

ydylla · 2024-10-08T17:20:56Z

It also occurred to me that the config I used above is inherently unreliable, because it uses a route without matcher as second route. That means once the first route fails layer4 will always choose the second one regardless of the first route indicating it needs more data. It is a config that should not really be used. In general one should avoid routes without matchers if it's not the only route. Eagerly loading all data the client sent in prefetch mitigates this, but it's not guaranteed to work.

Monviech · 2024-10-08T17:33:53Z

@ydylla I've built with 4f012d4 and removed the empty route and my config started working without the sleep in Chrome and Edge. I have added the empty route because it was suggested here:

#217

Check the SSH multiplexing detail.

I want to know if thats still required or why it was suggested. Since I maintain a template and I have hardcoded it there due to this example.

If I can remove it, I want to know if theres possible side effects if possible.

https://github.com/opnsense/plugins/blob/5b06e8af1bf832036af19be72ee1051249607933/www/caddy/src/opnsense/service/templates/OPNsense/Caddy/Caddyfile#L92-L102

ydylla · 2024-10-08T18:10:45Z

I want to know if thats still required or why it was suggested. Since I maintain a template and I have hardcoded it there due to this example.

The empty route that is suggested there should indeed not be necessary anymore. It was a workaround for #208 and fixed with afa78d7. I can not guarantee you there are no other side effects. The fallback implemented with afa78d7 has indeed other behavior. It is only called if all matchers came to a final conclusion.
In general, it's best to be explicit. If I read your template correctly you could also use a tls matcher. Because directly after the layer4 wrapper the tls wrapper is specified, so everything not looking like tls would fail there anyway.

ydylla · 2024-10-08T18:28:07Z

after the layer4 wrapper the tls wrapper is specified, so everything not looking like tls would fail there anyway

Which is apparently also not necessarily true. According to the docs the tls wrapper may be or is a noop:

There is a special no-op tls listener wrapper provided as a standard module which marks where TLS should be handled in the chain of listener wrappers. It should only be used if another listener wrapper must be placed in front of the TLS handshake. This does not enable TLS for a server; e.g. if this is used on your :80 HTTP server, it will still act as a no-op.

So using a tls machter may interfere with http -> https redirects. I have never used listener wrappers myself 🤷🏻‍♂️

Monviech · 2024-10-08T18:35:19Z

@ydylla Its a kinda specific but widely used usecase popularized by HA Proxy (I think).

People want to route some domains on their precious IPv4:443 port (since they dont want or can not use the benefits of almost unlimited IPv6 sockets) without TLS termination, and some of them with TLS termination and reverse proxied.

They also want to multiplex their SSH and RDP into 443 while still using the HTTP reverse proxy.

Thats why the listener wrapper seems to be quite important here.

But I could be wrong, actually never tried setting it up differently since I interpreted the configuration examples this way.

If theres a logic error I'm quite interested to find it.

ydylla · 2024-10-08T19:08:02Z

There is no logic error. One can use the listener wrapper for this. My point was I am not the best person to ask for advice for it. I still use configs where layer4 controls the socket and only forward http to the caddy http app listening on a unix socket. Because I find it more logical this way and since my configs are older than the listener wrapper.

Update connection.go

d2be2b9

elee1766 changed the title ~~Update connection.go~~ change prefetch buffer size from 1024 -> 2048 bytes Oct 1, 2024

mholt approved these changes Oct 1, 2024

View reviewed changes

mholt added the bug Something isn't working label Oct 1, 2024

mholt merged commit ea27408 into mholt:master Oct 1, 2024
6 checks passed

Monviech mentioned this pull request Oct 4, 2024

Mitigate TLS matching issues on FreeBSD when X25519Kyber768Draft00 is involved #252

Merged

Monviech mentioned this pull request Oct 26, 2024

www/caddy: Remove default route from layer4 since its obsolete opnsense/plugins#4323

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

change prefetch buffer size from 1024 -> 2048 bytes #248

change prefetch buffer size from 1024 -> 2048 bytes #248

elee1766 commented Oct 1, 2024

mholt left a comment •

edited

Loading

ydylla commented Oct 3, 2024

Monviech commented Oct 3, 2024

elee1766 commented Oct 3, 2024 •

edited

Loading

mholt commented Oct 3, 2024 •

edited

Loading

ydylla commented Oct 7, 2024

ydylla commented Oct 8, 2024

Monviech commented Oct 8, 2024 •

edited

Loading

ydylla commented Oct 8, 2024

ydylla commented Oct 8, 2024

Monviech commented Oct 8, 2024

ydylla commented Oct 8, 2024

change prefetch buffer size from 1024 -> 2048 bytes #248

change prefetch buffer size from 1024 -> 2048 bytes #248

Conversation

elee1766 commented Oct 1, 2024

mholt left a comment • edited Loading

Choose a reason for hiding this comment

ydylla commented Oct 3, 2024

Monviech commented Oct 3, 2024

elee1766 commented Oct 3, 2024 • edited Loading

mholt commented Oct 3, 2024 • edited Loading

ydylla commented Oct 7, 2024

ydylla commented Oct 8, 2024

Monviech commented Oct 8, 2024 • edited Loading

ydylla commented Oct 8, 2024

ydylla commented Oct 8, 2024

Monviech commented Oct 8, 2024

ydylla commented Oct 8, 2024

mholt left a comment •

edited

Loading

elee1766 commented Oct 3, 2024 •

edited

Loading

mholt commented Oct 3, 2024 •

edited

Loading

Monviech commented Oct 8, 2024 •

edited

Loading