
change prefetch buffer size from 1024 -> 2048 bytes #248

Merged
merged 1 commit into from
Oct 1, 2024

Conversation

elee1766
Contributor

@elee1766 elee1766 commented Oct 1, 2024

No description provided.

@elee1766 elee1766 changed the title Update connection.go change prefetch buffer size from 1024 -> 2048 bytes Oct 1, 2024
Owner

@mholt mholt left a comment


This fixes a weird case with hybrid Kyber KEM where the ClientHello is 1.7 KB, and the 1.0 KB buffer size was apparently causing the TLS matcher to not match, and thus the TLS connection to fail.
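For context, a TLS record's total size can be read from its 5-byte record header, which is why a fixed 1024-byte prefetch can fall short of a larger post-quantum ClientHello. A minimal Go sketch; the header bytes below are hypothetical, standing in for a ~1.7 KB hello, and `recordLen` is an illustrative helper, not caddy-l4 code:

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// recordLen returns the total bytes needed to buffer the first TLS record:
// the 5-byte record header plus the payload length encoded in bytes 3..4.
func recordLen(hdr []byte) int {
	return 5 + int(binary.BigEndian.Uint16(hdr[3:5]))
}

func main() {
	// Hypothetical record header for a ~1.7 KB ClientHello:
	// 0x16 = handshake, 0x0301 = legacy version, 0x06C5 = 1733-byte payload.
	hdr := []byte{0x16, 0x03, 0x01, 0x06, 0xC5}
	need := recordLen(hdr)
	fmt.Printf("need %d bytes; fits in 1024: %v; fits in 2048: %v\n",
		need, need <= 1024, need <= 2048)
	// prints: need 1738 bytes; fits in 1024: false; fits in 2048: true
}
```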

Thanks for the excellent debugging work, @elee1766 !

@mholt mholt added the bug Something isn't working label Oct 1, 2024
@mholt mholt merged commit ea27408 into mholt:master Oct 1, 2024
6 checks passed
@ydylla
Collaborator

ydylla commented Oct 3, 2024

This does not seem like the correct fix for the failing TLS matching. Matching should load more bytes if necessary.
It's more likely that the cause is the removal of loading as many client bytes as possible in prefetch, which happened in afa78d7, and thus going through the matching loop multiple times. That also matches the comments from #250 (the diff there basically restores the old behavior).
I am still using my version from #229 and have not noticed any issues with large client hellos.
Increasing the chunk size here also doubled the minimum memory usage per connection for everything, since it is also the minimum allocation size of the buffer. I am not sure we want this. But if the majority of client hellos is now bigger than 1 KB, it may not be a bad thing; TLS matching is, after all, a very popular use case.

@Monviech

Monviech commented Oct 3, 2024

@ydylla

I've tried it out and built it with the hash before afa78d7:
e491c44

And then also with your branch:
905fa81

Neither of these two versions fixed the behavior I had with caddy-l4 on FreeBSD.

Strangely though, even just 1 millisecond of sleep time exactly in this spot fixes it for me.
Adding this before "read once" fixes all of my problems with the TLS matcher not matching on FreeBSD.

time.Sleep(1 * time.Millisecond)

// read once
if len(cx.buf) < MaxMatchingBytes {

So my issues might be unrelated and of a different nature. The GPT solution might have only introduced some arbitrary delay in that spot due to extra processing.

I don't know the implications, but if adding 1 or 2 ms there fixes it for most of the user base of the plugin I ship downstream, then I might just patch it in myself, since it seems quite specific to one operating system.

@elee1766
Contributor Author

elee1766 commented Oct 3, 2024

This does not seem like the correct fix for the failing TLS matching. Matching should load more bytes if necessary.
It's more likely that the cause is the removal of loading as many client bytes as possible in prefetch, which happened in afa78d7, and thus going through the matching loop multiple times. That also matches the comments from #250 (the diff there basically restores the old behavior).
I am still using my version from #229 and have not noticed any issues with large client hellos.
Increasing the chunk size here also doubled the minimum memory usage per connection for everything, since it is also the minimum allocation size of the buffer. I am not sure we want this. But if the majority of client hellos is now bigger than 1 KB, it may not be a bad thing; TLS matching is, after all, a very popular use case.

Yes, I think prefetching beyond the initial buffer is broken somehow in this case, but I don't know exactly why.

This solution isn't correct, I agree. If the other bug is fixed, we can consider reverting (although, as you state, the majority of client hellos will probably be this large, at least when serving browser users).

@mholt
Owner

mholt commented Oct 3, 2024

Strangely though, even just 1 millisecond of sleep time exactly in this spot fixes it for me.
Adding this before "read once" fixes all of my problems with the TLS matcher not matching on FreeBSD.

time.Sleep(1 * time.Millisecond)

That sounds like a race condition to me... not necessarily a "data race" in the strict sense of what the race detector would find, but more like a logical race. Either that, or something funky in the kernel that depends on timing or the order of things that aren't concurrent. But if something concurrent/parallel is going on, this is where I've seen Sleeps make a spooky difference.

But anyway, we should probably try to find what is going on and fix it better 🤔

@ydylla
Collaborator

ydylla commented Oct 7, 2024

Neither of these two versions fixed the behavior I had with caddy-l4 on FreeBSD.

I think there are two different issues at play here. First, the FreeBSD one.
The first Read on FreeBSD is probably a zero-length read. We should see that in the logs. @Monviech, you never posted debug logs of the unmodified broken version (before ea27408). Can you do that? We should see a "prefetched","bytes":0 line.
The zero-length read causes all versions of prefetch to exit without data, which in turn causes the matching loop to be called a second time. And here is the second bug: if multiple iterations are needed for matching, it does not work. I can also reproduce that on Windows at 4f012d4 (before the chunk size increase). This

config.json
{
  "admin": {
    "disabled": true
  },
  "logging": {
    "logs": {
      "default": {"level":"DEBUG", "encoder": {"format":"console"}}
    }
  },
  "apps": {
    "tls": {
      "certificates": {
        "automate": ["localhost"]
      },
      "automation": {
        "policies": [{
          "subjects": ["localhost"],
          "issuers": [{
            "module": "internal"
          }]
        }]
      }
    },
    "layer4": {
      "servers": {
        "https": {
          "listen": ["0.0.0.0:10443"],
          "routes": [
            {
              "match": [
                {"tls": {"sni": ["localhost"]}}
              ],
              "handle": [
                {"handler": "tls"}
              ]
            },
            {
              "handle": [
                {
                  "handler": "proxy",
                  "upstreams": [{"dial": ["127.0.0.1:10080"]}]
                }
              ]
            }
          ]
        }
      }
    },
    "http": {
      "servers": {
        "backend": {
          "protocols": ["h1","h2","h2c"],
          "listen": ["127.0.0.1:10080"],
          "routes": [
            {
              "handle": [{
                "handler": "static_response",
                "status_code": "200",
                "body": "Hello World\n",
                "headers": {
                  "Content-Type": ["text/plain"]
                }
              }]
            }
          ]
        }
      }
    }
  }
}
for example does not work with Chrome. It produces the following logs:
2024/10/07 18:11:22.983 DEBUG   layer4  matching        {"remote": "[::1]:11261", "error": "consumed all prefetched bytes", "matcher": "layer4.matchers.tls", "matched": false}
2024/10/07 18:11:22.983 DEBUG   layer4  prefetched      {"remote": "[::1]:11261", "bytes": 1024}
2024/10/07 18:11:22.983 DEBUG   layer4  matching        {"remote": "[::1]:11261", "error": "consumed all prefetched bytes", "matcher": "layer4.matchers.tls", "matched": false}
2024/10/07 18:11:22.984 DEBUG   layer4.handlers.proxy   dial upstream   {"remote": "[::1]:11261", "upstream": "127.0.0.1:10080"}
2024/10/07 18:11:22.984 DEBUG   layer4  connection stats        {"remote": "[::1]:11261", "read": 1726, "written": 103, "duration": 0.0005173}
2024/10/07 18:11:22.984 DEBUG   layer4  matching        {"remote": "[::1]:11263", "error": "consumed all prefetched bytes", "matcher": "layer4.matchers.tls", "matched": false}
2024/10/07 18:11:22.984 DEBUG   layer4  prefetched      {"remote": "[::1]:11263", "bytes": 1024}
2024/10/07 18:11:22.985 DEBUG   layer4  matching        {"remote": "[::1]:11263", "error": "consumed all prefetched bytes", "matcher": "layer4.matchers.tls", "matched": false}
2024/10/07 18:11:22.985 DEBUG   layer4.handlers.proxy   dial upstream   {"remote": "[::1]:11263", "upstream": "127.0.0.1:10080"}
2024/10/07 18:11:22.985 DEBUG   layer4  connection stats        {"remote": "[::1]:11263", "read": 1822, "written": 103, "duration": 0.0010873}

With Firefox it works. It also works with Chrome on the lazy-prefetch branch.

For the FreeBSD issue, ignoring zero-length Reads and just trying to Read again is probably the better fix, in case my theory about it is actually true 😅

@ydylla
Collaborator

ydylla commented Oct 8, 2024

It also occurred to me that the config I used above is inherently unreliable, because it uses a route without a matcher as the second route. That means once the first route fails, layer4 will always choose the second one, regardless of the first route indicating it needs more data. It is a config that should not really be used. In general, one should avoid routes without matchers unless it's the only route. Eagerly loading all the data the client sent in prefetch mitigates this, but it's not guaranteed to work.
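As an aside, one way to keep an explicit fallback in a config like the one above would be to give the second route a matcher of its own. A sketch, untested, under the assumption that layer4's http matcher with an empty matcher-set list matches any plain HTTP request:

```json
{
  "match": [
    {"http": []}
  ],
  "handle": [
    {
      "handler": "proxy",
      "upstreams": [{"dial": ["127.0.0.1:10080"]}]
    }
  ]
}
```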

@Monviech

Monviech commented Oct 8, 2024

@ydylla I've built with 4f012d4 and removed the empty route, and my config started working without the sleep in Chrome and Edge. I had added the empty route because it was suggested here:

#217

Check the SSH multiplexing detail.

I want to know whether that's still required, and why it was suggested, since I maintain a template and have hardcoded it there because of this example.

If I can remove it, I'd like to know about possible side effects.

https://github.com/opnsense/plugins/blob/5b06e8af1bf832036af19be72ee1051249607933/www/caddy/src/opnsense/service/templates/OPNsense/Caddy/Caddyfile#L92-L102

@ydylla
Collaborator

ydylla commented Oct 8, 2024

I want to know whether that's still required, and why it was suggested, since I maintain a template and have hardcoded it there because of this example.

The empty route that is suggested there should indeed not be necessary anymore. It was a workaround for #208 and was fixed with afa78d7. I cannot guarantee there are no other side effects, though. The fallback implemented in afa78d7 does behave differently: it is only used if all matchers have come to a final conclusion.
In general, it's best to be explicit. If I read your template correctly, you could also use a tls matcher, because directly after the layer4 wrapper the tls wrapper is specified, so everything not looking like TLS would fail there anyway.

@ydylla
Collaborator

ydylla commented Oct 8, 2024

after the layer4 wrapper the tls wrapper is specified, so everything not looking like TLS would fail there anyway

That is apparently also not necessarily true. According to the docs, the tls listener wrapper may be a no-op:

There is a special no-op tls listener wrapper provided as a standard module which marks where TLS should be handled in the chain of listener wrappers. It should only be used if another listener wrapper must be placed in front of the TLS handshake. This does not enable TLS for a server; e.g. if this is used on your :80 HTTP server, it will still act as a no-op.

So using a tls matcher may interfere with HTTP -> HTTPS redirects. I have never used listener wrappers myself 🤷🏻‍♂️

@Monviech

Monviech commented Oct 8, 2024

@ydylla It's a somewhat specific but widely used use case, popularized by HAProxy (I think).

People want to route some domains on their precious IPv4 :443 port without TLS termination (since they don't want, or cannot, use the benefits of almost unlimited IPv6 sockets), and some of them with TLS termination and reverse proxying.

They also want to multiplex their SSH and RDP onto 443 while still using the HTTP reverse proxy.

That's why the listener wrapper seems to be quite important here.

But I could be wrong; I actually never tried setting it up differently, since I interpreted the configuration examples this way.

If there's a logic error, I'm quite interested in finding it.

@ydylla
Collaborator

ydylla commented Oct 8, 2024

There is no logic error. One can use the listener wrapper for this. My point was that I am not the best person to ask for advice about it. I still use configs where layer4 controls the socket and only forwards HTTP to the Caddy http app listening on a Unix socket, because I find it more logical that way and my configs are older than the listener wrapper.
