Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Different content-length response behaviour on Java 11 vs Java 17 #519

Open
dzhus opened this issue Jan 10, 2025 · 9 comments
Open

Different content-length response behaviour on Java 11 vs Java 17 #519

dzhus opened this issue Jan 10, 2025 · 9 comments

Comments

@dzhus
Copy link

dzhus commented Jan 10, 2025

Using dde9716, I'm observing the following response difference around transfer-encoding + content-length headers between Java 11 and Java 17, just using the wiki example as a minimal case + lein repl.

Java 11

Clojure 1.9.0
OpenJDK 64-Bit Server VM 11.0.21+9

user=> (defn handler [_request] {:status 200 :headers {"Content-Type" "text/html"} :body "Hello World"})
user=> (ring.adapter.jetty/run-jetty handler {:port 3000 :join? false})
% curl -v localhost:3000
* Host localhost:3000 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:3000...
* Connected to localhost (::1) port 3000
> GET / HTTP/1.1
> Host: localhost:3000
> User-Agent: curl/8.7.1
> Accept: */*
> 
* Request completely sent off
< HTTP/1.1 200 OK
< Date: Fri, 10 Jan 2025 10:42:39 GMT
< Content-Type: text/html
< Content-Length: 11
< Server: Jetty(11.0.24)
< 
* Connection #0 to host localhost left intact
Hello World%  

Java 17

Clojure 1.9.0
OpenJDK 64-Bit Server VM 17.0.13+11

user=> (defn handler [_request] {:status 200 :headers {"Content-Type" "text/html"} :body "Hello World"})
user=> (ring.adapter.jetty/run-jetty handler {:port 3000 :join? false})
% curl -v localhost:3000
* Host localhost:3000 was resolved.
* IPv6: ::1
* IPv4: 127.0.0.1
*   Trying [::1]:3000...
* Connected to localhost (::1) port 3000
> GET / HTTP/1.1
> Host: localhost:3000
> User-Agent: curl/8.7.1
> Accept: */*
> 
* Request completely sent off
< HTTP/1.1 200 OK
< Date: Fri, 10 Jan 2025 10:39:31 GMT
< Content-Type: text/html
< Transfer-Encoding: chunked
< Server: Jetty(11.0.24)
< 
* Connection #0 to host localhost left intact
Hello World%   

So Content-Length: 11 in the response switched over to Transfer-Encoding: chunked. Any thoughts on this? Both variants look correct, but does anyone know if this is expected, and if the change is even thanks to ring or something else in the stack (jetty / Java libraries)?

@vincentjames501
Copy link

vincentjames501 commented Feb 17, 2025

I can confirm this same behavior and is a bit unexpected. I suspect this has something to do with using a PrintWriter because if I change the response body to be bytes, I do get content-length in the response.

(extend-protocol StreamableResponseBody
  String
  (write-body-to-stream [body response output-stream]
    (doto (response-writer response output-stream)
      (.write body)
      (.close)))

I think the flushing behavior changed on newer jdks.

For example, if I re-extend String with the following:

(extend-protocol p/StreamableResponseBody
  String
  (write-body-to-stream [body response output-stream]
    (.print output-stream ^String body)
    (.flush output-stream)))

If I manually do the flush (or close the writer which triggers a flush), Transfer-Encoding: chunked is set. If I DO NOT trigger the flush manually (or automatically as done via printwriter), then I get a Content-Length header.

I tried changing to Jetty 12 too via [info.sunng/ring-jetty9-adapter "0.36.1"] (disregard the name, it's using Jetty 12) and it has the same issue. I'm not an HTTP RFC expert but generally find it nice to return content-length whenever possible.

Personally, I think ring-jetty should try to automatically determine/set the content length depending on the body type. For example String/bytes/File/etc we know the content length and should probably set it via (.setContentLength response xxx). The problem is we don't have the raw HttpServletResponse object in the protocol to set it there so seems like we'd want another one or to add it somehow w/ a possible new version/breaking change to ring.util.jakarta.servlet.

@pwinckles
Copy link

This is the commit that introduces the behavior change: openjdk/jdk@69ca2e9

When ring writes to the response it writes to a writer that's constructed as BufferedWriter(OutputStreamWriter(HttpOutput)). The OutputStreamWriter internally uses a StreamEncoder that writes to the underlying stream. StreamEncoder.close() was changed in the above commit to call flush() on the underlying stream, where it used to only write the remaining bytes and then close the stream.

All of that said, why is ring writing to the response like this in the first place? That is to say, why is it building a new writer rather than using the writer from HttpServletResponse.getWriter()? The writer returned from that method does not exhibit this issue.

@weavejester
Copy link
Member

Thanks for the investigation.

Ring uses the OutputStream for the protocol, as an OutputStream is byte-orientated, while a Writer is character-orientated. An OutputStream can be used be used for any data, while a Writer is limited to strings and characters.

It's very odd that flushing the stream should change Jetty's behavior to sending a chunked encoding. I can't immediately think of a reason why that would be useful behavior to have.

@pwinckles
Copy link

Ring uses the OutputStream for the protocol, as an OutputStream is byte-orientated, while a Writer is character-orientated. An OutputStream can be used be used for any data, while a Writer is limited to strings and characters.

That makes sense, but if the body is a String the built-in writer could be used. The issue doesn't exist when you write directly to the HttpOutput, which is what happens in all cases except for String and ISeq.

@vincentjames501
Copy link

One other thing to keep in mind is that Jetty has an internal buffer defined by:

:output-buffer-size     - the response body buffer size (default 32768)

So if you return a byte-array > 32768 bytes it still won't put a content-length header since it filled the first buffer for flushing (I suspect this would be true as well with a large String response). I think there are opportunities where we know the content-length of many objects where we could be automatically suppling them. According to the RFC you SHOULD supply a content-length header when the length is known ahead of time. So if the response object is bytes/File/ByteArrayInputStream/etc we could automatically set the content-length appropriately.

@weavejester
Copy link
Member

That makes sense, but if the body is a String the built-in writer could be used.

Yes, but that would require a new core protocol and an addition to the SPEC, all in order to deal with a particular idiosyncrasy of Jetty.

I think there are opportunities where we know the content-length of many objects where we could be automatically suppling them.

Yes. Reasonably it could be some middleware that sets the Content-Length header for known types, perhaps using a protocol.

@vincentjames501
Copy link

@weavejester , Is it's time to bump to Jetty 12, run in standalone (no servlet), and remove the StreamableResponseBody protocol in favor of one that has access to the raw Response object vs an output stream? Seems like a lot could be copied from https://github.com/sunng87/ring-jetty9-adapter . I realize it'd be a breaking change but likely so would the options passed to configure Jetty.

@weavejester
Copy link
Member

Is it's time to bump to Jetty 12, run in standalone (no servlet),

Updating to Jetty 12 is certainly a goal as soon as I find the time to do so, or someone else does it. Unfortunately, it's a fairly big job.

and remove the StreamableResponseBody protocol

That's not going to happen as that would break backward compatibility.

in favor of one that has access to the raw Response object vs an output stream

I also don't know what you mean by "the raw Response object". Do you mean the Jetty Response object? If so, that wouldn't work with adapters that don't use Jetty.

Remember that Ring needs to work with a wide variety of different backends, so we need to use the lowest level interface that's possible. In words, it needs to be a synchronous stream of bytes, which in Java is represented by the OutputStream class.

I realize it'd be a breaking change but likely so would the options passed to configure Jetty.

The options shouldn't change, and breaking changes would be antithetical to this project's goals.

@vincentjames501
Copy link

@weavejester . Ok yeah I agree with not breaking the protocol and yeah, maybe some sort of wrap-content-length middleware seems appropriate. This is the path we're taking in our current project for the time being.

The options shouldn't change, and breaking changes would be antithetical to this project's goals.

I understand ring.core. Thankfully it does look like most of the options stay the same even if it moved away from the servlet api.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants