Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for opting out of streaming HTTP bodies #68

Merged
merged 16 commits into from
Apr 2, 2025

Conversation

hactar
Copy link
Contributor

@hactar hactar commented Mar 25, 2025

Motivation

swift-openapi-urlsession offers both a streaming and buffering implementation for sending/receiving data. Which implementation is used is currently based on the platform its running on, mainly because "streaming" is not supported on all platforms. It is currently not possible to "force" buffering, even if one wants/needs it.

While adding https://github.com/kean/Pulse to my app, a tool for debugging network requests, I noticed that Pulse was not showing request bodies (apple/swift-openapi-generator#749) - because Pulse appears not do work with the streamed data mode. Neither does https://github.com/pmusolino/Wormholy.

I have now made switching .buffered mode public, allowing developers to choose which mode they want to work with. This fixes the issue I have with Pulse - which previously was not showing any request body.

RocketSim_Screenshot_iPhone_16_Pro_6 3_2025-03-25_14 51 41

Modifications

I made Implementation, platformDefault and an already existing init public. I had to make the Implementation enum Sendable, else Xcode produced a warning.

Result

There are no functional changes, other than making previously private functionality public. Technically someone could now manually try to set "streaming" mode on a platform that doesn't support it, but then an error is thrown correctly.

Test Plan

Tested the change in my own app: switching to buffered mode was now possible, and now tools like Pulse work correctly.

@guoye-zhang
Copy link
Contributor

Would like to avoid exposing the buffer size in the API since those are low level implementation details. Personally I prefer a new initializer such as init(bufferedSession:) (name can be discussed).

@guoye-zhang guoye-zhang added the 🆕 semver/minor Adds new public API. label Mar 25, 2025
@guoye-zhang
Copy link
Contributor

One terminology for this in a recent draft is "incremental", e.g. https://datatracker.ietf.org/doc/draft-kazuho-httpbis-incremental-http/00/

So maybe nonIncrementalSession?

@hactar
Copy link
Contributor Author

hactar commented Mar 26, 2025

Alright, I understand that we want to hide the parameter details of the streamed version. I'll adjust the PR.

As for naming, while "incremental" is a good name, it makes naming the other case difficult - I am not the biggest fan of "negative" naming, which "nonIncremental" would be. Why not stick to .streamed and .buffered (or .streaming and .buffering)? A quick google search tells me this terminology is widely used to describe exactly this.

My suggestion:
Make the Implementation enum private again and the exposed init private again. Instead, create a new enum with no additional parameters:

/// Specifies the mode in which HTTP request and response bodies are processed.
public enum HTTPBodyProcessingMode {
    /// Processes the HTTP body incrementally as bytes become available.
    ///
    /// Use this mode to handle large payloads efficiently or to begin processing
    /// before the entire body has been received. Will throw a `URLSessionTransportError.streamingNotSupported`
    /// error if not available on the platform.
    case streamed

    /// Waits until the entire HTTP body has been received before processing begins.
    ///
    /// Use this mode when it's necessary or simpler to handle complete data payloads at once.
    case buffered
}

Then create a new public init, additionally to the existing one: public init(session: URLSession = .shared, httpBodyProcessingMode: HTTPBodyProcessingMode)

In the init, we then either initialize with the .buffered Implementation or with the .streaming Implementation with the hardcoded buffer size values accordingly.

This way developers can choose between .streamed and .buffered themselves, but cannot use bad buffer sizes by mistake.

The existing init remains and handles the case where devs didn't specify anything, in which case the platformDefault continues to be selected, so nothing changes for existing users or users that do not manually provide a choice.

@hactar
Copy link
Contributor Author

hactar commented Mar 26, 2025

@guoye-zhang adjusted the PR based on your comments. Chose buffered and streamed as the names based on my previous comment, but if you prefer incremental and nonIncremental I can adjust it again.

@guoye-zhang
Copy link
Contributor

Looks fine. Please run swift-format and add a comment.

@hactar
Copy link
Contributor Author

hactar commented Mar 26, 2025

Looks fine. Please run swift-format and add a comment.

Done.

@hactar hactar requested a review from guoye-zhang March 26, 2025 16:39
@guoye-zhang
Copy link
Contributor

@simonjbeaumont What do you think?

@guoye-zhang guoye-zhang added the kind/feature New feature. label Mar 26, 2025
Copy link
Collaborator

@simonjbeaumont simonjbeaumont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@simonjbeaumont What do you think?

I'm not sure about the motivation as a debugging mechanism since it changes the codepath significantly from what would actually be used by default. I'm also concerned about offering such a tempting option for folks who don't consider the implications properly since buffering the whole response body can have a significant impact for large requests.

Do we have an understanding of why this is necessary before we make it?

I haven't used Pulse myself, but it seems to have several documented operating mode. FWICT it provides both a URLSessionProxy and a URLSessionProxyDelegate—presumably you have tried providing a session that has this delegate installed?

Potentially the issue here is that the extension that this OpenAPI transport provides, URLSession.bidirectionalStreamingRequest(...) is installing it's own BidirectionalStreamingURLSessionDelegate as the task delegate. Presumably this means that there are methods on the Pulse delegate that are no longer called.

@guoye-zhang might have some advice here, but we might be able to layer our delegate with whatever existing delegate might be there too, which might avoid the need for this.

If we do go this route, then I provided some feedback on the use of an enum in public API.

It's also changed the behaviour for users trying to use streaming on unsupported platforms. Before, this would fail at the initializer and now it fails on the first request. This is because you provided a new public initializer that bypassed the existing initializer. I'd prefer we refactor to delegate all the calls through one initializer to consolidate the platform support checks.

@@ -70,6 +70,31 @@ public struct URLSessionTransport: ClientTransport {
/// - Parameter session: The URLSession used for performing HTTP operations.
/// If none is provided, the system uses the shared URLSession.
public init(session: URLSession = .shared) { self.init(session: session, implementation: .platformDefault) }
/// Specifies the mode in which HTTP request and response bodies are processed.
public enum HTTPBodyProcessingMode {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We generally do not use enums in public API because, right now, they are not extensible without breaking API. Instead for enum-like public API, we wrap an internal enum in a public struct with public static properties.

@hactar
Copy link
Contributor Author

hactar commented Mar 27, 2025

I'm a bit surprised about the pushback here, so maybe I'm misunderstanding something here:

To my knowledge, iOS urlsession data tasks use "buffered" mode per default. If you go to typical tutorial sites such as https://www.avanderlee.com/concurrency/urlsession-async-await-network-requests-in-swift/ or even apple's own tutorial code from https://developer.apple.com/videos/play/wwdc2021/10095 developers are taught to do this:

Screenshot 2025-03-27 at 11 05 33

which as far as I understand is a buffered request?

For pretty much any non-big request type, this makes perfect sense: things like json aren't usually live parsed, they are only parsed once all data arrives?

So for me, the "streaming" case, the one that swift-openapi-urlsession prefers, is the outlier. Don't get me wrong, I understand that it is better - but shouldn't we allow developers to choose the other option if they need it, especially because its the usual client default for iOS developers using URLSession? If you're really worried we could add more documentation, but developers have to go out of their way to change modes, when no mode is specified, the "best" option is chosen for them as it always was. This change has no effect on existing projects.

I agree with you that finding a way of making the streaming path work with things like pulse and wormholy is an even better option - but I'm worried that no-one has the time to actually go down that rabbit hole - and so blocking this "quick fix" will lead to no solution for the near future?

https://github.com/kean/Pulse has 6.5k stars on Github, wormholy has 2.5k - so I'd argue they are not some niche thing no-one uses - and therefore a fix for this would be good...

All my arguments hinge on my understanding of iOS's let (data, response) = try await URLSession.shared.data(for: request), so please tell me if I'm wrong here 😊

Happy to the make adjustments to remove the public enum if this PR has a chance to be merged.

@simonjbeaumont
Copy link
Collaborator

I'm a bit surprised about the pushback here, so maybe I'm misunderstanding something here
...
but I'm worried that no-one has the time to actually go down that rabbit hole - and so blocking this "quick fix" will lead to no solution for the near future?

Before we go further, I didn't mean my response to be considered pushing back on getting a solution here, or getting one in a timely manner. Asking if we this could/should work with the default is the responsible thing to do. This also does not mean it's an either-or: it would be good to investigate a solution regardless of whether we merge this PR.

So for me, the "streaming" case, the one that swift-openapi-urlsession prefers, is the outlier. Don't get me wrong, I understand that it is better - but shouldn't we allow developers to choose the other option if they need it, especially because its the usual client default for iOS developers using URLSession?

The Swift OpenAPI ecosystem chose a streaming-first approach so it could be suitable for both servers and clients since there are plenty of use cases where streaming will be a requirement, both for server-side and on-device use cases. Putting aside this discovery that the current implementation is not composing with other tooling, I'm not aware why anyone would express buffering as a requirement over streaming.

Returning to this issue: that this doesn't work is meaningful and if there are ways to address it holistically (by altering the implementation) we should.

but I'm worried that no-one has the time to actually go down that rabbit hole - and so blocking this "quick fix" will lead to no solution for the near future?

I'm sympathetic to offering an escape hatch, in the event we identify that the streaming mode cannot be fixed or even if we identify that such a fix would take too much time.

All my arguments hinge on my understanding of iOS's let (data, response) = try await URLSession.shared.data(for: request), so please tell me if I'm wrong here 😊

You have understood that API correctly, but that API is just one of many in URLSession for making requests. There are APIs for doing streaming too and, for the reasons above, we chose to build on those.

Happy to the make adjustments to remove the public enum if this PR has a chance to be merged.

Thank you. If there's no timely solution for streaming, then of course.

@guoye-zhang
Copy link
Contributor

Oh yeah, forgot about the problems with enums. It's probably better to add a dedicated initializer init(bufferedSession:) if we decide to go this route.

@hactar
Copy link
Contributor Author

hactar commented Mar 28, 2025

Thanks for the clarification. I have adjusted the PR to remove the introduced public enum and use public struct instead, and unified the init as requested.

@hactar hactar requested a review from simonjbeaumont March 28, 2025 12:24
@simonjbeaumont
Copy link
Collaborator

simonjbeaumont commented Mar 28, 2025

@hactar thank you for some patience, since I had never tried to use Pulse before. I have now followed the documentation and been able to integrate it in the HelloWorldiOSClientApp example without making any changes to the Swift OpenAPI URLSession transport.

There appears to be several documented approaches in Pulse:

Pulse works on the URLSession level, and it needs access to its callbacks to log network requests and capture network metrics. The framework is modular and provides multiple options that can accommodate almost any system.
— source: https://kean-docs.github.io/pulse/documentation/pulse/networklogging-article

Which one you can use will depend on what other components and libraries are in your stack.

Using the URLSessionProxy is not currently an option since, in this pattern, you make calls through an any URLSessionProtocol, which is a wrapper protocol provided by Pulse.

Using the PulseProxy module worked fine, although it comes with documented caveats because of its use of swizzling.

Using URLSessionProxyDelegate seems to be what we should support since this is called out as the solution when using Pulse with other frameworks. However this has it's own documented caveats:

This method supports a limited subset of scenarios and doesn’t work with URLSession Async/Await APIs.

This currently doesn't work because we do not call the session-wide delegate in addition to the task delegate. The Pulse ConsoleView shows this as a pending request. Presumably this is what you were trying to use, and saw @hactar?

I was however able to get this working with a patch to the OpenAPI URLSessionTransport, where we call into the session-installed delegate in addition to the internal task-installed delegate. This may be something we want to explore. For example, adding the following to the urlSession(_:dataTask:didReceive data), and similar in the other callbacks:

(session.delegate as? any URLSessionDataDelegate)?.urlSession?(session, dataTask: dataTask, didReceive: data)

I also note one more option in the Pulse documentation, which is to use the NetworkLogger type it provides manually. The example it gives is to call those functions in a URLSession delegate so this wouldn't be an option today either, but might be something we could support instead of the previous.

To summarise I think we have two options here:

  1. Change the internals of OpenAPIURLSession.BidirectionalStreamingURLSessionDelegate task delegate to also call the session-delegate callbacks if they are non-nil, either:
    a. Unilaterally; or
    b. Behind an opt-in new config setting to preserve existing behaviour.
  2. Additive API to provide an additional delegate that we'll call in addition to ours. This could be used with the Pulse NetworkLogger as they describe.

We'd need to consider the implications of (1) since it might not make sense to support all callbacks and for (2) we could provide the callbacks that would support this.

WDYT @czechboy0 @hactar @guoye-zhang

EDIT: looking closer at the docs (https://kean-docs.github.io/pulse/documentation/pulse/networklogging-article#Option-4-Manual) I think we'd only need to call the following methods from the session delegate:

  • request(_ request: Request, didCreateTask task: URLSessionTask)
  • urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive data: Data)
  • urlSession(_ session: URLSession, task: URLSessionTask, didFinishCollecting metrics: URLSessionTaskMetrics)
  • urlSession(_ session: URLSession, task: URLSessionTask, didCompleteWithError error: Error?)

So I'm more in the option (1) camp. I'll see if that works.

@simonjbeaumont
Copy link
Collaborator

The URLSession documentation says that the task delegate is called before the session delegate:

This task-specific delegate receives messages from the task before the session’s delegate receives them.

— source: https://developer.apple.com/documentation/foundation/urlsessiontask/3746977-delegate#discussion

Does this mean that URLSession itself should call the session delegate after the task delegate for each of the callbacks or that the task delegate should do this? //cc @guoye-zhang

@simonjbeaumont
Copy link
Collaborator

Looks like these need to be manually called:

    func test() async throws {
        final class Delegate: NSObject, URLSessionDelegate, URLSessionTaskDelegate, URLSessionDataDelegate {
            let name: String
            let finished: XCTestExpectation

            init(name: String) {
                self.name = name
                self.finished = XCTestExpectation(description: name)
            }

            func urlSession(_ session: URLSession, dataTask: URLSessionDataTask, didReceive data: Data) {
                print("\(name): \(#function)")
                guard let outie = (session.delegate as? any URLSessionDataDelegate), outie !== self else { return }
                outie.urlSession?(session, dataTask: dataTask, didReceive: data)
            }

            func urlSession(_ session: URLSession, task: URLSessionTask, didFinishCollecting metrics: URLSessionTaskMetrics) {
                print("\(name): \(#function)")
                guard name == "innie" else { return }
                guard let outie = (session.delegate as? any URLSessionTaskDelegate), outie !== self else { return }
                outie.urlSession?(session, task: task, didFinishCollecting: metrics)
            }

            func urlSession(_ session: URLSession, task: URLSessionTask, didCompleteWithError error: Error?) {
                print("\(name): \(#function)")
                self.finished.fulfill()
                guard let outie = (session.delegate as? any URLSessionTaskDelegate), outie !== self else { return }
                outie.urlSession?(session, task: task, didCompleteWithError: error)
            }
        }

        let outie = Delegate(name: "outie")
        let innie = Delegate(name: "innie")

        let session = URLSession(configuration: .default, delegate: outie, delegateQueue: nil)

        let task = session.dataTask(with: URL(string: "http://localhost:8080/api/greet?name=Si")!)
        task.delegate = innie
        task.resume()

        await fulfillment(of: [innie.finished, outie.finished])
    }
Test Case '-[HelloWorldiOSClientAppTests.URLSessionDelegatePrecedenceTests test]' started.
innie: urlSession(_:dataTask:didReceive:)
outie: urlSession(_:dataTask:didReceive:)
innie: urlSession(_:task:didFinishCollecting:)
outie: urlSession(_:task:didFinishCollecting:)
innie: urlSession(_:task:didCompleteWithError:)
outie: urlSession(_:task:didCompleteWithError:)
Test Case '-[HelloWorldiOSClientAppTests.URLSessionDelegatePrecedenceTests test]' passed (0.045 seconds).

@simonjbeaumont
Copy link
Collaborator

OK, so I think this can be addressed with this pretty targeted patch to make Pulse work for streaming:

diff --git a/Sources/OpenAPIURLSession/URLSessionBidirectionalStreaming/BidirectionalStreamingURLSessionDelegate.swift b/Sources/OpenAPIURLSession/URLSessionBidirectionalStreaming
/BidirectionalStreamingURLSessionDelegate.swift
index f457582..f64e947 100644
--- a/Sources/OpenAPIURLSession/URLSessionBidirectionalStreaming/BidirectionalStreamingURLSessionDelegate.swift
+++ b/Sources/OpenAPIURLSession/URLSessionBidirectionalStreaming/BidirectionalStreamingURLSessionDelegate.swift
@@ -156,6 +156,8 @@ final class BidirectionalStreamingURLSessionDelegate: NSObject, URLSessionTaskDe
                 debug("Task delegate: response stream consumer terminated, cancelling task")
                 dataTask.cancel()
             }
+            guard let outie = (session.delegate as? any URLSessionDataDelegate), outie !== self else { return }
+            outie.urlSession?(session, dataTask: dataTask, didReceive: data)
         }
     }

@@ -179,6 +181,8 @@ final class BidirectionalStreamingURLSessionDelegate: NSObject, URLSessionTaskDe
                 responseContinuation = nil
             }
         }
+        guard let outie = (session.delegate as? any URLSessionTaskDelegate), outie !== self else { return }
+        outie.urlSession?(session, task: task, didCompleteWithError: error)
     }
 }

If we determined this a sensible thing to do, with or without a config option, presumably that would suffice @hactar ?

@hactar
Copy link
Contributor Author

hactar commented Mar 28, 2025

@simonjbeaumont Thank you for taking the time to do the deep dive. It's very appreciated.

Using the URLSessionProxy is not currently an option since, in this pattern, you make calls through an any URLSessionProtocol, which is a wrapper protocol provided by Pulse.

Using the PulseProxy module worked fine, although it comes with documented caveats because of its use of swizzling.

Using URLSessionProxyDelegate seems to be what we should support since this is called out as the solution when using Pulse with other frameworks. However this has it's own documented caveats:

This method supports a limited subset of scenarios and doesn’t work with URLSession Async/Await APIs.

This currently doesn't work because we do not call the session-wide delegate in addition to the task delegate. The Pulse ConsoleView shows this as a pending request. Presumably this is what you were trying to use, and saw @hactar?

I've been using the PulseProxy module via NetworkLogger.enableProxy(), the main reason is that because it swizzles itself in, I can see calls from 3rd party libraries where I don't control the session - which is not possible via the 2 other methods.

I did also try the URLSessionProxy option for testing, because it does expose its session via .session, and I tried feeding that to the URLSessionTransport.Configuration - but all requests end up in a pending state here.

I did not try URLSessionProxyDelegate as I didn't see a way of injecting this delegate into the OpenAPI URLSession - and the delegate actually being called.

Not in front of my dev computer at the moment, but will try your BidirectionalStreamingURLSessionDelegate.swift changes in my app later tonight, see if it works in my project and will report back. Thanks again!

@simonjbeaumont
Copy link
Collaborator

I've been using the PulseProxy module via NetworkLogger.enableProxy(), the main reason is that because it swizzles itself in, I can see calls from 3rd party libraries where I don't control the session - which is not possible via the 2 other methods.

Are you saying this did not work for you? This one worked fine for me with no changes.

@guoye-zhang
Copy link
Contributor

The URLSession documentation says that the task delegate is called before the session delegate:

This task-specific delegate receives messages from the task before the session’s delegate receives them.
— source: https://developer.apple.com/documentation/foundation/urlsessiontask/3746977-delegate#discussion

Does this mean that URLSession itself should call the session delegate after the task delegate for each of the callbacks or that the task delegate should do this? //cc @guoye-zhang

The phrasing is a bit misleading. If a method is implemented on the task delegate, it won't be called on the session delegate.

@guoye-zhang
Copy link
Contributor

diff --git a/Sources/OpenAPIURLSession/URLSessionBidirectionalStreaming/BidirectionalStreamingURLSessionDelegate.swift b/Sources/OpenAPIURLSession/URLSessionBidirectionalStreaming
/BidirectionalStreamingURLSessionDelegate.swift
index f457582..f64e947 100644
--- a/Sources/OpenAPIURLSession/URLSessionBidirectionalStreaming/BidirectionalStreamingURLSessionDelegate.swift
+++ b/Sources/OpenAPIURLSession/URLSessionBidirectionalStreaming/BidirectionalStreamingURLSessionDelegate.swift
@@ -156,6 +156,8 @@ final class BidirectionalStreamingURLSessionDelegate: NSObject, URLSessionTaskDe
                 debug("Task delegate: response stream consumer terminated, cancelling task")
                 dataTask.cancel()
             }
+            guard let outie = (session.delegate as? any URLSessionDataDelegate), outie !== self else { return }
+            outie.urlSession?(session, dataTask: dataTask, didReceive: data)
         }
     }

@@ -179,6 +181,8 @@ final class BidirectionalStreamingURLSessionDelegate: NSObject, URLSessionTaskDe
                 responseContinuation = nil
             }
         }
+        guard let outie = (session.delegate as? any URLSessionTaskDelegate), outie !== self else { return }
+        outie.urlSession?(session, task: task, didCompleteWithError: error)
     }
 }

Let's also forward the didReceiveResponse callback since that's probably needed

@hactar
Copy link
Contributor Author

hactar commented Mar 28, 2025

I've been using the PulseProxy module via NetworkLogger.enableProxy(), the main reason is that because it swizzles itself in, I can see calls from 3rd party libraries where I don't control the session - which is not possible via the 2 other methods.

Are you saying this did not work for you? This one worked fine for me with no changes.

90% of the view works, but not everything. Requests are logged. Their state is reported correctly. But the request body and the response body do not work correctly. Here are some screenshots.

".streamed" mode:
Notice how in the top right it says: body 924 bytes - yet in the Response Body part the Response Body is not tappable and says empty. The top right is correct, there was a response body.
And if you tap the Request Body - it doesn't show you anything - it claims a file was posted, it was not a file, it was json via openapi generator.

".buffered" mode:
When I switch to .buffered mode, everything is as expected, the request body is tapable and renders as json, and the response body is tapable and renders as json.


@hactar
Copy link
Contributor Author

hactar commented Mar 28, 2025

So I've tested adding @simonjbeaumont and @guoye-zhang proposed outie (is this a Severence reference? 😊) changes to BidirectionalStreamingURLSessionDelegate. The good news is that the requests are now no longer pending in the log, but are correctly marked as complete or failed - so thats fixed. The bad news is that, as with the PulseProxy option, request data and response data does not correctly appear in the PulseUI - like in the .streamed version of the screenshots above.

@simonjbeaumont
Copy link
Collaborator

@hactar thanks for giving it a try and reporting back. I have now reproduced what you're seeing. When I tested previously it was with a GET request and it all worked fine. But when switching to a POST request, with a request body, I can see that Pulse is unable to show me the request body and says it was streamed from a file.

This all adds up since we have this logic inside the OpenAPI URLSession transport:

        let task: URLSessionTask
        if requestBody != nil {
            task = uploadTask(withStreamedRequest: urlRequest)
        } else {
            task = dataTask(with: urlRequest)
        }

IMO we've now found the local maximum for how we can play nicely with Pulse but that it won't allow the same visibility as using buffered URLSession APIs. I'm pretty confident based on what we've found so far that this would be the same if an adopter was driving the URLSession streamed APIs manually though, and that the situation is not worsened by the use of the OpenAPI transport layer.

So, I think the current open questions are:

  1. Should we make the change to call the session delegate from our task delegate? IIUC without this Pulse does work with the swizzling NetworkProxy approach but does not work with the session delegate approach? (I realise that the definition of "work" here is still limited to not having visibility into the streamed bodies).
  2. Should we support opting out of streaming under the hood? IIUC this would allow for full visibility of the request/response bodies with Pulse.

I'm open to both of these changes.

If you can confirm that (1) makes no meaningful improvement over doing nothing (because folks can still use the NetworkProxy then we can leave it.

I appreciate you giving us the time to think this through.

WDYT @guoye-zhang @czechboy0 ?

@hactar
Copy link
Contributor Author

hactar commented Mar 31, 2025

So, I think the current open questions are:

  1. Should we make the change to call the session delegate from our task delegate? IIUC without this Pulse does work with the swizzling NetworkProxy approach but does not work with the session delegate approach? (I realise that the definition of "work" here is still limited to not having visibility into the streamed bodies).

I have no strong opinion here. It's an improvement, but doesn't fully work. As Pulse is a debugging tool, developers would most likely need to see the whole request - a solution that goes half the way is of course better than nothing, but we appear to have an alternative to offer "full" support via buffering. Also, streaming really becomes the better option for larger amounts of data - usually in binary form. Seeing that in Pulse is probably less useful, Pulse is more about debugging human readable data thats being transmitted. I therefore I would try to answer two questions:

  • Could adding this break something else?
  • Does adding this making maintaining it more difficult in the long run?

If the answer is no to both, then sure, why not. But else, maybe the disadvantages would then outweigh the advantages for something that might not be that useful to developers anyway?

  1. Should we support opting out of streaming under the hood? IIUC this would allow for full visibility of the request/response bodies with Pulse.

I would argue yes, especially as we found no full solution via path (1). I don't think its likely that developers would enable buffered mode "by mistake" as streaming is still the default where available - and even they do, unless they are transmitting large amounts of data, I'd assume the actual performance difference is negligible - especially for clients. I don't think that when someone is posting 100 KB json, and receiving a 100 KB json response, buffering is going them hit me hard (especially since thats what most devs are taught to do client side on iOS - I was actually pleasantly surprised to see that openapi generator streams here). For some platforms there isn't a streamed implementation, and in these cases buffered seems to be doing a good enough job - so I don't think buffering is that much of a dangerous option.

If you can confirm that (1) makes no meaningful improvement over doing nothing (because folks can still use the NetworkProxy then we can leave it.

For my projects, fixing (1) would have the advantage that I wouldn't have to remember to switch to .buffered mode for debugging - but thats pretty much it. The underlying issue (debugging network calls with a good UI) is fully solved by (2) - and I can even add a switch to my apps to enable "buffered debug mode" when I want to debug network calls, while using .streamed in production mode where I don't want to enable Pulse anyway.

I appreciate you giving us the time to think this through.

The appreciation is mutual. It's very interesting watching the discussion and seeing the proposed changes for (1), I'm learning a lot about the fine details of URLSession.

@czechboy0
Copy link
Contributor

Now that we seem to have a pretty good understanding of the options, I'm happy for us to add a config option to force buffering. I didn't love the spelling of the enum, which would allow for things like asking for streaming even on platforms that don't support it, but if we spelled it e.g. forceBuffering: true or similar, meaning that true would downgrade to buffering everywhere, and false would let the platform default make the decision, that'd work for me.

Copy link
Collaborator

@simonjbeaumont simonjbeaumont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, let's go ahead and support opting out of streaming.

I left one comment on the current state of the patch, which I think has something left over from an earlier design.

The remaining question of is around the API and whether we allow users to request something that would cause a runtime error.

Previously this enum was entirely internal and, while selecting streaming on a platform that doesn't support it would throw an error, there was no public API to do so, so no user code could lead to this.

Should we prefer an enum-like option that supports just .platformDefault, and .buffering; and NOT .streaming in this case?

This comment is expressing similar, presuming I read it correctly, @czechboy0?

Comment on lines 378 to 381
return .defaultStreaming
}
static var defaultStreaming: Self {
.streaming(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part of the patch seems unnecessary now?

@czechboy0
Copy link
Contributor

Should we prefer an enum-like option that supports just .platformDefault, and .buffering; and NOT .streaming in this case?

Yup that's great.

@hactar
Copy link
Contributor Author

hactar commented Apr 1, 2025

Cool. I'll make the proposed changes.

@hactar
Copy link
Contributor Author

hactar commented Apr 1, 2025

Done.

@hactar hactar requested a review from simonjbeaumont April 1, 2025 11:19
@hactar
Copy link
Contributor Author

hactar commented Apr 1, 2025

Tests should be fixed now - I re-introduced the private init, as thats required for the tests which want for force .streamed, which is not possible via our public init.

BTW, on my mac, to be able to run tests locally, I had to add import NIOHTTP1 to TaskCancellationTests.swift for the tests to compile? I did not push this change as its unrelated and the CI appears to work without it?

@@ -67,32 +67,52 @@ public struct URLSessionTransport: ClientTransport {
public var session: URLSession

/// Creates a new configuration with the provided session.
/// - Parameter session: The URLSession used for performing HTTP operations.
/// If none is provided, the system uses the shared URLSession.
public init(session: URLSession = .shared) { self.init(session: session, implementation: .platformDefault) }
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding new parameters to existing functions is an API break, as caught by CI.

1 breaking change detected in OpenAPIURLSession:
  💔 API breakage: constructor URLSessionTransport.Configuration.init(session:) has been removed

We'd need to preserve this symbol that calls through to the new one.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand the issue.

The new API provides public init(session: URLSession = .shared, httpBodyProcessingMode: HTTPBodyProcessingMode = .platformDefault) - and as there is a default parameter for httpBodyProcessingMode, doesn't Swift auto generate a public init(session: URLSession) for us?

The following code in my app compiles fine:

    func testFunction() async throws {
        
        let sessionConfiguration = URLSessionConfiguration.default
        let otherSession = URLSession(configuration: sessionConfiguration)
        
        let transportConfiguration = URLSessionTransport.Configuration.init(session: .shared)
        let otherTransportConfiguration = URLSessionTransport.Configuration.init(session: otherSession)
        
        
        let transport1 = URLSessionTransport(configuration: transportConfiguration)
        let transport2 = URLSessionTransport(configuration: otherTransportConfiguration)
        
        let client1 = Client(serverURL: URL(string: "https://example.com")!, transport: transport1)
        let client2 = Client(serverURL: URL(string: "https://example.com")!, transport: transport2)
        
        _ = try await client1.get_sol_Appversions_sol_All()
        _ = try await client2.get_sol_Appversions_sol_All()
    }

So to me, there still is an (implicit) URLSessionTransport.Configuration.init(session:) - isn't this API breakage error a false positive?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a false positive. It's possible to refer to the function itself (e.g. let initializer = URLSessionTransport.Configuration.init(session:)) which won't work if the function is no longer present.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. Interesting, I wasn't aware of this limitation of default parameter values. I have added back the old single parameter init.

@hactar hactar requested a review from simonjbeaumont April 1, 2025 20:27
Copy link
Collaborator

@simonjbeaumont simonjbeaumont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM now. Thanks for contributing here @hactar — both with the change and to the discussion.

@simonjbeaumont simonjbeaumont changed the title Make URLSessionTransport.Configuration implementation choice public Add support for opting out of streaming HTTP bodies Apr 2, 2025
@simonjbeaumont simonjbeaumont merged commit ea1f9d3 into apple:main Apr 2, 2025
26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature New feature. 🆕 semver/minor Adds new public API.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants