Apparent leak when sending repeated requests to Glacier #475

Open
LeifW opened this issue Jun 27, 2018 · 6 comments

@LeifW
Contributor

LeifW commented Jun 27, 2018

To be able to upload arbitrarily large archives to AWS Glacier, I wrapped its multipart upload in a Conduit of ByteStrings.
However, it still uses more memory at once than the size of the entire archive.
The culprit seems to be the send $ uploadMultipartPart ... call run inside of a Conduit.mapM on the conduit.
http://hackage.haskell.org/package/amazonka-glacier-1.6.0/docs/Network-AWS-Glacier-UploadMultipartPart.html
If I replace uploadMultipartPart uploadId byteRange checksum chunk with:
liftIO $ BS.appendFile "/tmp/out" chunk
or even
liftIO $ runResourceT $ runReaderT (uploadMultipartPart uploadId byteRange checksum chunk) (GlacierEnv env glacierSettings)
the ballooning memory goes away.

@LeifW
Contributor Author

LeifW commented Jun 27, 2018

This also happens to me when using the ByteString conduit in amazonka-s3-streaming.

@LeifW
Contributor Author

LeifW commented Jun 27, 2018

Looking at https://github.com/snoyberg/http-client/blob/9eb92877641db53efa179ea871a51d32989c6f52/http-conduit/Network/HTTP/Conduit.hs#L315, it looks like the result of Client.responseOpen is freed when you call responseBody on the response, or when you runResourceT.
I don't know if responseBody is being called on these responses in Amazonka, but wrapping every request in its own runResourceT fixes the leak. So should I avoid having AWSConstraint or MonadAWS floating around my application, and just immediately wrap each request in runResourceT?
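
To make the hypothesis above concrete, here is a minimal sketch using only base. It is a toy model, not ResourceT or Amazonka: Tracker, Scope, withScope and fakeSend are hypothetical names invented for illustration. It assumes each request registers one resource that is only released when its enclosing scope exits, and compares one scope shared by all requests with one scope per request.

```haskell
import Control.Exception (finally)
import Control.Monad (forM_)
import Data.IORef

-- | Counts resources that are currently held, and the highest count seen.
-- (Hypothetical bookkeeping, only for this illustration.)
data Tracker = Tracker { live :: IORef Int, peakLive :: IORef Int }

newTracker :: IO Tracker
newTracker = Tracker <$> newIORef 0 <*> newIORef 0

-- | A toy stand-in for a resource scope: a list of release actions
-- that are all run when the scope exits.
newtype Scope = Scope (IORef [IO ()])

withScope :: (Scope -> IO a) -> IO a
withScope body = do
  ref <- newIORef []
  body (Scope ref) `finally` (readIORef ref >>= sequence_)

-- | Hypothetical request: acquires one resource and registers its release
-- with the scope, so it stays held until the scope exits.
fakeSend :: Tracker -> Scope -> IO ()
fakeSend t (Scope ref) = do
  n <- atomicModifyIORef' (live t) (\x -> (x + 1, x + 1))
  modifyIORef' (peakLive t) (max n)
  modifyIORef' ref (modifyIORef' (live t) (subtract 1) :)

-- | All requests share one scope: nothing is released until the end.
sharedScopePeak :: Int -> IO Int
sharedScopePeak n = do
  t <- newTracker
  withScope $ \s -> forM_ [1 .. n] $ \_ -> fakeSend t s
  readIORef (peakLive t)

-- | Each request gets its own scope: its resource is released right away.
perRequestScopePeak :: Int -> IO Int
perRequestScopePeak n = do
  t <- newTracker
  forM_ [1 .. n] $ \_ -> withScope (fakeSend t)
  readIORef (peakLive t)

main :: IO ()
main = do
  sharedScopePeak 100 >>= print      -- prints 100
  perRequestScopePeak 100 >>= print  -- prints 1
```

In this model the shared scope holds every resource until the whole loop finishes, while the per-request scope never holds more than one. Whether this is what actually happens inside Amazonka is not established here; the sketch only shows why wrapping each request in its own scope would bound what is retained if it is.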

@LeifW
Contributor Author

LeifW commented Jun 27, 2018

It looks like UploadMultipartPartResponse calls receiveEmpty. Other responses that call receiveXML or receiveJSON call responseBody, but I don't see receiveEmpty doing so.
Of course, the responses to these calls shouldn't be sizeable, or have any body for that matter - you'd think it'd be the requests that are leaking space if anything.

@endgame
Collaborator

endgame commented Oct 4, 2021

Other efforts to use Conduits with multipart uploads also seem to leak memory: axman6/amazonka-s3-streaming#22

Seems like there's no obvious easy fix, so this can go on the "post 2.0" pile.

@endgame
Collaborator

endgame commented Apr 17, 2024

#523 might be another related laziness bug; I suspect we might have to force something during send.

https://www.snoyman.com/blog/2017/01/foldable-mapm-maybe-and-recursive-functions/ makes me suspect we might want/need to provide a "force and ignore result" variant of send.
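
A rough sketch of what "force and ignore result" could mean, using only base. forceAndIgnore and lazySend are hypothetical names, not part of Amazonka; lazySend stands in for a request whose result is left as an unevaluated thunk. The sketch forces only to weak head normal form with evaluate, and a real variant might need deeper forcing.

```haskell
import Control.Exception (ErrorCall, evaluate, try)
import Control.Monad (void)

-- | Hypothetical "force and ignore" wrapper: run the action, force its
-- result to weak head normal form, then discard it.
forceAndIgnore :: IO a -> IO ()
forceAndIgnore action = do
  result <- action
  _ <- evaluate result
  pure ()

-- | A fake request whose result is a thunk that fails if anyone forces it.
lazySend :: IO Int
lazySend = pure (error "forced the response")

-- | True if running the given action raised an ErrorCall.
raises :: IO () -> IO Bool
raises act = do
  r <- try act :: IO (Either ErrorCall ())
  pure (either (const True) (const False) r)

main :: IO ()
main = do
  raises (forceAndIgnore lazySend) >>= print  -- prints True: the thunk was forced
  raises (void lazySend) >>= print            -- prints False: the thunk was never touched
```

The second line of output is the point: discarding a result with void leaves the thunk, and anything it references, unevaluated. Whether forcing the response in send is enough to stop the memory growth reported in this issue is still an open question.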

@endgame
Collaborator

endgame commented Jul 5, 2024

I know it's been a few years, but I don't suppose you still have access to a reproducer for this code? I think the linked snoyberg/http-client#538 is right, and I'm deeply suspicious of how the body is removed from the original request.

Pending a response from the http-client maintainers, we could try pulling out the original request and calling seq on its body in our Amazonka.Response functions?
