-
Notifications
You must be signed in to change notification settings - Fork 330
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding initial fix for bug issue #1967 #2267
base: main
Are you sure you want to change the base?
Conversation
@blperf please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.
Contributor License AgreementContribution License AgreementThis Contribution License Agreement (“Agreement”) is agreed to by the party signing below (“You”),
|
handlerMiddlewareFactory.createHandlerMiddleware()( | ||
new Context(res.locals, this.contextPath, request, response), | ||
newContext, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason change this?
Hi @blperf , Really appreciate your contribution. I tried to test the change with code like:
The
When the blob size is large enough, it will still meet the error. From what I investigated, to fully fix the issue, we may need to change to "open stream, read it and close handle" when sending response body. Do you have any thoughts? Thanks |
Hi @EmmaZhu, I see what you mean, I'll summarize to the best of my current understanding what is going on here. Before the fix in this PR, even smaller files when accessed too many times (using blobClient.download for example) would eventually result in enough open file descriptors to hit the OS limit. Because they were never closed, this would inevitably occur at some point with enough requests regardless of the file size, but happened sooner for larger files because more extents are being read per request. A less likely but still valid scenario you have brought up is, what happens if one single file is large enough that the number extents that needs to be read during a single blobClient.download request creates more open file descriptors than the OS limit? I think you've already correctly pointed out the only way to avoid this scenario, which would be to close the file descriptor immediately after reading each extent before sending it back to the client. Do you have any links you could share from your research on how this is typically implemented? Do you have any other thoughts on the best way to implement this? I'm wondering if there's some way to do this without changing the return type of readExtent, or if the return type would need to be changed. I'm not sure if it's possible to return a NodeJS.ReadableStream without keeping a file descriptor open, maybe it is somehow by reading from a value buffered in memory? If you already have some other code for this, please feel free to share a PR link and I would be happy to abandon this one if another better fix is available. I'm certainly willing to verify another fix resolves this issue on my side as well. If not I can potentially try to help rewrite this PR to something along the lines of the strategy mentioned above. |
@blperf Thanks a lot for such detailed summarize. Yes, that's what I mean. For the solution, I'm think of implementing a readableStream to only open file when it needs to read content and close the file right after the reading is completed to replace stream returned here: https://github.com/Azure/Azurite/blob/main/src/common/persistence/FSExtentStore.ts#L334 Another way to fix it can be: having a file path array instead of a readableStream array, and only create a readable stream to a specific extent file when needs to read its content. There might be other ways, as long as it only access the file when really necessary. It may require more changes to completely fix the issue. We'd really appreciate if you are interested and available in fixing it. If not, I'll be able to work on it in several days. Thanks |
Hi @EmmaZhu, Happy new year, I hope all is well. Have you by any chance had time to take a look at this? I unfortunately haven't had time to try implementing your proposed solution, but I can confirm I have also encountered some other errors eventually using the fix in this PR. It takes much longer before the issues arise, but unfortunately it seems like this PR wasn't a complete fix. Perhaps you can start a new PR with a draft of the implementation you had in mind? I'm happy to do my best to contribute to the PR and help out where I can and can also test things out on my end to help ensure the original bug is resolved. |
Thanks for contribution! Please go through following checklist before sending PR.
PR Branch Destination
main
branch.legacy-dev
branch.Always Add Test Cases
Make sure test cases are added to cover the code change.
Add Change Log
Add change log for the code change in
Upcoming Release
section inChangeLog.md
.Development Guideline
Please go to CONTRIBUTION.md for steps about setting up development environment and recommended Visual Studio Code extensions.