mysql: do not allocate in parseOKPacket #15067
Conversation
Signed-off-by: Vicent Marti <[email protected]>
Review Checklist
Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.
General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
Can you or @maxenglander create an issue? And it would be nice to have a before/after view of allocations from any of the benchmarks added to the PR description.
Hello! 👋 This Pull Request is now handled by arewefastyet. The current HEAD and future commits will be benchmarked. You can find the performance comparison on the arewefastyet website.
Oops, forgot to tag with the benchmark label. Will post results once they're available.
Seems like the benchmarks on the base of this Pull Request are failing. I will check what's going on in a moment.
Update: I've managed to remove another allocation besides the …
Nice improvement.
Description
While working with some enterprise customers, @maxenglander noticed a very significant number of allocations coming from `readComQueryResponse`. I don't think this is the cause of the increased p99 latencies that the customer was seeing, but it is certainly wasteful and should be fixed.

The allocations happen in `mysql.(*Conn).parseOKPacket`, and there are two sources:

- `&PacketOK{}` instances are created every time the function is called, because the packet is returned as a pointer. This is the 64-byte allocation group seen in the profile, 2.7GB in total.
- The `coder` struct that handles parsing of the packet is allocated on the heap because, when parsing fails (i.e. very rarely), the struct is passed as a `%v` argument to `vterrors.Errorf`. Since `Errorf` is a variadic API, its arguments are passed as interfaces (`any`), which forces the object to be moved to the heap at the point where it is created. This is the 32-byte allocation group seen in the profile, 1.3GB in total (see the sketch below).
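To make the second point concrete, here is a minimal, self-contained sketch of the pattern, not the actual Vitess code: `coderLike` and the error message are hypothetical stand-ins, and `fmt.Errorf` plays the role of `vterrors.Errorf`, which has the same variadic `...any` shape.

```go
package sketch

import "fmt"

// coderLike is a hypothetical stand-in for the packet-parsing struct
// described above; the real type lives in the mysql package.
type coderLike struct {
	data []byte
	pos  int
}

// parseEscapes shows the problematic pattern: the whole struct is passed to a
// variadic ...any formatter, so escape analysis moves it to the heap on every
// call, even though the error branch is almost never taken at run time.
func parseEscapes(in []byte) error {
	c := &coderLike{data: in}
	if len(c.data) == 0 {
		return fmt.Errorf("invalid packet: %v", c) // c is boxed into an interface
	}
	return nil
}

// parseStaysOnStack shows the fix: only the field we actually want to report
// reaches the variadic call, so the struct itself no longer escapes.
func parseStaysOnStack(in []byte) error {
	c := &coderLike{data: in}
	if len(c.data) == 0 {
		return fmt.Errorf("invalid packet: %v", c.data)
	}
	return nil
}
```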
The fixes are as follows:

- Since the `PacketOK` returned from the function is only used as a temporary, we can change the function's signature to take the packet as an argument, as sketched below. Because this is a direct function call without interfaces in the way, that is enough to keep the packet allocated on the stack at all of the function's call sites.
- For the `coder` struct, simply change the function's error returns to receive `data.data` instead of the whole struct. This lets the `&coder{}` initialization at the start of the function stay on the stack. Remember that the Go compiler doesn't necessarily place `&var` constructions on the heap: they can live on the stack if they're found not to escape.

As a result, this PR removes the packet allocation for all queries (not only for queries without results), which should translate into a measurable reduction in small allocations across all the benchmarks we're tracking. Furthermore, the change applies anywhere we use the MySQL clients, not only in the tablets.
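A minimal before/after sketch of that signature change; the `PacketOK` fields and both signatures here are simplified stand-ins, not the exact Vitess code:

```go
package sketch

// PacketOK is a simplified stand-in for the real mysql.PacketOK.
type PacketOK struct {
	affectedRows uint64
	lastInsertID uint64
	statusFlags  uint16
	warnings     uint16
}

// Before: returning a pointer means the value outlives the function's frame,
// so every call heap-allocates a fresh PacketOK.
func parseOKPacketOld(in []byte) (*PacketOK, error) {
	packet := &PacketOK{}
	// ... decode in into packet ...
	_ = in
	return packet, nil
}

// After: the caller supplies the destination. Since this is a direct function
// call with no interfaces in between, escape analysis can keep the PacketOK
// on the caller's stack.
func parseOKPacketNew(packet *PacketOK, in []byte) error {
	*packet = PacketOK{}
	// ... decode in into packet ...
	_ = in
	return nil
}

func callSite(in []byte) (uint64, error) {
	var packet PacketOK // stack-allocated at the call site
	if err := parseOKPacketNew(&packet, in); err != nil {
		return 0, err
	}
	return packet.affectedRows, nil
}
```

Escape decisions like these can be verified with `go build -gcflags='-m'`, which reports which composite literals escape to the heap.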
After the changes, the `parseOKPacket` API is gone from heap profiles, as it is now zero-allocation. `readComQueryResponse` is also gone from the profile, since it was calling `parseOKPacket` indirectly, so that API is now zero-allocation as well. That's another 2.6GB of memory gone.

On the arewefastyet benchmarks, we're once again seeing the long-standing issue (which I'll eventually fix, I swear) where the changes are not directly comparable between PRs: when you reduce memory allocations in Vitess, the resulting GC changes affect throughput. Here you can see very significant memory savings everywhere a MySQL connection is used, which also reduces the CPU usage of Vitess as a whole (total CPU time spent is down), but that results in less QPS, as we saw when introducing the new connection pool in #14034.
Related Issue(s)
Checklist
Deployment Notes