
encoding: eliminate unnecessary allocations #5655

Merged: 3 commits merged into algorand:master on Aug 29, 2023

Conversation

@algorandskiy (Contributor) commented on Aug 10, 2023:

Summary

While working on reducing memory requirements I found that about 12% of allocations come from protocol.PutEncodingBuf (see the profile below). This happens because the slice header (pointer to data, len, cap) escapes to the heap each time it is put back into the pool.

[profile screenshot: allocation sources, showing the PutEncodingBuf share]

The fix is to wrap the byte slice in a struct and have pool.New() return a pointer to it. The wrapping struct is allocated on the heap once, so returning the pointer to the pool no longer allocates.
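
For illustration, here is a minimal sketch of that pattern. The wrapper type, buffer size, and helper names below are assumptions for the example, not the exact go-algorand implementation:

```go
package protocol

import "sync"

// encodingBuf wraps the byte slice so the pool stores a pointer to a single
// heap-allocated struct rather than a raw slice header. Storing a plain
// []byte in a sync.Pool copies the slice header into an interface value,
// which allocates on every Put.
type encodingBuf struct {
	data []byte
}

var encodingBufPool = sync.Pool{
	New: func() interface{} {
		return &encodingBuf{data: make([]byte, 0, 1024)}
	},
}

// getEncodingBuf returns a pooled buffer.
func getEncodingBuf() *encodingBuf {
	return encodingBufPool.Get().(*encodingBuf)
}

// putEncodingBuf trims the buffer and returns it to the pool. Because the
// pool holds *encodingBuf, no new allocation happens on this path.
func putEncodingBuf(b *encodingBuf) {
	b.data = b.data[:0]
	encodingBufPool.Put(b)
}
```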

I also noticed that the majority of the calls come from Transaction.ID, so I made a separate pool for it (commit 1).
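
As a rough, hypothetical sketch of how a dedicated pool for the transaction-ID path could look (it reuses the encodingBuf wrapper from the sketch above; the pool name, the hashEncodedTxn helper, and the buffer size are illustrative, not the actual go-algorand code):

```go
package protocol

import (
	"crypto/sha512"
	"sync"
)

// txidBufPool is a hypothetical pool reserved for the transaction-ID path,
// so its buffers stay sized for encoded transactions and do not compete
// with other users of the shared encoding pool.
var txidBufPool = sync.Pool{
	New: func() interface{} {
		return &encodingBuf{data: make([]byte, 0, 4096)}
	},
}

// hashEncodedTxn hashes a domain-separation prefix plus an already-encoded
// transaction, borrowing a buffer from the dedicated pool for the
// concatenation and returning it afterwards.
func hashEncodedTxn(prefix, enc []byte) [sha512.Size256]byte {
	b := txidBufPool.Get().(*encodingBuf)
	b.data = append(append(b.data[:0], prefix...), enc...)
	digest := sha512.Sum512_256(b.data)
	b.data = b.data[:0]
	txidBufPool.Put(b)
	return digest
}
```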

As a result, the total number of allocations decreased by about 10%: 673,713,259 (master) vs 609,152,614 (feature).

Test Plan

Ran a bunch of local tests (2 nodes, target TPS = 3000) comparing master vs commit 1 (separate pool for transaction ID) vs the entire PR (global protocol.PutEncodingBuf fix plus the transaction ID pool).

Number of allocations

Commit 1, PutEncodingBuf allocations (this is a 20-minute run, so absolute numbers differ):
[profile screenshot]

Full PR (feature), PutEncodingBuf allocations (all gone):
[profile screenshot]

Memory charts for the 20-minute run:

[memory charts: master | commit 1 | feature]

Block sizes / TPS

[charts: master | commit 1 | feature]
Note that the released amount is lower, indicating less memory usage / fewer allocations under the same load.

TODO: explain the TPS bump at the beginning and the drop after some rounds.

@zeldovich previously approved these changes on Aug 10, 2023.
@codecov (bot) commented on Aug 10, 2023:

Codecov Report

Merging #5655 (2cfde98) into master (ca420de) will decrease coverage by 0.01%.
The diff coverage is 48.48%.

@@            Coverage Diff             @@
##           master    #5655      +/-   ##
==========================================
- Coverage   55.21%   55.20%   -0.01%     
==========================================
  Files         473      473              
  Lines       66156    66193      +37     
==========================================
+ Hits        36528    36544      +16     
- Misses      27164    27179      +15     
- Partials     2464     2470       +6     
Files Changed Coverage Δ
protocol/codec.go 56.25% <0.00%> (-3.75%) ⬇️
data/transactions/signedtxn.go 33.33% <50.00%> (+1.75%) ⬆️
data/transactions/transaction.go 44.62% <53.12%> (+1.51%) ⬆️
...merklesignature/persistentMerkleSignatureScheme.go 67.94% <100.00%> (+0.41%) ⬆️
data/transactions/teal.go 82.60% <100.00%> (+1.65%) ⬆️

... and 11 files with indirect coverage changes


@iansuvak (Contributor) left a comment:

Changes LGTM code-wise, but the abruptness and flatness of the TPS drop are concerning.

Also, if we are already …

(Review thread on data/transactions/transaction.go, resolved.)
@bbroder-algo (Contributor) left a comment:

really nice work.

@AlgoAxel (Contributor) commented:

You mention having a TODO to investigate the spike in TPS at the start of your experiment. Just wanted to note that the spike looks related to the bimodal histograms of TPS and Tx/Block. So it seems like maybe the blocks at the beginning of your experiment are extra large, potentially due to pingpong doing setup?

@algorandskiy (Contributor, Author) commented:

> So it seems like maybe the blocks at the beginning of your experiment are extra large, potentially due to pingpong doing setup?

No, these blocks are excluded. Plus, there are only 500 accounts => 500 funding transactions, which cannot be responsible for this spike.

@AlgoAxel (Contributor) commented:

I can approve this; since the overall performance seems unaffected, investigating the start of the pingpong run shouldn't need to block this.

However, note that you have ReviewDog errors about exported functions without comments.

@algorandskiy (Contributor, Author) commented:

Rebased / fixed CR comments.

@algorandskiy merged commit 10f2f55 into algorand:master on Aug 29, 2023. 10 checks passed.