Memory: Use `triomphe::Arc` for `SharedReference` #8622
Merged
🟢 Turbopack Benchmark CI successful 🟢
✅ This change can build
This eliminates the unused weakref count in `std::sync::Arc`, saving 64 bits (one `usize` on 64-bit targets) per unique `SharedReference`.

I noticed there are additional potential optimizations we could do here too (not included in this PR), in increasing order of difficulty:

- We can deduplicate `ValueTypeId` by moving it inside the `Arc`.
- We can build a mapping of `std::any::TypeId` to `ValueTypeId`, and avoid storing the `ValueTypeId` entirely.
- We can deduplicate the fat pointer's layout metadata by also storing it inside the `Arc` using the nightly `ptr_metadata` feature, similar to `triomphe::ThinArc` (but that only works for slices of non-DST elements, presumably because they don't want to depend on nightly).
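To make the "64 bits per unique `SharedReference`" figure concrete, here is a minimal sketch of the two heap-header layouts. The structs `StdLikeInner` and `StrongOnlyInner` are hypothetical stand-ins written for illustration; they are not the real types from `std` or `triomphe`, and the sketch assumes both headers are laid out as plain counters followed by the payload.

```rust
use std::mem::size_of;
use std::sync::atomic::AtomicUsize;

/// Hypothetical stand-in for an Arc allocation that tracks both a strong
/// and a weak count ahead of the payload (the `std::sync::Arc` approach).
#[allow(dead_code)]
#[repr(C)]
struct StdLikeInner<T> {
    strong: AtomicUsize,
    weak: AtomicUsize,
    data: T,
}

/// Hypothetical stand-in for an Arc allocation with no weak-reference
/// support: a single count ahead of the payload.
#[allow(dead_code)]
#[repr(C)]
struct StrongOnlyInner<T> {
    count: AtomicUsize,
    data: T,
}

/// Bytes saved per allocation by dropping the weak count, for payload `T`.
fn header_savings<T>() -> usize {
    size_of::<StdLikeInner<T>>() - size_of::<StrongOnlyInner<T>>()
}
```

Calling `header_savings::<usize>()` yields one `usize`, which is 8 bytes on a 64-bit target. For payloads whose alignment is stricter than `usize`, padding can absorb some or all of that saving, and allocator size classes may hide it further, so the real per-allocation effect can be smaller.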
540d318 to d8a54b2
ForsakenHarmony approved these changes Jun 28, 2024
bgw added a commit to vercel/next.js that referenced this pull request Jul 3, 2024

- Tobias Koppers - fix typo (vercel/turborepo#8619)
- Benjamin Woodruff - Store aggregate read/execute count statistics (vercel/turborepo#8286)
- Tobias Koppers - box InProgress task state (vercel/turborepo#8644)
- Tobias Koppers - Task Edges Set/List (vercel/turborepo#8624)
- Benjamin Woodruff - Memory: Use `triomphe::Arc` for `SharedReference` (vercel/turborepo#8622)
- Will Binns-Smith - chore: release npm packages (vercel/turborepo#8614)
- Will Binns-Smith - devlow-bench: add git branch and sha to datapoints (vercel/turborepo#8602)

---

Fixes a `triomphe` package version conflict between turbopack and swc by bumping it from 0.1.11 to 0.1.13.
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Jul 25, 2024

### Description

This eliminates the unused weakref count in `std::sync::Arc`, saving 64 bits per unique `SharedReference`.

I noticed there are additional potential optimizations we could do here too (not included in this PR), in increasing order of difficulty:

- We can deduplicate `ValueTypeId` by moving it inside the `Arc`.
- We can build a mapping of `std::any::TypeId` to `ValueTypeId`, and avoid storing the `ValueTypeId` entirely.
- We can deduplicate the fat pointer's layout metadata by also storing it inside the `Arc` using the nightly `ptr_metadata` feature, similar to `triomphe::ThinArc` (but that only works for slices of non-DST elements, presumably because they don't want to depend on nightly).

### Testing Instructions

Using dhat for measuring the max heap size (vercel/next.js#67166)...

Do a release build:

```
PACK_NEXT_COMPRESS=objcopy-zstd pnpm pack-next --release --features __internal_dhat-heap
```

Start the dev server in shadcn-ui, and try to load the homepage:

```
pnpm i
pnpm --filter=www dev --turbo
curl http://localhost:3003/
```

After curl exits, kill the dev server.

```
pkill -INT next-server
```

**Before (2 Runs):**

```
[dhat]: Teardown profiler
dhat: Total:     3,106,545,900 bytes in 20,113,580 blocks
dhat: At t-gmax: 731,860,693 bytes in 4,924,028 blocks
dhat: At t-end:  731,860,693 bytes in 4,924,028 blocks
```

```
[dhat]: Teardown profiler
dhat: Total:     3,108,036,858 bytes in 20,111,694 blocks
dhat: At t-gmax: 730,059,664 bytes in 4,923,002 blocks
dhat: At t-end:  730,059,536 bytes in 4,923,001 blocks
```

**After (2 Runs):**

```
[dhat]: Teardown profiler
dhat: Total:     3,093,298,170 bytes in 20,127,818 blocks
dhat: At t-gmax: 727,258,939 bytes in 4,923,863 blocks
dhat: At t-end:  727,155,146 bytes in 4,923,901 blocks
```

```
[dhat]: Teardown profiler
dhat: Total:     3,102,661,690 bytes in 20,124,408 blocks
dhat: At t-gmax: 728,190,876 bytes in 4,924,856 blocks
dhat: At t-end:  728,124,976 bytes in 4,924,236 blocks
```

This is [a 0.4426% reduction](https://www.wolframalpha.com/input?i=Percent+change+from+%28731%2C860%2C693%2B730%2C059%2C664%29%2F2+to+%28727%2C258%2C939%2B728%2C190%2C876%29%2F2) in the mean `t-gmax` heap size.
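The 0.4426% figure comes from comparing the mean `t-gmax` of the two "before" runs with the mean of the two "after" runs, as in the linked Wolfram Alpha query. A small sketch that reproduces the arithmetic is below; the function and constant names (`mean`, `percent_reduction`, `gmax_reduction`, `BEFORE_GMAX`, `AFTER_GMAX`) are made up for this illustration, and only the byte counts are taken from the dhat output above.

```rust
/// Mean of two byte counts.
fn mean(a: u64, b: u64) -> f64 {
    (a as f64 + b as f64) / 2.0
}

/// Percent reduction going from `before` to `after` (positive means smaller).
fn percent_reduction(before: f64, after: f64) -> f64 {
    (before - after) / before * 100.0
}

/// `t-gmax` byte counts copied from the two "before" dhat runs.
const BEFORE_GMAX: [u64; 2] = [731_860_693, 730_059_664];
/// `t-gmax` byte counts copied from the two "after" dhat runs.
const AFTER_GMAX: [u64; 2] = [727_258_939, 728_190_876];

/// Percent reduction in mean peak heap size across the four runs.
fn gmax_reduction() -> f64 {
    percent_reduction(
        mean(BEFORE_GMAX[0], BEFORE_GMAX[1]),
        mean(AFTER_GMAX[0], AFTER_GMAX[1]),
    )
}
```

With only two runs per configuration, the result is best read as a rough estimate: the two "before" runs already differ from each other by about 1.8 MB, while the gap between the two means is about 3.2 MB.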
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Jul 29, 2024
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Jul 29, 2024
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Aug 1, 2024
ForsakenHarmony pushed a commit to vercel/next.js that referenced this pull request Aug 16, 2024