Skip to content

4-unrolled.js has unnecessary allocations #1

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
vird opened this issue Apr 12, 2025 · 13 comments
Open

4-unrolled.js has unnecessary allocations #1

vird opened this issue Apr 12, 2025 · 13 comments

Comments

@vird
Copy link

vird commented Apr 12, 2025

FixedCircularBuffer allocates once.
UnrolledQueue allocates each time QueueNode overflows
Object pool for QueueNode is needed to reduce allocations

@tshemsedinov
Copy link
Member

you won't believe it, I'm preparing such an example, just now, but but with one spare buffer, not a pool

@tshemsedinov
Copy link
Member

@vird moreover, we can use list of nodes as a pool, to avoid creating new machinery for it, so list will have 3 refs: tail, current, head. We will enqueue to current. After removing from tail will move to head.

@vird
Copy link
Author

vird commented Apr 12, 2025

Object pool with linked list storage instead of array for me is more or less equivalent by effect, so I'm still calling it object pool.
Since we already have fields for linked list we indeed can reuse them

@tshemsedinov
Copy link
Member

@vird See new implementations: 5, 6, 7

@vird
Copy link
Author

vird commented Apr 15, 2025

7-current.js

  1. UnrolledQueue has logic mistake
    https://github.com/HowProgrammingWorks/Queue/blob/main/JavaScript/7-current.js#L50
#poolSize = 2;

const { nodeSize, poolSize } = options;
if (poolSize) this.#poolSize = poolSize;

// so if poolSize will be not in options this will fail
for (let i = 1; i < poolSize; i++) {

Same with nodeSize

  1. https://github.com/HowProgrammingWorks/Queue/blob/main/JavaScript/7-current.js#L68
    Should be ({size:this.#nodeSize})

all 5, 6, 7
3. QueueNode doesn't reset length (found by o3-mini-high)

  1. Huge performance penalty
    https://github.com/HowProgrammingWorks/Queue/blob/main/JavaScript/6-pool.js#L45
    instead of shift, pop should be used
    Reason 1. o(n) shift vs o(1) pop
    Reason 2. We pick more cold CPU cache instance with shift. We can pick more hot instance with pop. Shift will cause more intensive CPU cache fill/eviction (this is important for spare implementation in case of linked list)

  2. (Just style) 6-pool.js
    class Pool should have all fields private, since they are not used (and since we have private fields in general in code)

Similar for QueueNode (most fields can be marked as private)

  1. (Just style) 6-pool.js
    It's really hard to imagine adding to head and taking from tail

typical way to display list is head -> .. -> tail which cause this illustration
[0 1 2 ()] [4 5 6 7]
So actual position for insert is ()

  1. (optional) maybe spare implementation is far from effective. #spare can actually be linked list (so it will store multiple elements without need to have dedicated array as object pool has; if we want to control spare linked list length we can have +1 field). We already have next field in QueueNode

  2. (optional) nodeSize and poolSize have no protection from invalid values. e.g. for 7-current poolSize = 1 is probably invalid value. Core datastructures should throw exceptions as new Array(-1) does

  3. possible logic error in 7-current (noted by o3-mini-high)
    This linked list constructed as circular https://github.com/HowProgrammingWorks/Queue/blob/main/JavaScript/7-current.js#L54
    But when we call reset we actually break circular link
    https://github.com/HowProgrammingWorks/Queue/blob/main/JavaScript/7-current.js#L86
    Because reset makes this.next = null

this.#tail.next = node; this will only recover if this.#tail === this.#current

  1. (Just style) 7-current. For circular linked list do we need tail at all? We have only read node and write node. This will be much more understandable (currently we have head, tail and current)

tshemsedinov added a commit that referenced this issue Apr 15, 2025
Refs: #1
@tshemsedinov
Copy link
Member

all 5, 6, 7
3. QueueNode doesn't reset length (found by o3-mini-high)

In all listed implementations we call reset only when length already === 0

@tshemsedinov
Copy link
Member

(Just style) 6-pool.js
It's really hard to imagine adding to head and taking from tail

QueueNode and Pool are internal classes in this implementation, we use private fields just for external interface of Queue. But in https://github.com/metarhia/metautil we have Pool as a separate abstraction, so we will reuse it https://github.com/metarhia/metautil/blob/master/lib/pool.js

@tshemsedinov
Copy link
Member

(optional) maybe spare implementation is far from effective. #spare can actually be linked list (so it will store multiple elements without need to have dedicated array as object pool has; if we want to control spare linked list length we can have +1 field). We already have next field in QueueNode

QueueNode is big enough 2048 by default, in most cases we do not need pool, just single spare buffer of 2048 elements is ok, it is subject of performance tests to detect best solution for certain cases

@tshemsedinov
Copy link
Member

@vird gpt just can't understand combined logic of queue and pool, we use #tail as a pool

@tshemsedinov
Copy link
Member

┌──────────┬──────────┬───────────┐
│ (index)  │ cpu      │ memory    │
├──────────┼──────────┼───────────┤
│ queue    │ '576.54' │ '5524.62' │
│ naïve    │ '12.06'  │ '7620.13' │
│ fixed    │ '6.32'   │ '6045.08' │
│ unrolled │ '6.46'   │ '6045.98' │
│ spare    │ '6.72'   │ '5896.03' │
│ pool     │ '6.51'   │ '6029.93' │
│ circular │ '5.67'   │ '5510.35' │
│ nodiv    │ '4.45'   │ '4093.59' │
└──────────┴──────────┴───────────┘

@vird
Copy link
Author

vird commented Apr 16, 2025

My results differ a lot
AMD Ryzen 9 5900X 12-Core Processor
node v18.19.1

┌──────────┬───────────┬───────────┐
│ (index)  │    cpu    │  memory   │
├──────────┼───────────┼───────────┤
│  queue   │ '9055.41' │ '4980.80' │
│  naïve   │  '21.13'  │ '7510.66' │
│  fixed   │  '6.23'   │ '4794.04' │
│ unrolled │  '10.55'  │ '4560.13' │
│  spare   │  '9.16'   │ '5081.48' │
│   pool   │  '12.10'  │ '5040.23' │
│ circular │  '11.82'  │ '4906.48' │
│  nodiv   │  '14.35'  │ '4413.55' │
└──────────┴───────────┴───────────┘

node v22.13.1

┌──────────┬───────────┬───────────┐
│ (index)  │ cpu       │ memory    │
├──────────┼───────────┼───────────┤
│ queue    │ '6181.56' │ '4404.51' │
│ naïve    │ '12.12'   │ '7651.69' │
│ fixed    │ '12.99'   │ '4668.91' │
│ unrolled │ '8.47'    │ '6049.08' │
│ spare    │ '7.85'    │ '6051.26' │
│ pool     │ '6.84'    │ '6045.61' │
│ circular │ '5.38'    │ '5503.26' │
│ nodiv    │ '4.44'    │ '4101.52' │
└──────────┴───────────┴───────────┘

@vird
Copy link
Author

vird commented Apr 17, 2025

And I disagree about benchmark methodology, so I write my own
https://github.com/vird/Queue/tree/fix/perf_test
Depends on run naïve and fixed are fastest
node v22.13.1

queue x 2.06 ops/sec ±0.58% (6 runs sampled)
naïve x 854 ops/sec ±0.74% (23 runs sampled)
fixed x 720 ops/sec ±1.41% (21 runs sampled)
unrolled x 293 ops/sec ±6.36% (9 runs sampled)
spare x 308 ops/sec ±0.90% (18 runs sampled)
pool x 305 ops/sec ±6.25% (13 runs sampled)
circular x 354 ops/sec ±19.03% (15 runs sampled)
nodiv x 662 ops/sec ±15.41% (14 runs sampled)

=== Summary (cpu ms | memory KB) ===
┌──────────┬───────────────┬────────────┐
│ (index)  │ cpu           │ memory     │
├──────────┼───────────────┼────────────┤
│ queue    │ '48465281.03' │ '12865.43' │
│ naïve    │ '117107.85'   │ '16304.63' │
│ fixed    │ '138920.89'   │ '14564.13' │
│ unrolled │ '341349.03'   │ '9001.40'  │
│ spare    │ '324801.46'   │ '14031.10' │
│ pool     │ '328184.27'   │ '9299.03'  │
│ circular │ '282868.64'   │ '13659.18' │
│ nodiv    │ '151062.63'   │ '5752.83'  │
└──────────┴───────────────┴────────────┘

queue x 2.04 ops/sec ±3.39% (6 runs sampled)
naïve x 672 ops/sec ±1.29% (23 runs sampled)
fixed x 741 ops/sec ±1.60% (21 runs sampled)
unrolled x 295 ops/sec ±7.16% (8 runs sampled)
spare x 296 ops/sec ±3.11% (17 runs sampled)
pool x 281 ops/sec ±2.79% (13 runs sampled)
circular x 286 ops/sec ±7.93% (14 runs sampled)
nodiv x 309 ops/sec ±2.74% (14 runs sampled)

=== Summary (cpu ms | memory KB) ===
┌──────────┬───────────────┬────────────┐
│ (index)  │ cpu           │ memory     │
├──────────┼───────────────┼────────────┤
│ queue    │ '48977106.12' │ '12865.09' │
│ naïve    │ '148710.27'   │ '9423.12'  │
│ fixed    │ '134943.70'   │ '14673.28' │
│ unrolled │ '338981.28'   │ '5066.77'  │
│ spare    │ '338388.98'   │ '5777.31'  │
│ pool     │ '355819.44'   │ '5272.06'  │
│ circular │ '349410.83'   │ '5347.34'  │
│ nodiv    │ '324148.15'   │ '5728.86'  │
└──────────┴───────────────┴────────────┘

@vird
Copy link
Author

vird commented Apr 17, 2025

https://github.com/vird/Queue/tree/feat/inlined

node v22.13.1
legacy test

┌──────────┬───────────┬───────────┐
│ (index)  │ cpu       │ memory    │
├──────────┼───────────┼───────────┤
│ queue    │ '6278.34' │ '4431.48' │
│ naïve    │ '14.72'   │ '7503.29' │
│ fixed    │ '6.40'    │ '6040.44' │
│ unrolled │ '6.76'    │ '6043.45' │
│ spare    │ '6.34'    │ '6053.88' │
│ pool     │ '6.89'    │ '6045.53' │
│ circular │ '6.81'    │ '6048.52' │
│ nodiv    │ '6.12'    │ '5510.28' │
│ inlined  │ '3.58'    │ '4087.48' │
└──────────┴───────────┴───────────┘

new test

queue x 2.10 ops/sec ±1.24% (6 runs sampled)
naïve x 862 ops/sec ±0.27% (23 runs sampled)
fixed x 729 ops/sec ±1.15% (21 runs sampled)
unrolled x 316 ops/sec ±12.30% (9 runs sampled)
spare x 319 ops/sec ±2.10% (17 runs sampled)
pool x 303 ops/sec ±6.34% (13 runs sampled)
circular x 282 ops/sec ±8.05% (14 runs sampled)
nodiv x 307 ops/sec ±3.54% (13 runs sampled)
inlined x 653 ops/sec ±0.66% (22 runs sampled)

=== Summary (cpu ms | memory KB) ===
┌──────────┬───────────────┬────────────┐
│ (index)  │ cpu           │ memory     │
├──────────┼───────────────┼────────────┤
│ queue    │ '47585987.25' │ '12865.28' │
│ naïve    │ '116072.62'   │ '9272.27'  │
│ fixed    │ '137182.61'   │ '14556.46' │
│ unrolled │ '316107.61'   │ '9001.09'  │
│ spare    │ '313103.08'   │ '10003.45' │
│ pool     │ '330116.06'   │ '13326.30' │
│ circular │ '355236.53'   │ '9375.36'  │
│ nodiv    │ '325385.82'   │ '13308.14' │
│ inlined  │ '153253.46'   │ '10866.02' │
└──────────┴───────────────┴────────────┘

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants