
[red-knot] Use itertools to clean up SymbolState::merge #15702

Open · wants to merge 2 commits into base: main
Conversation

dcreager (Member)

merge_join_by handles the "merge two sorted iterators" bit, and izip handles iterating through the bindings/definitions along with their associated constraints.
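To illustrate the pattern (not the actual red-knot code): itertools' `merge_join_by` walks two sorted iterators in lockstep and classifies each element as left-only, right-only, or present in both. Since itertools is an external crate, here is a minimal std-only sketch of that behavior, with hypothetical names:

```rust
use std::cmp::Ordering;

// Sketch of itertools' merge_join_by output shape: each merged element is
// tagged by which input(s) it came from.
#[derive(Debug, PartialEq)]
enum EitherOrBoth<T> {
    Left(T),
    Right(T),
    Both(T, T),
}

// Hand-rolled merge of two sorted slices, mimicking merge_join_by with a
// three-way comparison. Hypothetical helper, not a red-knot API.
fn merge_join_sorted(left: &[u32], right: &[u32]) -> Vec<EitherOrBoth<u32>> {
    let mut out = Vec::new();
    let (mut i, mut j) = (0, 0);
    while i < left.len() && j < right.len() {
        match left[i].cmp(&right[j]) {
            Ordering::Less => {
                out.push(EitherOrBoth::Left(left[i]));
                i += 1;
            }
            Ordering::Greater => {
                out.push(EitherOrBoth::Right(right[j]));
                j += 1;
            }
            Ordering::Equal => {
                out.push(EitherOrBoth::Both(left[i], right[j]));
                i += 1;
                j += 1;
            }
        }
    }
    // Drain whatever remains of the longer input.
    out.extend(left[i..].iter().map(|&x| EitherOrBoth::Left(x)));
    out.extend(right[j..].iter().map(|&x| EitherOrBoth::Right(x)));
    out
}

fn main() {
    let merged = merge_join_sorted(&[1, 3, 4], &[2, 3, 5]);
    println!("{merged:?}");
    // [Left(1), Right(2), Both(3, 3), Left(4), Right(5)]
}
```

With the real crate, `left.iter().merge_join_by(right.iter(), |a, b| a.cmp(b))` yields the same `EitherOrBoth` classification lazily; `izip!` then pairs each merged definition with its associated constraints.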

@dcreager dcreager added internal An internal refactor or improvement red-knot Multi-file analysis & type inference labels Jan 23, 2025
carljm (Contributor) commented Jan 23, 2025

I think you need a @MichaReiser review on this PR :) Both because he knows Rust a lot better than I do, and because I'm aware he has an aversion to itertools: some of its iterators silently allocate, and arguably that's bad because allocations should be more visible than that.

carljm (Contributor) commented Jan 23, 2025

Thanks for this!

This definitely looks simpler and easier to understand; having the "merge-join-by" logic extracted as a utility nicely separates it from the red-knot-specific logic. That said, I'm not sure how much to weigh readability here, since this is core infra that shouldn't have to change often; eking out the best performance we can might be the higher priority.

On that note, CodSpeed suggests that this is a 2% regression on "cold" check (which is where I'd expect semantic indexing cost to be relevant, as opposed to "incremental" check where Salsa validation costs usually dominate): https://codspeed.io/astral-sh/ruff/branches/dcreager%2Fmerge-cleanup

I'm not sure where that regression would be coming from, exactly; CodSpeed flame graph suggests it is all coming from a salsa ingredient get_or_create method, which doesn't make a lot of sense, so maybe it's just noise? It does seem like there would be some loss of efficiency here in that we can no longer reuse one of the declarations bitsets (via union-in-place), but instead we start from zero and add all declarations? That maybe isn't likely to make much difference?

Like I said above, I'd like to hear what @MichaReiser thinks.

carljm (Contributor) commented Jan 23, 2025

Might be worth running that cold benchmark a few times locally with and without this change, just to see if we can get a sense of how stable vs noisy that regression is?

dcreager (Member, Author)

> Might be worth running that cold benchmark a few times locally with and without this change, just to see if we can get a sense of how stable vs noisy that regression is?

Can do

> we can no longer reuse one of the declarations bitsets (via union-in-place)

I can also try a version that adds back the union in-place. The union itself was OR-ing block by block, instead of bit by bit, and also had a reserve step to make sure there was at most 1 (re)allocation. Interestingly, we were using union for SymbolDeclarations, but not for SymbolBindings!
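To make the efficiency point concrete, here is a hedged sketch of what a block-wise union-in-place looks like. The type and function names are hypothetical, not the actual red-knot bitset implementation; it shows the two properties mentioned above: OR-ing whole `u64` blocks rather than individual bits, and reserving up front so there is at most one (re)allocation.

```rust
// Hypothetical in-place bitset union over u64 blocks (a sketch, not the
// red-knot SymbolDeclarations code).
fn union_in_place(dst: &mut Vec<u64>, src: &[u64]) {
    // Reserve once up front so the extend below never triggers a second
    // allocation: at most one (re)allocation total.
    if src.len() > dst.len() {
        dst.reserve(src.len() - dst.len());
    }
    // OR the overlapping blocks, 64 bits at a time.
    for (d, s) in dst.iter_mut().zip(src) {
        *d |= *s;
    }
    // Append any trailing blocks that only exist in `src`.
    if src.len() > dst.len() {
        dst.extend_from_slice(&src[dst.len()..]);
    }
}

fn main() {
    let mut a = vec![0b0101];
    union_in_place(&mut a, &[0b0011, 0b1000]);
    println!("{a:?}"); // [7, 8]
}
```

Starting a fresh bitset from zero instead means re-adding every declaration bit and potentially allocating a new buffer each merge, which is the efficiency loss being discussed.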
