Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

strmxor needs 40GB of memory to xor a 2500x2500 hole fence #1636

Closed
stefanottili opened this issue Feb 29, 2024 · 6 comments
Closed

strmxor needs 40GB of memory to xor a 2500x2500 hole fence #1636

stefanottili opened this issue Feb 29, 2024 · 6 comments
Labels

Comments

@stefanottili
Copy link

stefanottili commented Feb 29, 2024

While testing worst case scenario's for scanlines I noticed that strmxor uses a lot of memory when encountering "fence" data.
Input data which causes results with lot's of holes have been seen in the wild as either cad layers in ram/rom generators or as shielding for analog blocks.
This is just a "rare corner case" performance issue report, given enough memory strmxor does the right thing.

The attached fence with 2500x2500 holes requires up to 40GB of memory, even when xor'ing itself, not generating any output. One horizontal, one vertical array with 2500 row/col, referencing a path.

strmxor --debug-level=21 --threads=8 --heal --tiles=500 test_xor_fence.oas test_xor_fence.oas xor.oas

strmxor ... --layer-details test_xor_fence.oas empty.oas xor.empty.oas
creates a single polygon with a large number of edges.

xor.empty.gds breaks the polygon into smaller pieces, but klayout throws
Warning: Record length larger than 0x8000 encountered: interpreting as unsigned (position=118, record number=9, cell=XOR)
since the --max-vertex-count default is 8190, a smaller value get's rid of this warning.

It would be nice if there would be an oas equivalent to the gds specific --max-vertex-count
test_xor_fence.oas.gz
empty.oas.gz

@stefanottili
Copy link
Author

One strategy to avoid excessive memory consumption is to monitor the scanline memory requirements.
If they exceed a certain threshold, stop operation and divide the chip/tile recursively into smaller pieces til they're small enough to run.
And in case the output polygons have "too many" edges, switch off healing.

@klayoutmatthias
Copy link
Collaborator

I will try to find a solution, but a need a little time. It will not make it into 0.29.

@stefanottili
Copy link
Author

Given enough memory, klayout does the right thing, the problematic case is so unusual that just skipping xor'ing this particular layer is good enough, see #1668

But in order for that to be useful, strmxor and klayout would have to change their "foreach tile, foreach layer" processing to "foreach layer, foreach tile", so that a user get's feedback which layer is causing xor to take a lot of time/memory.

@stefanottili
Copy link
Author

The fix for #1853 did wonders for this testcase.

strmxor finished in 30isch seconds and only needed 3GB of memory at peak.

@klayoutmatthias
Copy link
Collaborator

Wow ... I honestly do not know how this is related, but I am fine with that :)

The fix is some heuristics to generate a cleaner scanline when doing the hole insertion. That shouldn't even create more vertexes. I ran a number of testcases to see the impact and got an inconsistent picture without significant improvements (+/- 5% in runtime, no visible memory effect).

I rather think that moving from std::deque to std::list does the trick. I do not really understand how std::deque is implemented in the different STL and it appears to be a second class citizen there. Maybe it is subject to memory fragmentation in some implementations and that causes the trouble here.

Anyway, a fixed bug is a good bug :) Thanks for letting me know.

Matthias

@klayoutmatthias
Copy link
Collaborator

It's going to be slightly slower again, when I address issue #1839.

In my case the runtime increases from 22s to 25s without notable memory impact for the hole fence XOR example (branch is https://github.com/KLayout/klayout/tree/bugfix/issue-1839). I hope that is an acceptable compromise.

Matthias

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants