Game of Life Optimizations #181
Uses more memory to achieve much higher framerates on large setups. Neighbor counts are stored instead of constantly recalculated. CRC is no longer used for repeat detection so false positives are no longer possible.
Use defined(ARDUINO_ARCH_ESP32); getNeighborIndexes loop changes; offsets use int8_t; change prevRows/Cols to uint16
bool allColors = SEGMENT.check1;
bool overlayBG = SEGMENT.check2;
bool wrap = SEGMENT.check3;
bool bgBlendMode = SEGMENT.custom1 > 220 && !overlayBG; // if blur is high and not overlaying, use bg blend mode
byte blur = bgBlendMode ? map2(SEGMENT.custom1 - 220, 0, 35, 255, 128) : map2(SEGMENT.custom1, 0, 255, 255, 0);
byte blur = overlayBG ? 255 : bgBlendMode ? map2(SEGMENT.custom1 - 220, 0, 35, 255, 128) : map2(SEGMENT.custom1, 0, 220, 255, 10);
this might be the source of your problem with color_blend
🤔
map is allowed to "overshoot", so when you have an input outside the range, you'll also get an out-of-range result.
this might happen if overlayBG is true and custom1 > 220.
--> map2 produces something that's not in [0...255], and casting to byte then does "result & 0xFF".
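To make the overshoot concrete, here is a minimal sketch. It assumes map2 follows the usual Arduino-style linear formula; `mapLinear` and `mapClamped` are hypothetical stand-ins written for this example, not the WLED-MM functions.

```cpp
#include <cstdint>

// Arduino-style linear map (assumed formula): no clamping, so inputs outside
// [inMin, inMax] produce outputs outside [outMin, outMax].
long mapLinear(long x, long inMin, long inMax, long outMin, long outMax) {
  return (x - inMin) * (outMax - outMin) / (inMax - inMin) + outMin;
}

// Hypothetical guarded variant: clamp the input first (assumes inMin < inMax),
// so the result always stays between outMin and outMax.
uint8_t mapClamped(long x, long inMin, long inMax, long outMin, long outMax) {
  if (x < inMin) x = inMin;
  if (x > inMax) x = inMax;
  return static_cast<uint8_t>(mapLinear(x, inMin, inMax, outMin, outMax));
}
```

With custom1 = 255 and an input range of 0..220, `mapLinear(255, 0, 220, 255, 10)` yields -28, and storing that in a byte keeps only the low 8 bits, giving 228 instead of a value near 10. The clamped variant returns 10 for the same input.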
I made a quick test to show the results using this code:
static uint32_t prevColor1, prevColor2;
static int prevBlend;
uint32_t colorTest = SEGMENT.color_from_palette(0, false, PALETTE_SOLID_WRAP, 0);
if (prevColor1 != colorTest || prevColor2 != bgColor || prevBlend != SEGMENT.custom1) {
  prevColor1 = colorTest;
  prevColor2 = bgColor;
  prevBlend = SEGMENT.custom1;
  printf("Blending C1: %d C2: %d Blend: %d\n", colorTest, bgColor, SEGMENT.custom1);
  uint32_t result = colorTest;
  for (int i = 0; i < 255; i++) {
    result = color_blend(result, prevColor2, SEGMENT.custom1);
    printf("Color %2d: %12d -> %d\n", i, result, bgColor);
    if (result == bgColor) break;
  }
}
Test blending blue into black:
Blending C1: 255 C2: 0 Blend: 128
Color 0: 126 -> 0
Color 1: 62 -> 0
Color 2: 30 -> 0
Color 3: 14 -> 0
Color 4: 6 -> 0
Color 5: 2 -> 0
Color 6: 0 -> 0
Test blending blue into white:
Blending C1: 255 C2: 16777215 Blend: 128
Color 0: 8355838 -> 16777215
Color 1: 12500733 -> 16777215
Color 2: 14540285 -> 16777215
Color 3: 15592957 -> 16777215
Color 4: 16119293 -> 16777215
Color 5: 16382461 -> 16777215
Color 6: 16514045 -> 16777215
Color 7: 16579837 -> 16777215
Color 8: 16579837 -> 16777215
Color 9: 16579837 -> 16777215
Color 10: 16579837 -> 16777215
Into white it gets stuck on 16579837 and never fully blends into white. My fix was to check whether color1 changed after blending; if it didn't, just snap directly to color2.
@Brandon502 indeed this looks unexpected.
I'll cross-check in the next days, maybe you found something.
Thanks 👍
@Brandon502 looks like our color_blend - and the same code from upstream - has major rounding errors.
We seem to use an old version of the function that was taken from FastLED
https://github.com/FastLED/FastLED/blob/6d913bccd5a2cfd15a87115f26dd2ecdd7cab92f/src/colorutils.cpp#L287
"Old blend method which unfortunately had some rounding errors"
FastLED "blend8" was corrected in the meantime
https://github.com/FastLED/FastLED/blob/6d913bccd5a2cfd15a87115f26dd2ecdd7cab92f/src/lib8tion/math8.h#L592-L596
So one option would be to exchange the homebrew code in color_blend with r3 = blend8(r1, r2, blend);
(same for g,b,w).
I'll check tomorrow if using the blend8() function has any negative impact on performance.
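For illustration, a rounding-aware 8-bit channel blend in the spirit of the corrected blend8 could look like the sketch below. `blend8Sketch` and `blendOldSketch` are hypothetical names, and the arithmetic is one reading of the idea (treat `a` as an 8.8 fixed-point value and add the scaled difference), not a verbatim copy of the FastLED or WLED-MM source. The linked FastLED file remains the reference.

```cpp
#include <cstdint>

// Rounding-aware blend sketch: start from a*256 + b, then add
// amountOfB times the signed difference (b - a), and keep the high byte.
// This equals a*(256 - amountOfB) + b*(1 + amountOfB), which fits in 16 bits.
uint8_t blend8Sketch(uint8_t a, uint8_t b, uint8_t amountOfB) {
  int32_t partial = (static_cast<int32_t>(a) << 8) + b;
  partial += (static_cast<int32_t>(b) - a) * amountOfB;
  return static_cast<uint8_t>(partial >> 8);
}

// Old-style truncating blend for comparison (assumed form).
uint8_t blendOldSketch(uint8_t a, uint8_t b, uint8_t amountOfB) {
  return static_cast<uint8_t>((b * amountOfB + a * (255 - amountOfB)) >> 8);
}
```

For a = 252, b = 255 and amount 128, the truncating form returns 252 again, whereas the sketch returns 254. Whether a production color_blend built this way reaches color2 in every case still has to be tested against the real function.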
@Brandon502 I have updated color_blend, basically copying the logic from FastLED blend8().
I've done some unit tests to confirm the new function is more accurate, and better than the old one for 8bit blends.
Could you repeat your own tests, and also check if the workaround in GOL is still needed?
(upstream still has the buggy blend function, so maybe just comment out your workaround if not needed any more in MM)
@softhack007 The new blend function does seem more accurate, but depending on the colors it still doesn't always converge to color2. So the fix is still needed.
Removed future status/neighbors. Uses 2 loops to set cells. Shifting from future to current no longer needed.
@Brandon502 looks good - tested on 128x64 and the effect achieves 80-120 fps depending on the number of active cells. Tested OK; neither clang-tidy nor cppcheck had any major comments on code quality. Just tell me once you're happy with the changes, and I'll merge into mdev prior to our upcoming release.
Use superDead correctly with bgBlendMode
Yeah, 1 byte gives pretty similar performance after optimizing it, so I agree it should be used. Just pushed one last change speeding up games that use bgBlendMode. The only current bug is that all colors are lost when reverse or transpose (square grids) is toggled. Not a huge deal, but depending on the background color it can blank the display until a new game starts.
Hi @Brandon502, @troyhacks has seen delays that bring his new Art-Net driver out of sync when a new grid is generated. It looks like esp_random() is causing the slowdown - replacing it with the line for 8266 helps. A few other options that come to my mind
Yeah, it's mostly just delays in how long it takes to init the cells at startup. This also seems reasonably fast:
Currently the pauses are enough to cause my external Art-Net controllers to lose sync for a moment... and there are also big pauses before it starts up in general: Whatsapp.Video.2024-11-18.At.8.03.11.Am.mp4
@softhack007 @troyhacks I tested out all your options and they all seemed better than just using random16(). I put the two most similar to the original below, but any of your options are fine with me. The shifting method is neat, but when I was testing it, the alive chance sometimes seemed to shoot up to ~95%, lighting up most of the grid, which then dies instantly.
Add entropy:
Use rand():
@Brandon502 thanks for your quick response :-) Actually the "Add entropy" solution seems better to me since it does not only rely on software pseudo randomness. @troyhacks would be good to know if both solutions keep your art-net hardware in sync?
Both were equally fast to my eye, no issues with startup delay or Art-Net output. Pushed the one @softhack007 picked via #189
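As a generic illustration of the "add entropy" idea (not the code posted in this thread), the sketch below reads a hardware-derived seed once per new game and then fills the grid from a cheap software PRNG, so the per-cell cost stays small. `XorShift32`, `seedCells`, and the `hardwareSeed` parameter are all hypothetical names for this example; on an ESP32 the seed could come from a single hardware RNG read.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Small xorshift32 PRNG (a stand-in, not the generator used in the PR).
struct XorShift32 {
  uint32_t state;
  // xorshift must not start from 0, so substitute a fixed nonzero constant.
  explicit XorShift32(uint32_t seed) : state(seed ? seed : 0x9E3779B9u) {}
  uint32_t next() {
    state ^= state << 13;
    state ^= state >> 17;
    state ^= state << 5;
    return state;
  }
};

// Fill a cols x rows grid; each cell is alive with roughly alivePercent % chance.
// hardwareSeed is assumed to be read once from a hardware RNG per new game.
std::vector<uint8_t> seedCells(int cols, int rows, int alivePercent, uint32_t hardwareSeed) {
  XorShift32 rng(hardwareSeed);
  std::vector<uint8_t> cells(static_cast<std::size_t>(cols) * rows);
  for (auto& c : cells) {
    c = (rng.next() % 100u) < static_cast<uint32_t>(alivePercent) ? 1 : 0;
  }
  return cells;
}
```

The design choice here is one slow entropy read per game instead of one per cell. Whether that matches the timing needs of the Art-Net output would have to be measured on the target hardware.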
Game of Life was struggling on large setups, so I changed the algorithm to get much faster updates at the cost of memory: 2 bytes per cell instead of 2 bits. Previously, on my 128x64 panel I could get around 35 fps at most, and usually much lower depending on how many cells were alive. Now it can exceed 100 fps, averaging around 80 fps with blur on and 100+ without blur. @troyhacks also tested a previous version of this and saw significant improvements on his 192x96 display.
CRC16 is no longer used and can maybe be removed. On large setups CRC had quite a few false positive triggers, so I stored previous status in the new struct since I had plenty of extra bits free.
color_blend doesn't always blend color1 completely to color2, not sure if this is intended, but I used this to fix it, since I needed it for speed improvements.
To set the initial cells alive I noticed random16() seemed to produce long vertical lines pretty frequently. I switched to esp_random() for esp32s and it seems much better. This change can be reverted if needed.
If mirror or transpose is toggled, a new game starts. The Cell struct stores neighbor counts and whether the cell is an edge cell. Mirror/transpose break these values; you could either add more code to recalculate them or just reset to a new game. I chose the latter.
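The stored-neighbor-count approach can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: the `Cell` bit layout, `Grid`, `flipCell`, and `step` are hypothetical, the grid always wraps, and it must be at least 3x3.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical one-byte cell; field names and widths are assumptions.
struct Cell {
  uint8_t alive     : 1;  // current state
  uint8_t toggle    : 1;  // set in pass 1 when the state must flip
  uint8_t neighbors : 4;  // cached count of live neighbors (0..8)
};

struct Grid {
  int cols, rows;
  std::vector<Cell> cells;
  Grid(int c, int r) : cols(c), rows(r), cells(static_cast<std::size_t>(c) * r, Cell{0, 0, 0}) {}
  Cell& at(int x, int y) { return cells[static_cast<std::size_t>(y) * cols + x]; }
};

// Flip one cell and add +1 or -1 to the cached count of its 8 wrapped neighbors.
void flipCell(Grid& g, int x, int y) {
  Cell& c = g.at(x, y);
  c.alive ^= 1;
  int delta = c.alive ? 1 : -1;
  for (int dy = -1; dy <= 1; dy++) {
    for (int dx = -1; dx <= 1; dx++) {
      if (dx == 0 && dy == 0) continue;
      Cell& n = g.at((x + dx + g.cols) % g.cols, (y + dy + g.rows) % g.rows);
      n.neighbors = static_cast<uint8_t>(n.neighbors + delta);
    }
  }
}

// One generation in two loops: decide from cached counts, then apply the flips.
void step(Grid& g) {
  for (Cell& c : g.cells) {
    bool next = c.alive ? (c.neighbors == 2 || c.neighbors == 3) : (c.neighbors == 3);
    c.toggle = (next != static_cast<bool>(c.alive));
  }
  for (int y = 0; y < g.rows; y++) {
    for (int x = 0; x < g.cols; x++) {
      if (g.at(x, y).toggle) {
        g.at(x, y).toggle = 0;
        flipCell(g, x, y);
      }
    }
  }
}
```

Only cells that change state touch their neighbors, so a mostly static grid costs little per frame. The cached counts are tied to the grid geometry, which is why remapping the grid (mirror or transpose) would require either a recount or a fresh game.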
Few misc bug fixes. All features work the same or better than before now.
Comparison using esp32_4MB_V4_S:
20241109_162450_1.mp4