OutOfMemoryError reported by jenkins validation build after org.eclipse.ui.tests.UiTestSuite #2432
Maybe this could help understanding OOM errors on jenkins. See eclipse-platform#2432
Maybe this is related: https://gitlab.eclipse.org/eclipsefdn/helpdesk/-/issues/5056
If I look at this where we exhaust thread pools: I just wanted to mention that each thread requires a small amount of RAM as well...
I would expect Jenkins runs on a very small VM with very few (2-4?) cores, so the thread pool size would also be very small and shouldn't be the root cause here.
Hmm. I don't see OOMs on JDT tests, which are known to be memory sensitive, and OOMs in UI tests seem to happen on/after a specific test suite, so it looks like a code regression on our (platform/UI) side.
This is a common misunderstanding with ForkJoinPools... If threads are blocked (e.g. they are waiting for I/O or waiting for other threads to join), then new threads are spawned until the configured concurrency level is fulfilled again, see here.
The exception given here says "Thread limit exceeded replacing blocked worker" and the limit is by default 32767 (!)
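To make the "replacing blocked worker" behavior concrete, here is a minimal self-contained sketch (not from this thread; the class name, parallelism and task count are made up): workers that report themselves as blocked via a ManagedBlocker are compensated with new threads, so the pool grows far beyond its configured parallelism.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ForkJoinPool;

public class ForkJoinBlockingDemo {

    public static void main(String[] args) throws Exception {
        // Parallelism of 2, but the default maximum pool size is 32767:
        // every worker that reports itself as blocked can be "replaced".
        ForkJoinPool pool = new ForkJoinPool(2);
        CountDownLatch release = new CountDownLatch(1);

        for (int i = 0; i < 100; i++) {
            pool.submit(() -> {
                try {
                    // managedBlock() tells the pool this worker is stuck, so the
                    // pool spawns a compensation thread to keep the concurrency level.
                    ForkJoinPool.managedBlock(new ForkJoinPool.ManagedBlocker() {
                        @Override
                        public boolean block() throws InterruptedException {
                            release.await();
                            return true;
                        }

                        @Override
                        public boolean isReleasable() {
                            return release.getCount() == 0;
                        }
                    });
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        Thread.sleep(1000);
        // Far more than 2 worker threads exist now, each reserving its own
        // stack (roughly 1 MB by default) - that is the RAM cost mentioned above.
        System.out.println("Pool size: " + pool.getPoolSize());
        release.countDown();
        pool.shutdown();
    }
}
```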
Cool, thanks. So one part of the problem in eclipse-platform/eclipse.platform#1592 is that the pool used is not limited in size (besides the fact that LocalFile.internalDelete() doesn't do what it is supposed to do)?
It is just a guess, since we see a thread explosion somewhere and memory problems elsewhere. In general, I/O-bound tasks don't play well with fork-join, because fork-join was introduced for CPU-bound (!) tasks, which usually suffer from context switch penalties. For I/O, non-blocking I/O is usually much faster if we assume a shared device (e.g. hard disk / network / ...).
https://ci.eclipse.org/platform/job/eclipse.platform.ui/job/PR-2433/1/console shows no major memory increase after multiple |
Maybe this could help understanding OOM errors on jenkins. Note: moved OpenCloseTest to the beginning, just in case that causes OOM. See eclipse-platform#2432
See #2432 and eclipse-platform/eclipse.platform#1592. This change shouldn't affect any test at all; it is not a code change.
Also exit E4Testable on OOM. This is supposed to work around and help understand OOM errors on jenkins. See eclipse-platform#2432
Has anyone tried to revert eclipse-platform/eclipse.platform#1592 in a PR to see if it makes the build more successful?
As mentioned, I believe we only see OOMs in Platform UI tests so far. The change eclipse-platform/eclipse.platform#1592 would also affect JDT tests, which doesn't seem to be the case.
Maybe one of the Platform UI tests deliberately exercises scalability related to intense usage of resource model operations and leads to OOM precisely in order to expose a performance issue.
Yes
My thinking: if the PR fails with OOM itself and gives all the details for investigation, there is no need to merge it, as doing more would not help with the OOM, and the investigation can continue on the PR itself until it actually improves the situation.
Leak: for example, ImportArchiveOperationTest alone opens a lot of windows (using openTestWindow().run) without ever closing them. I guess this is a regression from 2ebbbe9 @akurtakov, because the superclass UITestCase may have closed those windows using UITestCase.closeTestWindows.
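For illustration only, a rough sketch of the cleanup pattern such a superclass can provide (WindowTrackingTestBase, track() and closeTrackedWindows() are hypothetical names, not the actual UITestCase API): every window a test opens is recorded and closed again after the test so it cannot leak.

```java
import java.util.ArrayList;
import java.util.List;

import org.eclipse.ui.IWorkbenchWindow;
import org.junit.After;

public abstract class WindowTrackingTestBase {

    private final List<IWorkbenchWindow> openedWindows = new ArrayList<>();

    // Tests wrap every window they open, e.g. track(openTestWindow()).
    protected IWorkbenchWindow track(IWorkbenchWindow window) {
        openedWindows.add(window);
        return window;
    }

    @After
    public void closeTrackedWindows() {
        for (IWorkbenchWindow window : openedWindows) {
            // Closing the window releases its shell and everything it references.
            window.close();
        }
        openedWindows.clear();
    }
}
```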
I'll fix ImportArchiveOperationTest now. Funny that I worked on this one while investigating the problem.
Tested that a revert closes those windows.
#2610 should reduce the frequency of the problem, but we have been seeing OOMs before it too, so this issue should stay open until we see no OOMs for a couple of weeks at least.
Any chance to see which test leaks what?
With the PR #2433 rebased on 78615b8 we see the following picture:
before 78615b8: OOM!
after 78615b8: No OOM!
Removing o.e.jdt.ui from the equation and reducing the allocated heap by ~700MB is very suspicious and probably points to something huge not being cleaned up there.
I guess automatic detection of JREs on startup is one of the pain points; we had to explicitly disable that job in all JDT/PDE repos where Java tooling is present.
How likely is that to be painful for a user with many JDKs?
Not really, given reasonably sized RAM or proper JVM arguments. 4GB RAM (or the 1GB max heap default == 1/4 of RAM) is surely not suitable anymore for serious work. However, with 8GB max heap you don't see any problem at all, even on large workspaces.
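For reference (not something discussed in the thread, and assuming a standard eclipse.ini layout), an 8GB max heap would be set by adding -Xmx after the -vmargs line:

```
-vmargs
-Xmx8g
```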
When running locally with the JDT plug-ins available, the test walked all the contributed JDT items in the Project Explorer, which took a lot of time and heap memory eclipse-platform#2432
I believe this can be closed now.
See https://ci.eclipse.org/platform/job/eclipse.platform.ui/job/PR-2431/1/consoleFull which didn't change code, just updated bundle versions, so the problem must be coming from either the environment or earlier code changes.
The question is: what is causing that, and which heap space is actually used on Jenkins?
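One cheap way to answer the second part (just a suggestion, not something that was done in the thread) would be to log the effective limit from inside the test JVM, e.g. in a suite setup method:

```java
public class HeapProbe {
    public static void main(String[] args) {
        // Max heap the JVM will actually use on the Jenkins node, regardless of
        // what the job configuration claims; without -Xmx this is typically 1/4
        // of the physical RAM.
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("Max heap: " + maxHeapMb + " MB");
    }
}
```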