feat: Use fair-spill pool when spark.memory.offHeap.enabled=false
#1004
Conversation
I tried running benchmarks with this PR but ran into:
Perhaps we need to merge #988 first.
@Kontinuation Is this the general approach you were suggesting?
I was able to get benchmarks running by allocating more memory to Comet.
spark.executor.offHeap.enabled=false
spark.memory.offHeap.enabled=false
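For context on the difference the PR title refers to, here is a minimal sketch against DataFusion's memory_pool API (the budget size and consumer names are illustrative, not Comet's actual wiring): a GreedyMemoryPool hands out memory first-come-first-served, so one consumer can take the whole budget, while a FairSpillPool divides the budget among spilling consumers.

```rust
use std::sync::Arc;

use datafusion::execution::memory_pool::{
    FairSpillPool, GreedyMemoryPool, MemoryConsumer, MemoryPool,
};

fn main() {
    // Illustrative budget; real sizes come from Comet's memory configuration.
    let budget = 100 * 1024 * 1024;

    // Greedy: reservations are first-come-first-served, so the first sort can
    // take the entire budget and the second gets nothing.
    let greedy: Arc<dyn MemoryPool> = Arc::new(GreedyMemoryPool::new(budget));
    let mut sort_a = MemoryConsumer::new("sort-a").with_can_spill(true).register(&greedy);
    let mut sort_b = MemoryConsumer::new("sort-b").with_can_spill(true).register(&greedy);
    sort_a.try_grow(budget).unwrap();
    assert!(sort_b.try_grow(1024 * 1024).is_err());

    // Fair spill: the budget is split across spilling consumers, so neither
    // sort can starve the other; each is pushed to spill at its share instead.
    let fair: Arc<dyn MemoryPool> = Arc::new(FairSpillPool::new(budget));
    let mut sort_c = MemoryConsumer::new("sort-c").with_can_spill(true).register(&fair);
    let mut sort_d = MemoryConsumer::new("sort-d").with_can_spill(true).register(&fair);
    assert!(sort_c.try_grow(budget).is_err()); // capped at roughly half the budget
    sort_c.try_grow(budget / 4).unwrap();      // well within its fair share
    sort_d.try_grow(budget / 4).unwrap();      // the second sort still makes progress
}
```

This is why a fair-spill pool behaves better for sort-merge join, where several sort operators reserve memory from the same pool concurrently.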
This is better than using a greedy memory pool. It makes spillable operators work correctly under memory pressure, especially when running sort-merge join, where multiple sort operators compete for resources. There are still some unresolved issues: each task may create multiple native plans, and we still do not make them share the same memory pool.

I'd like to share the experiments I've done to better sync with you on this topic. I did some experiments on my branch to try out various ways of using memory pools, selected by a configuration spark.comet.exec.memoryPool.

spark.comet.exec.memoryPool = greedy

This is the current mode when using native memory management. It could only run with
spark.comet.exec.memoryPool = fair_spill

The same approach as this PR: simply use a FairSpillPool as the per-plan memory pool. It could run with

But each task may create more than one native plan, and each plan has its own memory pool (a sketch of the consequence follows below). Here is an example of a task creating 2 native plans, with the 2 plans running concurrently.

Plan 1:

Plan 2:
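A minimal sketch of that effect, assuming DataFusion's memory_pool API (pool sizes and consumer names are illustrative): because each plan's FairSpillPool only sees that plan's consumers, two plans running concurrently in one task can together reserve well over a single per-plan budget.

```rust
use std::sync::Arc;

use datafusion::execution::memory_pool::{FairSpillPool, MemoryConsumer, MemoryPool};

fn main() {
    // Illustrative per-plan limit; real sizes come from Comet's memory settings.
    let per_plan_budget = 64 * 1024 * 1024;

    // With the fair_spill mode, every native plan gets its own pool.
    let plan1_pool: Arc<dyn MemoryPool> = Arc::new(FairSpillPool::new(per_plan_budget));
    let plan2_pool: Arc<dyn MemoryPool> = Arc::new(FairSpillPool::new(per_plan_budget));

    // Two plans in the same task (e.g. a sort-merge-join plan and a shuffle-write
    // plan) reserve memory from different pools, so they never see each other.
    let mut smj = MemoryConsumer::new("sort-merge-join")
        .with_can_spill(true)
        .register(&plan1_pool);
    let mut shuffle = MemoryConsumer::new("shuffle-write")
        .with_can_spill(true)
        .register(&plan2_pool);

    smj.try_grow(per_plan_budget * 3 / 4).unwrap();
    shuffle.try_grow(per_plan_budget * 3 / 4).unwrap();

    // The task as a whole now holds more native memory than one per-plan budget.
    assert!(plan1_pool.reserved() + plan2_pool.reserved() > per_plan_budget);
}
```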
spark.comet.exec.memoryPool = fair_spill_shared

This approach allocates a FairSpillPool shared by all plans in the same task. For the above example, the sort-merge-join plan and the shuffle-write plan in the same task share the same memory pool. This strictly follows the conceptual model that Comet won't exceed the

I've added 2 additional JNI interfaces for creating a memory pool at the beginning of each task and releasing it at the end of each task. Actually this is not necessary: we can create and track the usage of the per-task memory pool in native code; all it needs is the task attempt id at native plan creation time (see the sketch after this comment).

spark.comet.exec.memoryPool = fair_spill_global

This approach uses a singleton FairSpillPool for all tasks in the same executor instance. I thought it would be the optimal approach, but in practice it does not work well. It could only run with

spark.comet.exec.memoryPool = greedy_global

This approach uses a singleton GreedyMemoryPool for all tasks in the same executor instance. As expected, it does not work well. It could only run with

So the conclusion is that
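The per-task sharing described above can be sketched as follows, assuming DataFusion's FairSpillPool; the registry, the function names (pool_for_task, release_task_pool), and the per-task budget are hypothetical illustrations, not Comet's actual code. It shows why the extra JNI interfaces are not strictly needed: the task attempt id at plan creation time is enough to create the pool lazily and look it up for later plans in the same task.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

use datafusion::execution::memory_pool::{FairSpillPool, MemoryPool};

// Executor-wide registry of per-task pools, keyed by the Spark task attempt id.
static TASK_POOLS: OnceLock<Mutex<HashMap<i64, Arc<dyn MemoryPool>>>> = OnceLock::new();

fn task_pools() -> &'static Mutex<HashMap<i64, Arc<dyn MemoryPool>>> {
    TASK_POOLS.get_or_init(|| Mutex::new(HashMap::new()))
}

/// Called at native plan creation time: the first plan of a task creates the
/// pool, later plans of the same task attempt reuse it.
fn pool_for_task(task_attempt_id: i64, per_task_budget: usize) -> Arc<dyn MemoryPool> {
    let mut pools = task_pools().lock().unwrap();
    if let Some(pool) = pools.get(&task_attempt_id) {
        return Arc::clone(pool);
    }
    let pool: Arc<dyn MemoryPool> = Arc::new(FairSpillPool::new(per_task_budget));
    pools.insert(task_attempt_id, Arc::clone(&pool));
    pool
}

/// Called when the task finishes, so the registry does not grow without bound.
fn release_task_pool(task_attempt_id: i64) {
    task_pools().lock().unwrap().remove(&task_attempt_id);
}

fn main() {
    // Two plans created for task attempt 42 share one pool; attempt 43 gets its own.
    let p1 = pool_for_task(42, 64 * 1024 * 1024);
    let p2 = pool_for_task(42, 64 * 1024 * 1024);
    let p3 = pool_for_task(43, 64 * 1024 * 1024);
    assert!(Arc::ptr_eq(&p1, &p2));
    assert!(!Arc::ptr_eq(&p1, &p3));

    release_task_pool(42);
    release_task_pool(43);
}
```

Dropping the pool at task end (or doing so via the explicit JNI begin/end calls mentioned above) keeps the executor-wide registry bounded.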
Thanks for the detailed feedback @Kontinuation. I plan to resume work on this today/tomorrow.
@Kontinuation Do you want to create a PR from your branch? I like the idea of having some different configurable options while we are experimenting with this.
Sure. I'll clean up the code and submit the PR tomorrow. |
Closing in favor of #1021 |
Which issue does this PR close?
Closes #996
Possibly depends on #988 being resolved first
Rationale for this change
This is an alternative to the approach in #1002
What changes are included in this PR?
How are these changes tested?