-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature to stop accepting new tasks on existing threads #42709
Feature to stop accepting new tasks on existing threads #42709
Conversation
When running in multi-threaded mode, avoid picking up tasks from shared heaps when you're thread 1 (tid == 0).
When single-threaded, enq_work should place the work in the thread tid=1 workqueue rather than randomly trying to avoid first thread.
Threads will need to call jl_stop_accepting_new_tasks_on_this_thread method in order to stop taking new work from the shared queues. This should be then used by our own code to make it so on the specific thread that runs rai-server and schedules all the background tasks.
It'd be nice to have a more compositional API. If a library function internally uses Also, this API cannot be used in a library since there's no way to know that other libraries are calling this function. So, I think it makes more sense to allocate a dedicated thread pool on startup so that the latency-critical part of the application can use it. It'll remove the cost of this feature when not used. IMHO, I think API like this needs to address the problem of priority inversion as I mentioned in #41586 (comment) As an aside, I think it's straightforward to avoid a lock if we don't allow making the primary thread to be non-accepting. Also, it seems to me that the current implementation has a data race. Since the flag is read outside the lock, you'd need to use atomic store/load on the flag, even if it's protected by the lock. |
I like the general direction, though need to review in more detail later. I think you can add this flag to the ptls state struct in julia_threads.h. |
Yeah, this is basically intended for setup-once situations where operators of a long-running server will mark certain threads as "high priority" and schedule the latency-sensitive work on them. It should not really be used by libraries because that could cause all sorts of problems. This is somewhat analogous to having two thread-pools, but less integrated with the built-in scheduler. We do actually need this kind of behavior and have been running our server with this patch for quite some time now and it did help with a lot of responsiveness issues we've been experiencing prior to this. Ultimately, in a complex codebase where some of the code is user-supplied or generated makes proper cooperation between tasks infeasible (not to mention that there don't seem to be any tools for identifying tasks that are not cooperative, making debugging of these problems intractable). We might achieve this behavior with threadpools but it seems that the conversation in the PR that implements that feature is hitting some ideological design disagreements so I'm not sure what will come out of it. I think that the architecturally cleanest solution here would be to have an API that would allow us to spawn a task in a dedicated thread that will only exist while the task is running and will be shut down when the task completes. This way we could spin off lightweight threads running critical work and ensure that they won't be blocked by bulk workloads (those not latency sensitive but potentially large in volume and potentially not yielding). This does imply that we might oversubscribe the physical machine, but this is an acceptable tradeoff and we may still deal with that by scaling down the regular thread count for the bulk workloads. I have not really thought much about what it would take to implement this thread-dedicated-for-a-task and whether it would integrate cleanly with the rest of the system. In particular, it seems like dynamically adding and removing threads into the system may be quite a challenge.
I'm not sure whether the assumption that all transitive tasks need to stay on the dedicated thread. Our critical tasks are generally pretty simple and they don't do much concurrency so we are probably assuming that this is best-effort and that if something calls As a counter-example, we run http server on the high priority thread, ensuring that we can always accept incoming http connections. For long-running user requests, we call
|
1a719fb
to
94d6407
Compare
Why the close? |
This PR is against the |
Ah, it needs to be re-created against master then, since it cannot be reopened. |
I just created a dummy |
Thanks for managing that, @tkf! |
(I just realized that my hack made it impossible to review the patch. The diff is easy to see at https://github.com/JuliaLang/julia/compare/1b93d53fc4bb59350ada898038ed4de2994cce33..e5cce258ea722884cb14d018213cef60da48b79c)
This is a pretty good way to summarize my complaints :) But, from the practical point of view, I think this PR might be better than #42302 as it's very (and surprisingly) minimal. Maybe that's why Jameson liked the approach.
Booting up Julia's runtime environment (ptls) for a thread does not seem like a "lightweight" thing to do (at least intuitively). I'm also not entirely sure if it's easy to tear it down in the middle of the process lifetime. I think it would be "easier" to reuse the thread pool as I just sketched in #43649.
It's easy to set up a To put it in another way, I think we better stick with the "black box rule" (i.e., it's an implementation detail if a function spawns tasks or not) to ensure composability, even with the tasks on dedicated worker threads. If every function call is a potential cause of priority inversion, I think it'd be very hard to use library functions. |
This still needs to be rebased against master, and the flag moved into ptls |
I don't think we have any intention of merging this PR as-is; i think it was opened mainly for discussion purposes, and as an FYI that we are currently applying this patch in our julia builds in production. As long as the problem is addressed in some way (e.g. by the ongoing thread pools work), i don't think we'd need this PR anymore. I'm going to mark it DO-NOT-MERGE for clarity. |
Threadpools is a slight generalization of this idea (allowing multiple threads to be prevented from examining the queue), but I think this is still something we can do in the meantime, and likely will be able to continue to support long-term. Other than finishing some nits, I think we can merge (and backport) |
Hmm... Isn't #42302 close to ready? (I should probably look at the implementation a bit more, though.) I think expanding API surface of the scheduler makes it hard to extend and enable a more flexible approach like #44064. If we are planning to merge #42302, I think it's better to only add #42302 and not in combination with this PR. Or, can we at least put it in Experimental, so that it can be eventually removed once #42302 is in? |
This feature allows operators to stop scheduling of new tasks on specific threads which will make those threads high priority and only handle either explicitly scheduled tasks or tasks that are already running on those high priority threads.
This feature can be used to guarantee responsiveness on certain critical low-latency tasks by isolating them from the bulk of the julia workload that will run on other threads. We have observed that this eliminates most of the scheduling issues and slowdowns due to tasks occupying threads for excessive amounts of time (this is particularly problematic with sticky tasks in 1.6 where single long-running task that gets scheduled on the same thread as critical periodic task can cause major delays).
That said, low-latency is still subject to occasional hiccup as a result of stop-the-world behaviors that happen during garbage collection and compilation.