Query Stats framework #2210
Conversation
Label error. Requires exactly 1 of: patch, minor, major. Found: enhancement
def __sub__(self, other):
    return self._populate_stats(other._create_time)

def _populate_stats(self, other_time):
Boilerplate code to beautify the output for now
Pending changes and improvement in later PRs
Nice job figuring out how to pass this stuff through Folly. Can I suggest merging a PR to start with that just introduces the custom Folly executors (with a suite of tests to show that the stats calculation works with both our task-based APIs and normal `folly::via`), and then we can figure out the other discussions after.
It would have been helpful if your PR description had explained your design.
// The first overload will call the second one in folly. Have to override both, as they are overloads of each other.
// Called by the submitter when a task is submitted to an executor
void add(folly::Func func) override {
I don't quite follow why we need this kind of no-op override?
It's C++ name hiding: to override a parent function that is overloaded, you need to override all of its overloads, otherwise the non-overridden ones are hidden.
@@ -174,11 +175,73 @@ inline auto get_default_num_cpus([[maybe_unused]] const std::string& cgroup_fold
 * 3/ Priority: How to assign priorities to task in order to treat the most pressing first.
 * 4/ Throttling: (similar to priority) how to absorb work spikes and apply memory backpressure
 */

class CustomIOThreadPoolExecutor : public folly::IOThreadPoolExecutor {
This seems like a use-case for the CRTP rather than one copy of the code for IO and one for CPU.
The name is a bit weird: `CustomIOThreadPoolExecutor` could apply to any subclass of `IOThreadPoolExecutor` regardless of its purpose. `StatsContextIOThreadPoolExecutor`?
Good point that the 4 classes can be merged into 2. Updated.
Though CRTP may not be necessary to get the job done?
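The merged design could look roughly like the following sketch (hypothetical names; stub structs stand in for the folly executor bases, so this is not the PR's actual code): a single class template parameterised on its executor base, the same mixin pattern `folly::FutureExecutor` itself uses, removes the IO/CPU duplication without needing full CRTP.

```cpp
#include <string>

// Stubs standing in for folly::IOThreadPoolExecutor / folly::CPUThreadPoolExecutor.
struct StubIOExecutor  { std::string kind() const { return "io"; } };
struct StubCPUExecutor { std::string kind() const { return "cpu"; } };

// One copy of the stats-wrapping logic, parameterised on the executor base,
// instead of one hand-written subclass for IO and another for CPU.
template <typename ExecutorBase>
class StatsContextExecutor : public ExecutorBase {
public:
    using ExecutorBase::ExecutorBase;  // forward constructors, like the Args&&... version
    // In the real code this would wrap folly::Func with the query-stats
    // context before delegating to ExecutorBase::add.
    std::string add_with_stats() { return "stats+" + this->kind(); }
};

using StatsIOExecutor  = StatsContextExecutor<StubIOExecutor>;
using StatsCPUExecutor = StatsContextExecutor<StubCPUExecutor>;
```

Two aliases replace four classes; the wrapping logic lives in one place.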
class IOSchedulerType : public folly::FutureExecutor<CustomIOThreadPoolExecutor> {
public:
    template<typename... Args>
    IOSchedulerType(Args&&... args) : folly::FutureExecutor<CustomIOThreadPoolExecutor>(std::forward<Args>(args)...) {}
CRTP for this too I think
@@ -194,17 +257,27 @@ class TaskScheduler {
 auto task = std::forward<decltype(t)>(t);
 static_assert(std::is_base_of_v<BaseTask, std::decay_t<Task>>, "Only supports Task derived from BaseTask");
 ARCTICDB_DEBUG(log::schedule(), "{} Submitting CPU task {}: {} of {}", uintptr_t(this), typeid(task).name(), cpu_exec_.getTaskQueueSize(), cpu_exec_.kDefaultMaxQueueSize);
 // Executor::Add will be called before below function
Why do we need this here too? Don't your custom executors handle this for us regardless of whether futures are scheduled with normal Folly APIs or our own task-based wrappers?
Here `copy_instance` is needed, as the instance needs to be copied for each worker from the caller. The one in the executor calls `pass_instance`, as the instance just needs to be passed along the pipeline.
@@ -194,17 +257,27 @@ class TaskScheduler {
 // Executor::Add will be called before below function
 auto task_with_stat_query_wrap = [parent_instance = util::stats_query::StatsInstance::instance(), task = std::move(task)]() mutable {
I don't understand what `task_with_stat_query_wrap` is supposed to mean.
cpp/arcticdb/util/stats_query.hpp
Outdated
}

#define GROUPABLE_STAT_NAME(x) stats_query_info##x
I don't understand how I'm meant to use these APIs. A C++ unit test suite would help. How does the grouping work? Am I able to specify a grouping on a composite like: increment the counter for objects of this key type seen during this storage operation during this arcticdb operation?
Added the explanation to the description of this PR.
This must have a C++ unit test suite
stats = query_stats_tools_end - query_stats_tools_start
"""
Expected output; time values are not deterministic
arcticdb_call stage key_type storage_op parallelized count time_count_20 time_count_510
What does `time_count_{20,510}` mean?

Can be done later, but let's have human-readable key types in the output (like `TABLE_DATA`).
The titles will be more intuitive in the json format
"""
Expected output; time values are not deterministic
arcticdb_call stage key_type storage_op parallelized count time_count_20 time_count_510
0 list_streams None None None None None 0 1
`list_streams` isn't a Python API method so won't mean anything to the user. How has it ended up in this output?
It's manually named. Will be updated
query_stats_tools_end = StatsQueryTool()
stats = query_stats_tools_end - query_stats_tools_start
"""
Expected output; time values are not deterministic
I think we might need to rethink the idea of the stats output being a dataframe. It seems hard to answer the most important questions, like "how long did I spend running `list_symbols` in total" or "how much of that was compaction", with the dataframe API. How would you get that information from the proposed dataframe output? We could always have utilities to transform some strongly typed stats output to a dataframe for the subset of measurements where that makes sense (e.g. these breakdowns of storage operations).
Also, the dataframe API forces all the operations to share the same histogram buckets, which probably isn't suitable.
Agree. Will change to json
cpp/arcticdb/util/stats_query.hpp
Outdated
I can see why it has this, but this design has a kind of "no schema" approach to the stats: the schema is generated dynamically based on the macro invocations. I think it may be better to have a defined schema for the stats, just like Prometheus metric names get defined up front. I think the APIs to maintain the stats should be more like the Prometheus APIs to modify metrics.
I think your design is more similar to the APIs used by tracing libraries where you can add a hook anywhere you like, but this is quite different because we have to aggregate the metrics together.
This would add an extra chore when adding a new stat, but I think would make the whole thing clearer to people who don't use these APIs all the time (people may add new stats a couple of times a year so won't be familiar with this framework).
We spoke about this a lot, and a stricter schema has its own big downsides, so OK sticking with this. The strict schema forces context to be passed between layers of the stack, which is painful.

How are you calculating the histogram buckets?

They are hardcoded 10ms buckets.
"""
Sample output:
{
    "list_symbols": {
Still no way to see how long `list_symbols` took, or how many uncompacted keys it saw?
Oops, I missed it in the conversion from df to map. Let me add it back.
Also discussed including how many symbols are in the `list_symbols` result set.
cpp/arcticdb/util/query_stats.cpp
Outdated
}

void QueryStats::register_new_query_stat_tool() {
    auto new_stat_tool_count = query_stat_tool_count.fetch_add(1, std::memory_order_relaxed) + 1;
I don't really understand all these counters, what are these for? And the count isn't correct is it (adding one in this thread after the atomic fetch add)
> I don't really understand all these counters, what are these for?

The counter keeps track of how many `QueryStatsTool` Python objects are in use:
QueryStats.register_new_query_stat_tool()
If the counter > 0, query stats is ON.
If the counter reaches 0, query stats is OFF and stats are cleared.

> And the count isn't correct is it (adding one in this thread after the atomic fetch add)

It fetches the value before adding, so without the `+ 1` at the end it would be the old value.
Test:
def test_query_stats_tool_counter(s3_version_store_v1):
cpp/arcticdb/util/query_stats.hpp
Outdated
#include <fmt/format.h>

namespace arcticdb::util::query_stats {
using StatsGroups = std::vector<std::shared_ptr<std::pair<std::string, std::string>>>;
A type alias for the pair would help me read this (same anywhere you have complicated structures built out of `std` types like this, e.g. `StatsOutputFormat`).
cpp/arcticdb/util/query_stats.hpp
Outdated
std::atomic<int32_t> query_stat_tool_count = 0;
std::mutex stats_mutex_;
//TODO: Change to std::list<std::pair<StatsGroups, std::pair<std::string, std::variant<std::string, xxx>>>
std::list<std::pair<StatsGroups, std::pair<std::string, std::string>>> stats;
pull out structs and type alias for these 🙏
@@ -141,6 +141,90 @@ TEST(Async, CollectWithThrow) {
 ARCTICDB_DEBUG(log::version(), "Collect returned");
}

TEST(Async, StatsQueryDemo) {
Demo for how to add/use query stats
@@ -165,21 +169,53 @@ inline auto get_default_num_cpus([[maybe_unused]] const std::string& cgroup_fold
#endif
}

/*
This comment is still valid, why have you removed it?
That's a mistake
void add(folly::Func func,
         std::chrono::milliseconds expiration,
         folly::Func expireCallback) override {
    if (arcticdb::util::query_stats::QueryStats::instance().is_enabled_) {
Why not a function for whether it's enabled?
         std::chrono::milliseconds expiration,
         folly::Func expireCallback) override {
    if (arcticdb::util::query_stats::QueryStats::instance().is_enabled_) {
        auto func_with_stat_query_wrap = [layer = util::query_stats::QueryStats::instance().current_layer(), func = std::move(func)](auto&&... vars) mutable {
Just call this wrapped_func
std::lock_guard lock{cpu_mutex_};
return cpu_exec_.addFuture(std::move(task));
if (arcticdb::util::query_stats::QueryStats::instance().is_enabled_) {
    auto task_with_stat_query_instance = [&parent_thread_local_var = util::query_stats::QueryStats::instance().thread_local_var_, task = std::move(task)]() mutable {
Call this wrapped_task
using namespace arcticdb::util::query_stats;
auto query_stats_module = tools.def_submodule("QueryStats", "Stats query functionality");
"Query stats" not "stats query"
`QueryStats` isn't a conventional name for a Python module; I would have expected `query_stats`?
py::enum_<StatsGroupName>(query_stats_module, "StatsGroupName")
Some ideas about naming, given that these are all in a module called `QueryStats` anyway:
StatsGroupName -> GroupName
StatsName -> StatisticName
StatsGroupLayer -> GroupingLevel
current_layer -> current_level
root_layers -> root_levels (not sure why it isn't just root_level())
query_stats_module.def("current_layer", []() {
    return QueryStats::instance().current_layer();
});
query_stats_module.def("root_layers", []() {
I'm slightly surprised you're exposing these levels to the Python layer, but not a big deal
Yeah, I want to move the formatting to the Python level as much as I can.
def test_query_stats(s3_version_store_v1, clear_query_stats):
    s3_version_store_v1.write("a", 1)
    QueryStatsTool.enable()
Can the API just be free functions? I'm not sure why we need an object `QueryStatsTool`:
enable_query_stats()
disable_query_stats()
next_layer_map = next_layer_maps[group_idx]

# top level
I think you should have a check in this function that the `arcticdb_call` is indeed at the top level of the stats object we're processing.
Good point, but I'd rather add the checking at the C++ layer.
* so the stats logged in folly threads will be aggregated to the master map
* (Checking will be added after all log entries are added)
* 2. All folly tasks must be submitted through the TaskScheduler::submit_cpu_task/submit_io_task
* 3. All folly tasks must complete ("collected") before last StatsGroup object is destroyed in the call stack
What happens if a task fails? Should add testing in your C++ test for that
It will still work. All I need is for the task to complete, failed or not, to avoid a race condition. Will update the C++ test for that.
cpp/arcticdb/util/query_stats.hpp
Outdated
* When created, it adds a new layer and when destroyed, it restores the previous layer state
*
* Note:
* To make the query stats model works, there are two requirements:
"work", not "works". And "three": there are three requirements.
}

#define STATS_GROUP_VAR_NAME(x) query_stats_info##x
I still don't think there's any need for these to be macros rather than normal functions? I get that you'd have to hold the `StatsGroup` alive for the RAII to work, but that should be fine.
Yeah, I am referencing the existing implementation, e.g. `ARCTICDB_SAMPLE`.
Declaring a variable is a bit weird for logging, and folding the ON/OFF into the `StatsGroup` class's constructor/destructor is also weird IMO.
And I want to make adding groupable and non-groupable stats uniform as well, from the user's perspective.
Does `list_symbols` release the GIL? As soon as you instrument a function that does, you should add tests to check how this all behaves with Python multi-threading.
It doesn't release the GIL.
I simulate Python multi-threading in the C++ test Async.StatsQueryDemo.
* 1. All calls from python level must mark "QUERY_STATS_ADD_GROUP...." at least once in the call stack
* so the stats logged in folly threads will be aggregated to the master map
* (Checking will be added after all log entries are added)
* 2. All folly tasks must be submitted through the TaskScheduler::submit_cpu_task/submit_io_task
This could end up needing quite a lot of refactoring
};

struct GroupingLevel {
    std::array<int64_t, 3> stats_ = {0}; // sizeof(StatsName)
What is the advantage of this over having the following?
int64_t result_count;
int64_t total_time_ms;
int64_t count;
I think the current way is slightly neater for the conversion from the internal map to the Python JSON format, as it allows copying the values by iterating the array (no boilerplate code), and the enum that comes with the current design is handy for naming the nodes in the JSON.
struct GroupingLevel {
    std::array<int64_t, 3> stats_ = {0}; // sizeof(StatsName)
    std::array<std::map<std::string, std::shared_ptr<GroupingLevel>>, 3> next_level_maps_; // sizeof(GroupName)
Same question as above.
It also looks like the maps don't need to be ordered and are never removed from, only added to and cleared, in which case `ankerl::unordered_dense` will be more performant.
QueryStats& QueryStats::instance() {
    static QueryStats instance;
    return instance;
https://github.com/man-group/ArcticDB/blob/master/cpp/arcticdb/util/configs_map.cpp
Better pattern for singleton instantiation
}

void QueryStats::reset_stats() {
    check(!async::TaskScheduler::instance()->tasks_pending(), "Folly tasks are still running");
This will throw an exception if any tasks are pending? How do you know some other Python thread isn't running ArcticDB tasks when you call this?
Currently query stats is process-global, same as `root_levels_`. So it's just fundamentally wrong IMO if the stats are being cleared while some tasks are still running.
StatsGroup::StatsGroup(bool log_time, GroupName col, const std::string& value) :
    prev_level_(QueryStats::instance().current_level()),
    start_(std::chrono::high_resolution_clock::now()),
`high_resolution_clock` can be an alias for `std::chrono::system_clock`, which can go backwards in time due to daylight savings or NTP updates. Should use `steady_clock`.
// current_level_ != nullptr && root_level_ != nullptr -> stats has been setup; Nothing to do
// current_level_ == nullptr && root_level_ == nullptr -> clean slate; Need to setup
// current_level_ != nullptr && root_level_ == nullptr -> Something is off
// current_level_ == nullptr && root_level_ != nullptr -> Something is off
Use `util::check` for the "Something is off" cases.
It's checked below (line 22)
});
query_stats_module.def("is_enabled", []() {
    return QueryStats::instance().is_enabled();
});
Most of these methods don't seem like they are user facing? If they are just used for testing, then give them an underscore prefix
check(
    col_value != util::query_stats::GroupName::key_type || is_value_key_type,
    "key type query stats needs to have key_type value"
);
Can this be a static assertion?
Good reminder! I had thought about it but didn't implement it.
@@ -443,12 +444,13 @@ bool do_iterate_type_impl(

auto continuation_token = std::optional<std::string>();
do {
    QUERY_STATS_ADD_GROUP_WITH_TIME(storage_ops, "ListObjectsV2")
Should this be scoped just around the `s3_client.list_objects` call? Otherwise it will also be timing all the code down to line 470 as well.
std::lock_guard lock{cpu_mutex_};
return cpu_exec_.addFuture(std::move(task));
if (util::query_stats::QueryStats::instance().is_enabled()) {
This check should be done outside the mutex so we are not creating extra contention for people who are not using the query stats
std::lock_guard lock{io_mutex_};
return io_exec_.addFuture(std::move(task));
if (util::query_stats::QueryStats::instance().is_enabled()) {
As above
@@ -194,19 +238,53 @@ class TaskScheduler {
auto task = std::forward<decltype(t)>(t);
It would be a good idea to put a timings marker here, so that we can observe the differential timings in adding a task when using or not using query stats.
Thanks. The query stats ON performance will be addressed a bit later in a separate ticket
* 4/ Throttling: (similar to priority) how to absorb work spikes and apply memory backpressure
*/
template <typename T>
class ExecutorWithStatsInstance : public T {
I have mixed feelings about attaching the stats to the task scheduler. If you look at the existing timings framework, which allows a dump to stdout or a Remotery connection and covers many of the same areas as this one, it is timing logical stages in the execution of a query rather than the activities of the scheduler. That's because the way things are scheduled is an implementation detail that can and does change: some logical activities are composed of multiple scheduling steps, some only one, and some things that we might want to time aren't scheduled at all but are executed inline in the main thread. I would hate to see a day where we were asynchronously scheduling things that didn't benefit from it just because we wanted to time them.
The scheduler is also the place where it's possible to have the worst impact in terms of imposing cost on people who don't want to use the query stats. As long as the access to the global variable is moved outside the scheduling mutex, this doesn't look too bad, however we'll need to be sure - we can use the existing timings mechanism to see the overhead of the additional work.
In this case it doesn't matter particularly, since some additional cost is to be expected when query stats is enabled, but traversing a tree of shared_ptr is not an inherently fast design - I would not want to see it in a feature that was going to be enabled by default. I think it stems in part from the fact that the framework is very open-ended, perhaps more so than we need. Ultimately in a database there are a limited number of things that we want to time, so a set number of (perhaps) thread-cached counters that could be manually added to as required would probably suffice, and be more performant.
In terms of the thread-caching, the specific async framework we use will schedule work almost randomly (from the point of view of the stats framework) across different threads, as it maintains a thread-pool and dispatches tasks to an available or lightly-loaded thread - is this something that you've taken into account?
Reference Issues/PRs
https://man312219.monday.com/boards/7852509418/pulses/8297768017
What does this implement or fix?
Any other comments?
For the Python layer, the change is minimal, only serving the purpose of outputting something that can be verified in the test.
The missing functions will be handled in later PRs.