Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Discrete Control performance improvements based on AGV model #1529

Merged
merged 38 commits into from
Jun 12, 2024

Conversation

SouthEndMusic
Copy link
Collaborator

@SouthEndMusic SouthEndMusic commented Jun 4, 2024

Fixes #1512
Fixes #1539

@evetion
Copy link
Member

evetion commented Jun 4, 2024

Could you provide some timings/graphs on the performance of the model, how you found these bottlenecks and how this addresses them?

@SouthEndMusic SouthEndMusic marked this pull request as draft June 4, 2024 18:32
@SouthEndMusic
Copy link
Collaborator Author

SouthEndMusic commented Jun 4, 2024

@evetion this could be the topic of the next knowledge sharing, but here's a quick view of the profiles of the AGV model (4 simulated days).

On the current main:
profile_main

On this branch:
profile_now

@SouthEndMusic
Copy link
Collaborator Author

SouthEndMusic commented Jun 5, 2024

Here's a more elaborate breakdown of the changes to discrete control in this PR.

Avoiding allocation

This is a straightforward one, there was memory allocation each time a truth state (a string of 'T' and 'F' characters) was determined:

# Get the truth state for this discrete_control node
truth_values = cat(
[
[ifelse(b, "T", "F") for b in discrete_control.condition_value[i]] for
i in where_node_id
]...;
dims = 1,
)
truth_state = join(truth_values, "")

To get around this, I pre-allocate memory for the longest possible truth state and use that every time a truth state is determined:

Ribasim/core/src/read.jl

Lines 757 to 763 in 77a2f0d

max_truth_state_length = 0
for id in unique(node_id)
where_node_id = searchsorted(node_id, id)
truth_state_length = sum(length(condition_value[i]) for i in where_node_id)
max_truth_state_length = max(max_truth_state_length, truth_state_length)
end
truth_state = zeros(Bool, max_truth_state_length)

I also changed the truth states from strings to Boolean vectors, which is possible now that we don't have the 'U' and 'D' truth values anymore. The input and output are still written as strings of 'T' and 'F'.

Type stability

This is a more interesting one. This part of the code was very detrimental to performance:

function set_control_params!(p::Parameters, node_id::NodeID, control_state::String)
node = getfield(p, p.graph[node_id].type)
idx = searchsortedfirst(node.node_id, node_id)
new_state = node.control_mapping[(node_id, control_state)]
for (field, value) in zip(keys(new_state), new_state)
if !ismissing(value)
vec = get_tmp(getfield(node, field), 0)
vec[idx] = value
end
# Set new fractional flow fractions in allocation problem
if is_active(p.allocation) && node isa FractionalFlow && field == :fraction
set_fractional_flow_in_allocation!(p, node_id, value)
end
end
end

That is because this code is type unstable; a priori the type of node is unknown, new_state is a tuple of arbitrary contents, and so the type of vec is unknown too.

To get around this, I:

  1. Created a ParameterUpdate type, which contains the data for a control state:

"""
ParameterUpdate is an object for storing the data for the
parameter change associated with a specific control state
used by DiscreteControl.
"""
struct ParameterUpdate
node_type::NodeType.T
node_idx::Int
active::Bool
flow_rate_scalar::Float64
flow_rate_itp::ScalarInterpolation
min_crest_level::Float64
resistance::Float64
manning_n::Float64
fraction::Float64
table::ScalarInterpolation
target::ScalarInterpolation
proportional::ScalarInterpolation
integral::ScalarInterpolation
derivative::ScalarInterpolation
pid_params::VectorInterpolation
end

  1. Combined all control mappings from the different node types into one:

Ribasim/core/src/read.jl

Lines 1210 to 1215 in 77a2f0d

for node_type in propertynames(p)
node = getfield(p, node_type)
if hasproperty(node, :control_mapping)
merge!(discrete_control.control_mapping, node.control_mapping)
end
end

  1. Created parameter update functions for all node types that support being controlled, which all get called when a control state change occurs but only the one for the right node type does something:

function update_parameters!(pump::Pump, parameter_update::ParameterUpdate)::Nothing
if parameter_update.node_type != NodeType.Pump
return nothing
end
(; node_idx, active, flow_rate_scalar) = parameter_update
pump.active[node_idx] = active
if !isnan(flow_rate_scalar)
get_tmp(pump.flow_rate, 0)[node_idx] = clamp(
flow_rate_scalar,
pump.min_flow_rate[node_idx],
pump.max_flow_rate[node_idx],
)
end
return nothing
end
function update_parameters!(outlet::Outlet, parameter_update::ParameterUpdate)::Nothing
if parameter_update.node_type != NodeType.Outlet
return nothing
end
(; node_idx, active, flow_rate_scalar) = parameter_update
outlet.active[node_idx] = active
if !isnan(flow_rate_scalar)
get_tmp(outlet.flow_rate, 0)[node_idx] = clamp(
flow_rate_scalar,
outlet.min_flow_rate[node_idx],
outlet.max_flow_rate[node_idx],
)
end
return nothing
end
function update_parameters!(
tabulated_rating_curve::TabulatedRatingCurve,
parameter_update::ParameterUpdate,
)::Nothing
if parameter_update.node_type != NodeType.TabulatedRatingCurve
return nothing
end
(; node_idx, active, table) = parameter_update
tabulated_rating_curve.active[node_idx] = active
if !isempty(table.t)
tabulated_rating_curve.tables[node_idx] = table
end
return nothing
end
function update_parameters!(
fractional_flow::FractionalFlow,
parameter_update::ParameterUpdate,
p::Parameters,
)::Nothing
if parameter_update.node_type != NodeType.FractionalFlow
return nothing
end
(; node_idx, fraction) = parameter_update
fractional_flow.fraction[node_idx] = fraction
if is_active(p.allocation)
set_fractional_flow_in_allocation!(p, fractional_flow.node_id[node_idx], fraction)
end
return nothing
end
function update_parameters!(
pid_control::PidControl,
parameter_update::ParameterUpdate,
)::Nothing
if parameter_update.node_type != NodeType.PidControl
return nothing
end
(; node_idx, active, target, pid_params) = parameter_update
pid_control.active[node_idx] = active
if !isempty(target.t)
pid_control.target[node_idx] = target
end
if !isempty(pid_params.t)
pid_control.pid_params[node_idx] = pid_params
end
return nothing
end
function update_parameters!(
linear_resistance::LinearResistance,
parameter_update::ParameterUpdate,
)::Nothing
if parameter_update.node_type != NodeType.LinearResistance
return nothing
end
(; node_idx, active, resistance) = parameter_update
linear_resistance.active[node_idx] = active
if !isnan(resistance)
linear_resistance.resistance[node_idx] = resistance
end
return nothing
end
function update_parameters!(
manning_resistance::ManningResistance,
parameter_update::ParameterUpdate,
)::Nothing
if parameter_update.node_type != NodeType.ManningResistance
return nothing
end
(; node_idx, active, manning_n) = parameter_update
manning_resistance.active[node_idx] = active
if !isnan(manning_n)
manning_resistance.manning_n[node_idx] = manning_n
end
return nothing
end
function set_control_params!(p::Parameters, node_id::NodeID, control_state::String)::Nothing
parameter_update = p.discrete_control.control_mapping[(node_id, control_state)]
# Only the method corresponding to the type of the node_id actually updates parameters,
# but this way there is no runtime dispatch on node type
update_parameters!(p.pump, parameter_update)
update_parameters!(p.outlet, parameter_update)
update_parameters!(p.tabulated_rating_curve, parameter_update)
update_parameters!(p.fractional_flow, parameter_update, p)
update_parameters!(p.pid_control, parameter_update)
update_parameters!(p.linear_resistance, parameter_update)
update_parameters!(p.manning_resistance, parameter_update)
return nothing
end

Notes on complexity and maintainability

The code above makes the discrete control code more explicit, which I think makes it easier to understand but also increases maintenance load. For instance, if we want to support discrete control for a new variable or node type, this needs to be added to ParameterUpdate and the corresponding update_parameters! method. The parameter_update! methods do allow for node/variable specific functionality regarding updating variables.

Maybe the discrete control code should also get its own script (again).

@SouthEndMusic SouthEndMusic marked this pull request as ready for review June 5, 2024 10:28
@gijsber
Copy link
Contributor

gijsber commented Jun 5, 2024

nice explanation. thanks!

@evetion
Copy link
Member

evetion commented Jun 5, 2024

Funny thing to play with, slightly relevant:

julia> macro b_str(v); collect(v).=='T'; end
@b_str (macro with 1 method)

julia> b"TTF"
3-element BitVector:
 1
 1
 0

edit: Note that b_str already exists in Base, so would be good to choose another prefix:

help?> @b_str
  @b_str

  Create an immutable byte (UInt8) vector using string syntax.

@SouthEndMusic SouthEndMusic requested a review from evetion June 5, 2024 12:40
Copy link
Member

@evetion evetion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to see another performance increase! What are the absolute runtimes/relative part of control of/in the overall runtime?

Seems like some good changes, but I wonder what you think of your solution using the ParameterUpdate struct. I see two red flags in its design, that you have to check for nan, and the check for the correct type. Let me know if you want help on refactoring it further.

As an experiment, seeing that this is a documentation sprint as well, might I suggest to try to go overboard in adding inline comments/docstrings literally everywhere? Just to see whether that results in something we're happy with. I will take it on me to review and edit those comments myself later on.

core/src/util.jl Outdated Show resolved Hide resolved
core/src/util.jl Outdated Show resolved Hide resolved
core/src/callback.jl Outdated Show resolved Hide resolved
core/src/callback.jl Outdated Show resolved Hide resolved
Comment on lines 300 to 308
n_greater_than = length(truth_values_variable)
discrete_control.truth_state[truth_value_idx:(truth_value_idx + n_greater_than - 1)] .=
truth_values_variable
variable_idx += 1
truth_value_idx += n_greater_than
end

# The maximum truth state length is possibly shorter than this truth state
truth_state = view(discrete_control.truth_state, 1:(truth_value_idx - 1))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are certain assumptions here on the possible interplays between the different input tables, but I would be really helped to by some inline documentation of such tables.

Both the comment # Build up the truth state from the truth values per (compound) variable per greater than and # The maximum truth state length is possibly shorter than this truth state don't help me much here.

Random thought; we could split such functionality into a function and have a docstring including a doctest?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

doctest?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, providing an example (and testing it) as part of the docstring https://documenter.juliadocs.org/stable/man/doctests/.

core/src/callback.jl Outdated Show resolved Hide resolved
core/src/callback.jl Outdated Show resolved Hide resolved
core/src/read.jl Outdated Show resolved Hide resolved
core/test/control_test.jl Show resolved Hide resolved
core/src/callback.jl Outdated Show resolved Hide resolved
core/src/callback.jl Outdated Show resolved Hide resolved
@evetion
Copy link
Member

evetion commented Jun 7, 2024

I have refactored several smaller things several times, but have not found significant improvements, apart from the commit I did. Things that do not pan out:

  • Threading in our control loop(s)
  • NodeID to Int32 for the control state dictionary lookup
  • Symbol instead of String for variable for lookups and comparisons
  • Parametric struct with NamedTuple field. These caused some runtime dispatch, but were not really slower, but it still required most of the boilerplate in this PR. Not sure if the extra complexity of such a struct is worth it, so I kept it as is.

This PR is 2-3x faster than the current main, mostly because we went from 35M to 2M allocations. Latest flamegraph with two observations:
Screenshot 2024-06-07 at 08 42 47

Control takes roughly a third of the runtime (originally 70%). Of that time, a third is spent in purely looping over all control conditions (discrete_control_condition), of which only a small part is spent in looking up values(!). Might be improved by a different memory structure. The other highlight is the get function, spending 12% of the control time retrieving the new control state from the dict with a complex key.

Wrapping up, getting this faster requires a more significant refactor. Besides, I believe the current code needs a refactor, as it introduced a lot of boilerplate, with repetition and fields also existing elsewhere. This is discussed partially in #1539. Depending on the priority of getting the AGV model faster, we can either merge this, and promise to follow up soon, or keep this open until we refactor.

@evetion evetion requested a review from visr June 7, 2024 14:43
@evetion
Copy link
Member

evetion commented Jun 10, 2024

Based on the discussion of today, let's refactor while we can! We see several steps, each one slightly more complex than the last, would be good to implement them one by one, and see how performance/complexity behaves. It's fine to stop after step 1 or 2 if this takes too long for too little benefit.

  1. Refactor each Node to contain change control_mapping::Dict{Tuple{NodeID, String}, ParameterUpdate} to a control_mapping::Dict{Tuple{NodeID, String}, NamedTuple{specific to nodetype}. This will remove the ParameterUpdate struct, and would simplify several discrete_control_parameter_update to a single discrete_control_parameter_update(node::AbstractParameterNode, i) (and one specific to fractionalflow).
  2. In discrete_control_affect!, we can shortcut before the dict lookup by checking whether the truth_state is identical to the current truth state?
  3. Think of a way to reduce the ifelses of set_control_params!. Sorry, we didn't discuss this. I feel that each NodeID should have a direct access to its Node struct (@visr a structview?), instead of the weird lookups we do now.
  4. Remove the need for Dict(s) like logic_mapping and control_mapping. This requires refactoring in how we read the initial structure, but could be akin to how previous perfomance by caching. For example, the control state, now a string and part of the dict that gives you a state could be replaced by a vector of states, and storing the index in that vector instead of the string. The same thing could be done on a NodeID level, the control_mapping could be a Vector{Vector{ParameterUpdate}}, aligning with the NodeIDs in the node_id field. On read we could store the logical index to the NodeID instead of the NodeID itself.
  5. Test changing the memory layout of discrete_control_condition! from SoA to AoS. SoA should be faster in single variable operations, but since we access the whole struct all the time, might not be. This might be informative on the topic: https://viralinstruction.com/posts/hardware/.

@SouthEndMusic
Copy link
Collaborator Author

SouthEndMusic commented Jun 10, 2024

Status update

Current profile of running the AGV model:
another_profile

Zoomed in on the discrete control section:
another_profile_zoomed

So there is some runtime dispatch at where now the generic parameter update loop is:

function discrete_control_parameter_update!(
node::AbstractParameterNode,
node_id::NodeID,
control_state::String,
)::Nothing
new_state = node.control_mapping[(node_id, control_state)]
(; node_idx) = new_state
for (field, value) in zip(keys(new_state), new_state)
if field == :node_idx
continue
end
vec = get_tmp(getfield(node, field), 0)
vec[node_idx] = value
end
end

but it is not significant (note that we do not know how this behaves in general, the AGV only/mainly has control of pumps). Running the 4 day simulation now takes ~3.3 seconds.

The latest commits got rid of quite some complexity/nastiness:

  • The heap of parameters that was the ParameterUpdate struct
  • The discrete_control_parameter_update! method for each controllable node type type
  • The checking for NaN in the control state parameters; now every control state contains a value for every controllable parameter
  • There were also some changes to TabulatedRatingCurve and PidControl to make the column names of their respective static tables align better with the field names of their structs, which was needed to support the generic parameter update code. This also got rid of some complexity that combined the P,I,D coefficients unnecessarily into one object.

There are still some lookups/loops that can probably be sped up. Also it would be nice if this can be implemented in a more generic way without significant loss of performance:

function set_control_params!(p::Parameters, node_id::NodeID, control_state::String)::Nothing
# Check node type here to avoid runtime dispatch on the node type
if node_id.type == NodeType.Pump
discrete_control_parameter_update!(p.pump, node_id, control_state)
elseif node_id.type == NodeType.Outlet
discrete_control_parameter_update!(p.outlet, node_id, control_state)
elseif node_id.type == NodeType.TabulatedRatingCurve
discrete_control_parameter_update!(p.tabulated_rating_curve, node_id, control_state)
elseif node_id.type == NodeType.FractionalFlow
discrete_control_parameter_update!(p.fractional_flow, node_id, control_state, p)
elseif node_id.type == NodeType.PidControl
discrete_control_parameter_update!(p.pid_control, node_id, control_state)
elseif node_id.type == NodeType.LinearResistance
discrete_control_parameter_update!(p.linear_resistance, node_id, control_state)
elseif node_id.type == NodeType.ManningResistance
discrete_control_parameter_update!(p.manning_resistance, node_id, control_state)
end
return nothing
end

To me this looks like an interesting use case for a custom macro, but I haven't been able to figure that out.

@SouthEndMusic
Copy link
Collaborator Author

@evetion I also now implemented your point 2, now I am down to a runtime of ~2.7 s. I'm now also wondering whether I need this whole part at all:

# We need to build up the truth state for this DiscreteControl node by:
# - Finding all the (compound) variables this DiscreteControde node listens to
# - Concatenate in preallocated memory the truth values for all conditions per variable
#
# There can be multiple truth values per variable because there can be multiple 'greater_than's
# per variable which each define a condition. So the final control state is, if the discrete control node
# listens to variables with indices n, ..., N:
# [
# variable_n > greater_than_n_1, ..., variable_n > greater_than_n_a,
# variable_{n+1} > greater_than_{n+1}_1, ..., variable_{n+1} > greater_than_{n+1}_b,
# ...
# variable_N > greater_than_N_1, ..., variable_N > greater_than_N_c
# ]
found_change = false
truth_value_idx = 1
for (variable_idx, node_id) in enumerate(discrete_control.node_id)
if node_id < discrete_control_node_id
continue
elseif node_id == discrete_control_node_id
truth_values_variable = discrete_control.condition_value[variable_idx]
n_greater_than = length(truth_values_variable)
range_variable = truth_value_idx:(truth_value_idx + n_greater_than - 1)
truth_values_variable_old = view(truth_state, range_variable)
if !found_change
for i in eachindex(truth_values_variable)
if truth_values_variable_old[i] !== truth_values_variable[i]
found_change = true
break
end
end
end
truth_values_variable_old .= truth_values_variable
variable_idx += 1
truth_value_idx += n_greater_than
else
break
end
end

That is, why have separate truth values per variable and then copy them to the full truth state. I think this was relevant when we considered that variables could be listened to by multiple discrete control nodes, but now (compound) variables are discrete control node specific so we don't need that anymore. Probably the change in truth state can already be detected in discrete_control_condition!.

@SouthEndMusic
Copy link
Collaborator Author

Tests now probably fail because a control state change check should be done at the beginning of the simulation regardless of truth state change.

Some thoughts on refactoring the logic_mapping and control_mapping dicts: split them out by node ID so that this doesn't have to be part of the keys. For logic_mapping: the boolean vector input can be converted to an integer index via binary. This index would be for a sparse vector of strings (because not every possible truth state must have a control state). That's probably still slow, so the strings should be converted to an index too, which has to be coordinated properly between the logic mapping and control mappings. The string control state name must still be available for the output data.

@SouthEndMusic
Copy link
Collaborator Author

Another big step in the refactor: the functions discrete_control_affect! and discrete_control_condition! have been merged into one function and one loop, which also got rid of the need to copy groups of truth values. Data for a single compound variable has been clustered in the CompoundVariable struct, which makes clear which things belong together and I think makes the code a bit more readable. Current runtime of the AGV model is ~2.4s.

I propose to leave it here for now regarding this PR and the refactoring of discrete control. One thing left to do is a round of updating inline documentation.

@SouthEndMusic SouthEndMusic requested a review from evetion June 11, 2024 14:43
Copy link
Member

@evetion evetion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done, it's more readable now. Besides, you coded even more subtle performance improvements.

consisting of 'truth_values' for a DiscreteControl node. This truth_state then corresponds with a 'control_state',
and nodes controlled by a DiscreteControl node have parameter values associated with this control_state.
Apply the discrete control logic. There's somewhat of a complex structure:
- Each DiscreteControl node can have one ore multiple compound variables it listens to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
- Each DiscreteControl node can have one ore multiple compound variables it listens to
- Each DiscreteControl node can have one or multiple compound variables it listens to

@SouthEndMusic SouthEndMusic merged commit edd7f87 into main Jun 12, 2024
8 checks passed
@SouthEndMusic SouthEndMusic deleted the performance_AGV branch June 12, 2024 10:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Fully defined control states Performance assessment/improvement based on AGV model
3 participants