Add Tarjan SCC network cleaner algorithm #3650
Draft
Adds a new network cleaner based on Tarjan's strongly connected components algorithm to identify strongly connected clusters.
https://en.wikipedia.org/wiki/Tarjan%27s_strongly_connected_components_algorithm
The motivation is that it is considerably more performant than the default network cleaner. That makes it feasible to incorporate a simple mechanism for creating strongly connected clusters in the presence of the recently introduced multi-step and stackable turn restrictions (aka disallowed next link sequences, https://github.com/matsim-org/matsim-code-examples/wiki/turn-restrictions).
The idea is to make use of the turn restrictions context introduced in the speedy graph builder, which already expands the turn restrictions into colored subgraphs. For the adjacency list used by Tarjan's algorithm, I simply add the end nodes of colored subgraphs to the adjacency list of the start node of a sequence. Because adjacencies may be pruned in the course of a pass, the implementation re-iterates until two consecutive passes result in the same network size.
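For readers unfamiliar with the algorithm, here is a minimal sketch of recursive Tarjan SCC on a plain integer adjacency list, followed by picking the largest component. It is not the code in this PR: the class `TarjanSketch`, its methods, and the `int[][]` graph representation are hypothetical, and the turn restriction handling and the re-iteration over passes are left out.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch (not the PR's classes): recursive Tarjan SCC. */
final class TarjanSketch {
    private final int[][] adj;        // adj[v] = nodes reachable from v in one step
    private final int[] index;        // discovery order, -1 = not visited yet
    private final int[] low;          // lowest discovery index reachable from v
    private final boolean[] onStack;
    private final Deque<Integer> stack = new ArrayDeque<>();
    private final List<List<Integer>> components = new ArrayList<>();
    private int counter = 0;

    private TarjanSketch(int[][] adj) {
        this.adj = adj;
        this.index = new int[adj.length];
        this.low = new int[adj.length];
        this.onStack = new boolean[adj.length];
        Arrays.fill(index, -1);
    }

    static List<List<Integer>> stronglyConnectedComponents(int[][] adj) {
        TarjanSketch t = new TarjanSketch(adj);
        for (int v = 0; v < adj.length; v++) {
            if (t.index[v] == -1) {
                t.visit(v);
            }
        }
        return t.components;
    }

    // One recursive call per tree edge; this is where deep graphs exhaust the stack.
    private void visit(int v) {
        index[v] = low[v] = counter++;
        stack.push(v);
        onStack[v] = true;
        for (int w : adj[v]) {
            if (index[w] == -1) {
                visit(w);
                low[v] = Math.min(low[v], low[w]);
            } else if (onStack[w]) {
                low[v] = Math.min(low[v], index[w]);
            }
        }
        if (low[v] == index[v]) {     // v is the root of a component
            List<Integer> component = new ArrayList<>();
            int w;
            do {
                w = stack.pop();
                onStack[w] = false;
                component.add(w);
            } while (w != v);
            components.add(component);
        }
    }

    /** Returns the component with the most nodes (empty list if there are none). */
    static List<Integer> largest(List<List<Integer>> comps) {
        List<Integer> best = new ArrayList<>();
        for (List<Integer> c : comps) {
            if (c.size() > best.size()) {
                best = c;
            }
        }
        return best;
    }
}
```

A cleaner built on this would keep only the nodes and links of the largest component. On a graph without turn restrictions one pass is enough; the repeated passes described above are only needed because restricted sequences can invalidate adjacencies after pruning.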
For this, the TurnRestriction context had to be refactored into its own package (is that okay? @mrieser @marecabo )
Tested on multiple large networks, the largest having ~1.2 million links and 600k nodes with roughly 1 million turn restrictions (many of them forbidden U-turns). For this largest network, the algorithm needed 15 passes, which took about 30s in total.
The implementation is not multimodal so far. Because it currently relies on deep recursion, it might be necessary to increase the JVM thread stack size (e.g., -Xss100m), or the implementation needs to be adjusted slightly to an iterative approach.
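To make the iterative option concrete, here is a rough sketch of Tarjan's algorithm with an explicit call stack instead of recursion. It is an illustration under assumptions, not a proposal for the final code: `IterativeTarjanSketch` and the `int[][]` adjacency representation are hypothetical, and turn restrictions are again ignored.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

/** Hypothetical sketch (not the PR's classes): Tarjan SCC without recursion. */
final class IterativeTarjanSketch {

    static List<List<Integer>> stronglyConnectedComponents(int[][] adj) {
        int n = adj.length;
        int[] index = new int[n];        // discovery order, -1 = not visited yet
        int[] low = new int[n];          // lowest discovery index reachable from v
        int[] nextEdge = new int[n];     // next unexplored out-edge position per node
        boolean[] onStack = new boolean[n];
        Arrays.fill(index, -1);

        Deque<Integer> sccStack = new ArrayDeque<>();   // Tarjan's node stack
        Deque<Integer> callStack = new ArrayDeque<>();  // replaces the JVM call stack
        List<List<Integer>> components = new ArrayList<>();
        int counter = 0;

        for (int root = 0; root < n; root++) {
            if (index[root] != -1) {
                continue;
            }
            index[root] = low[root] = counter++;
            sccStack.push(root);
            onStack[root] = true;
            callStack.push(root);

            while (!callStack.isEmpty()) {
                int v = callStack.peek();
                if (nextEdge[v] < adj[v].length) {
                    int w = adj[v][nextEdge[v]++];
                    if (index[w] == -1) {
                        // "Recursive call": descend into w.
                        index[w] = low[w] = counter++;
                        sccStack.push(w);
                        onStack[w] = true;
                        callStack.push(w);
                    } else if (onStack[w]) {
                        low[v] = Math.min(low[v], index[w]);
                    }
                } else {
                    // "Return": all out-edges of v are explored.
                    callStack.pop();
                    if (low[v] == index[v]) {
                        List<Integer> component = new ArrayList<>();
                        int w;
                        do {
                            w = sccStack.pop();
                            onStack[w] = false;
                            component.add(w);
                        } while (w != v);
                        components.add(component);
                    }
                    if (!callStack.isEmpty()) {
                        // Propagate the low-link to the DFS parent.
                        int parent = callStack.peek();
                        low[parent] = Math.min(low[parent], low[v]);
                    }
                }
            }
        }
        return components;
    }
}
```

The trade-off is one extra `int` array (`nextEdge`) and a heap-allocated stack in exchange for not depending on `-Xss`; the traversal depth is then bounded by heap size rather than thread stack size.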
I haven't tested every "edge" (badumm tss) case, leaving this PR for discussion for now.