Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RR_GRAPH] Fixed Bug in LLVM17 Build #2522

Merged

Conversation

AlexandreSinger
Copy link
Contributor

@AlexandreSinger AlexandreSinger commented Apr 1, 2024

The std::sort function requires operands to be "pointer-like", this requires that de-referencing a const pointer should be allowed.

This was not an issue in the past, but in LLVM17 this property is required. The edge_sort_iterator did not follow this property. This has been fixed.

Description

See issue #2517

The LLVM17 build failed due to a const edge_sort_iterator unable to be de-referenced. This PR fixes that issue.

The problem is that this required the swapper_ member to be a pointer to allow this to work. This means that the swapper_ object needs to be declared on the heap. This may have affects on the performance of this implementation. To prevent regression in performance, this needs to be profiled to ensure we do not slow down the current high-priority builds for future builds.

Related Issue

issue #2517

How Has This Been Tested?

Ran the quick and hard regression tests.

Still need to profile to ensure the performance does not regress.

@github-actions github-actions bot added the lang-cpp C/C++ code label Apr 1, 2024
@heshpdx
Copy link
Contributor

heshpdx commented Apr 1, 2024

Thanks for this! I have not yet checked if this patch fixes the LLVM-17 build issue, but I will send it over to the tester very soon. But it does build and run for gcc-12 and llvm-15, which is what I have on my system. So, I checked the performance with this patch on the cmdlines that @amin1377 and I came up with for SPEC CPUv8. Indeed, there is about a 10% perf loss across a variety of configs/builds.

Here is the data reported in seconds, on an aarch64 based Ampere Altra 3.0 Ghz system running Ubuntu 22.04.

O3:  gcc-12.3     -g -O3    -mcpu=native -fno-fast-math
Of:  gcc-12.3     -g -Ofast -mcpu=native -fno-fast-math -flto=32
L15: clang-15.0.7 -g -O3                 -fno-fast-math

baseline  O3    Of    L15
--------------------------
place1    63    59     65
route1    98    84    100
place2    79    71     79
route2    64    53     63

w/patch   O3    Of    L15
--------------------------
place1    76    63     75
route1   108    88    107
place2    87    75     86
route2    69    57     68

ratio     O3    Of    L15
--------------------------
place1   1.21  1.07  1.15
route1   1.10  1.05  1.07
place2   1.10  1.06  1.09
route2   1.08  1.08  1.08

@AlexandreSinger
Copy link
Contributor Author

I spent the last day running the large Titan benchmarks, and confirmed what @heshpdx has found:
image

An 11% increase in overall runtime, no change in QoR. Unless we can find a more efficient way to make the de-reference operator a const method in this class, I do not think this can be merged into VPR master.

We could IFDEF the old version with the new version based on the version of C++ being used to compile, as an idea.

@heshpdx
Copy link
Contributor

heshpdx commented Apr 2, 2024

We confirmed that the patch does allow compilation on llvm-17, and also confirmed a perf slowdown.

Can you try reverting the patch and then converting all of the instances of std::sort to std::stable_sort? Because that also allows build with llvm-17 and doesn't incur the performance slowdown. It is arguably better since it would provide same results between standard library implementations, which is a good thing.

The std::sort function requires operands to be "pointer-like", this
requires that de-referencing a const pointer should be allowed.

This was not an issue in the past, but in LLVM17 this property is
required. The edge_sort_iterator did not follow this property.

Fixing this by naively allowing this property led to a 10% performance
degredation due to requiring pointers to be allocated and deallocated.

To fix this for now, made the sort use std::stable_sort instead. The
performance degredation is quite small and may be worth it.
@AlexandreSinger AlexandreSinger force-pushed the feature-llvm-17-build branch from f4b6694 to 973acb7 Compare April 2, 2024 23:35
@AlexandreSinger
Copy link
Contributor Author

AlexandreSinger commented Apr 2, 2024

@heshpdx I have reverted the changes and converted the std::sorts to std::stable_sorts. I have also rerun the profiling of the large Titan benchmarks and have shown only a minor slowdown of around 1%.
image

I have also added a comment to the code to explain this issue to anyone who comes across this again in the future; since I do not think there is a guarantee that even stable sort would not make one of its references to its iterators to being const in the future.

@vaughnbetz What do you think about this change?

@vaughnbetz
Copy link
Contributor

Thanks. This change seems OK to me.

@vaughnbetz vaughnbetz merged commit 72af920 into verilog-to-routing:master Apr 3, 2024
52 checks passed
@heshpdx
Copy link
Contributor

heshpdx commented Apr 3, 2024

FYI, for SPEC CPUv8 we changed ALL of the sort's into stable_sort's, everywhere in the source. Maybe it was overkill, but it's a defensive move.

@vaughnbetz
Copy link
Contributor

@amin1377 : could you try running a cpu time experiment with the remaining sorts turned into stable sorts? If it doesn't hurt cpu time, we can convert them all. I'd like to keep the master in sync with SPEC CPUv8, with #ifdef SPEC or some such guarding any minor changes we can't turn on for the master (e.g. making sure all the base_costs are integral).

@amin1377
Copy link
Contributor

amin1377 commented Apr 3, 2024

I've already submitted a PR for this, and I'm currently waiting for the completion of the runs to gather the Quality of Results (QoR).

@AlexandreSinger AlexandreSinger deleted the feature-llvm-17-build branch May 28, 2024 18:45
@heshpdx
Copy link
Contributor

heshpdx commented Aug 27, 2024

I believe the patch below is the real fix. I can confirm that this builds for LLVM17 (and 18 and 19.1.0-rc3). BTW we have found which sorts need to be stable and which ones do not; when I get time I can share a patch that recovers performance.

--- a/src/vtr-vpr/libs/librrgraph/src/base/rr_graph_storage.cpp
+++ b/src/vtr-vpr/libs/librrgraph/src/base/rr_graph_storage.cpp
@@ -154,7 +154,7 @@ class edge_sort_iterator {
     using pointer = edge_swapper*;
     using difference_type = ssize_t;
 
-    edge_swapper& operator*() {
+    edge_swapper& operator*() const {
         return this->swapper_;
     }
 
@@ -229,7 +229,7 @@ class edge_sort_iterator {
     }
 
   private:
-    edge_swapper swapper_;
+    mutable edge_swapper swapper_;
 };
 
 class edge_compare_src_node_and_configurable_first {

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
lang-cpp C/C++ code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants