Net decomposition: tuning and polishing #2516

Merged Nov 26, 2024 (1 commit)

duck2
Contributor

@duck2 duck2 commented Mar 23, 2024

Paper version of the net decomposing router. Major fixes are:

  1. Don't count pruned sinks towards high fanout case's chan_nodes_added
  2. Add a minimum bin size for the HF case to prevent minuscule bins when flat routing is enabled
  3. Incrementally update the PartitionTree instead of rebuilding on every iteration
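Fix 2 can be pictured with a small sketch. The PR text does not say how bins are sized, so this assumes square bins derived from sink density inside the net's bounding box; `choose_bin_size` and all of its parameters are hypothetical names, not VPR's API. It only illustrates how a lower bound keeps the bin size from collapsing when a net has very many sinks in a small area.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper (not VPR's actual code): pick the side length of the
// square bins used to bucket a high-fanout net's sinks within its bounding box.
// Assumption: bins are sized so each holds roughly `sinks_per_bin` sinks.
int choose_bin_size(int bb_width, int bb_height, int num_sinks,
                    int sinks_per_bin, int min_bin_size) {
    assert(bb_width > 0 && bb_height > 0);
    assert(num_sinks > 0 && sinks_per_bin > 0 && min_bin_size > 0);

    // Number of bins needed at the target occupancy (at least one).
    double num_bins = std::max(1.0, static_cast<double>(num_sinks) / sinks_per_bin);

    // Side of a square bin such that num_bins bins cover the bounding box area.
    double side = std::sqrt(static_cast<double>(bb_width) * bb_height / num_bins);
    int bin_size = static_cast<int>(std::floor(side));

    // Enforce the minimum, but never exceed the bounding box itself.
    int max_side = std::max(bb_width, bb_height);
    int lower = std::min(min_bin_size, max_side);
    return std::min(std::max(bin_size, lower), max_side);
}
```

With a 10x10 bounding box, 10000 sinks, one sink per bin, and a minimum of 4, the unclamped side would round down to 0; the clamp returns 4 instead.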

@github-actions github-actions bot added the VPR (VPR FPGA Placement & Routing Tool) and lang-cpp (C/C++ code) labels Mar 23, 2024
@duck2 duck2 force-pushed the partition_subtree branch 2 times, most recently from 7fcea0f to 4e9b58f Compare October 25, 2024 20:24
@duck2 duck2 marked this pull request as ready for review October 25, 2024 20:24
@vaughnbetz
Contributor

vaughnbetz commented Oct 25, 2024

Please also add tests for the net decomposing router: one small circuit in strong, so it is checked all the time, plus a run on bigger designs in one of the nightly tests.

@vaughnbetz
Contributor

Should try to land this before the code format PR if at all possible. Fahri thinks this can be landed in 2 weeks (mid-November) but that it shouldn't block a code format PR.

@duck2 duck2 force-pushed the partition_subtree branch 2 times, most recently from 69304fe to f67a224 Compare November 14, 2024 17:41
@vaughnb-cerebras

@duck2 : can you add the QoR data as it becomes available? We'll need: net-decomposing router, serial router, master router anyway (your call if you also want to check the baseline router to be safe)

@duck2 duck2 force-pushed the partition_subtree branch from f67a224 to 40814f8 Compare November 21, 2024 23:29
@duck2
Contributor Author

duck2 commented Nov 25, 2024

QoR for two stage routing:

| Metric | titan_10_master_serial_j4.txt | titan_10_serial_j4.txt | titan_10_baseline_j4.txt | titan_10_decomp_j4.txt |
|---|---|---|---|---|
| vtr_flow_elapsed_time | 1 | 1.006176477 | 0.8244266967 | 0.7997656644 |
| num_LAB | 1 | 1 | 1 | 1 |
| num_DSP | 1 | 1 | 1 | 1 |
| num_M9K | 1 | 1 | 1 | 1 |
| num_M144K | 1 | 1 | 1 | 1 |
| max_vpr_mem | 1 | 1.000139035 | 1.005887993 | 1.007217148 |
| num_pre_packed_blocks | 1 | 1 | 1 | 1 |
| num_post_packed_blocks | 1 | 1 | 1 | 1 |
| device_grid_tiles | 1 | 1 | 1 | 1 |
| pack_time | | | | |
| placed_wirelength_est | | | | |
| place_time | 1 | 1.026093534 | 1.005708623 | 1.006580384 |
| placed_CPD_est | | | | |
| routed_wirelength | 1 | 0.9996292703 | 0.9994574449 | 1.00444525 |
| critical_path_delay | 1 | 1.006455254 | 1.004412981 | 1.024264708 |
| geomean_nonvirtual_intradomain_critical_path_delay | 1 | 0.9996147717 | 1.008555116 | 1.019565731 |
| crit_path_route_time | 1 | 1.00338813 | 0.6495137537 | 0.5987187888 |
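The first column is 1 in every row, so the values appear to be ratios relative to the master serial run. A minimal sketch of that kind of normalization, and of reading a runtime ratio as a speedup, is shown here; `normalize_to_first` and `speedup_from_ratio` are hypothetical helper names, not part of the VTR scripts.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: divide every value by the first (reference) entry,
// so the reference column becomes 1 and the others become ratios.
std::vector<double> normalize_to_first(const std::vector<double>& raw) {
    assert(!raw.empty() && raw.front() != 0.0);
    std::vector<double> ratios;
    ratios.reserve(raw.size());
    for (double value : raw) {
        ratios.push_back(value / raw.front());
    }
    return ratios;
}

// Hypothetical helper: a runtime ratio r (new / reference) below 1 means the
// new run is faster; the corresponding speedup factor is 1 / r.
double speedup_from_ratio(double ratio) {
    assert(ratio > 0.0);
    return 1.0 / ratio;
}
```

Read this way, the decomp column's crit_path_route_time ratio of about 0.60 corresponds to a routing speedup of roughly 1.67x over master serial.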

@duck2
Contributor Author

duck2 commented Nov 25, 2024

I checked for increased memory usage with the flat router. It doesn't seem to be there anymore. Running flat router experiments now.

Contributor

@vaughnbetz vaughnbetz left a comment

LGTM! Thanks for addressing the comments. 2-stage routing results look good. Just need the flat results and we can merge.

@vaughnbetz
Contributor

Adding @MohamedElgammal. @duck2 : please keep us posted on the flat routing results. If they look good, we can merge.

@duck2
Contributor Author

duck2 commented Nov 25, 2024

We may speculatively merge this and skip waiting for flat router results if you want. It seems to be taking a while.

@vaughnbetz
Contributor

@MohamedElgammal : let me know if this is holding you up and I can speculatively merge it. @duck2 : please post updates on how flat routing is looking as you have them.

@duck2
Contributor Author

duck2 commented Nov 26, 2024

Flat router results:

(I ran this on my own PC, so the runtime may be noisy)

| Metric | titan_10_master_serial_flat_j4.txt | titan_10_serial_flat_j4.txt |
|---|---|---|
| vtr_flow_elapsed_time | 1 | 0.9627723754 |
| num_LAB | 1 | 1 |
| num_DSP | 1 | 1 |
| num_M9K | 1 | 1 |
| num_M144K | 1 | 1 |
| max_vpr_mem | 1 | 0.9997809869 |
| num_pre_packed_blocks | 1 | 1 |
| num_post_packed_blocks | 1 | 1 |
| device_grid_tiles | 1 | 1 |
| pack_time | | |
| placed_wirelength_est | | |
| place_time | 1 | 0.9601079574 |
| placed_CPD_est | | |
| routed_wirelength | 1 | 0.9996585301 |
| critical_path_delay | 1 | 0.9835295967 |
| geomean_nonvirtual_intradomain_critical_path_delay | 1 | 0.9934141107 |
| crit_path_route_time | 1 | 0.9650722987 |

@vaughnbetz
Contributor

Flat routing (serial) has improved with @duck2's changes (yay!). Merging this, as Fahri has already separately verified that parallel flat routing works better with this change.
Fahri, as the flat routing results come in, please add them to this PR so we have a record of them.
@MohamedElgammal: this is merged now in case you want to start VTR 9 QoR runs.

@vaughnbetz vaughnbetz merged commit c246f54 into master Nov 26, 2024
37 checks passed
@vaughnbetz vaughnbetz deleted the partition_subtree branch November 26, 2024 16:19