Round compressed grid locations #2509

soheilshahrouz · 2024-03-18T16:19:14Z

When translating a grid location to a compressed grid location, we call std::lower_bound() to find the corresponding compressed location. Assume that we want to translate the grid lcoation (66, 67) for a DSP block in a grid with DSP columns at x=50 and x=70. The current grid_loc_to_compressed_loc_approx() would select the column at x=50, while the one at x=70 was closer. This PR changes grid_loc_to_compressed_loc_approx() to choose the closest compressed location instead.

In calculate_centroid_loc(), we cast float variables to int, which is equivalent to taking their floor. This means that the centroid layer is always zero unless all connected pins are on layer 1.

How Has This Been Tested?

I ran experiments on Titan benchmarks. I'll post QoR summary in a comment in this PR. If the results are promising, we can experiment with other benchmarks and architectures.

soheilshahrouz · 2024-03-18T17:08:44Z

Titan benchmarks

	vtr_flow_elapsed_time	pack_time	placed_wl	total_swap	place_time	placed_cpd	routed_wl	heap_pop	route_cpd
master	2858.225484	398.634554	2572680.235	18027807.27	1153.033348	16.20830923	3213503.781	153895598.5	17.3042333
PR	2757.462696	393.088044	2528556.546	18024886.98	1122.98716	16.28941875	3161054.637	139388529.9	17.39327478
Ratio	0.9647463824	0.9860862291	0.9828491361	0.9998380121	0.973941614	1.005004194	0.983678518	0.9057343496	1.005145647

Link to QoR spreadsheet

vaughnbetz · 2024-03-18T21:35:36Z

QoR looks promising ... will need to go through and check the various test failures to make sure they are small circuit QoR changes that are neutral or irrelevant.
I suggest also running Koios for QoR.

vaughnbetz · 2024-03-18T21:36:04Z

(Your idea): could also run a 3D arch regtest (your call).

soheilshahrouz · 2024-05-08T18:55:03Z

vtr_reg_nightly_test1 and vtr_reg_nightly_test1_odin failures are fixed.

vtr_reg_nightly_test3_odin hits the 7 hours executin time limit, which is a little strange because vtr_reg_nightly_test3 finishes in ~1 hour.

vaughnbetz · 2024-05-13T22:01:39Z

If this looks like a random failure we could move some test out of vtr_reg_nightly_test3_odin (and put it in another suite, or tweak the designs used).

soheilshahrouz · 2024-05-31T18:36:30Z

I ran nightly3_odin tasks on wintermute. There is no significant execution time difference between this branch and the master branch on wintermute.

vaughnbetz · 2024-05-31T19:06:31Z

OK, if you resolve the conflicts we can merge. They look like a bunch of golden result conflicts, so hopefully not too hard to resolve.

Maybe the odin II run is at the edge of 6 hours and we need to split it?

soheilshahrouz · 2024-05-31T19:51:41Z

Maybe the odin II run is at the edge of 6 hours and we need to split it?

I am running tasks in nightly3_odin as different CI tests to see how long each of them takes. On master, the entire test takes ~3.5 hours. If we are at the edge of the 7 hours time limit, it means that the execution time has doubled.

soheilshahrouz · 2024-05-31T20:19:10Z

The mcml circuit in vtr_reg_qor_chain_predictor_off has an execution time about 5 times longer than the master branch. Finding the minimum channel width takes 38 times longer than on the master branch.

I missed this unusual increase in runtime because I was using the geomean to compare the execution time of two branches. Since only a single circuit's runtime increased significantly, the geomean value didn't change much.

soheilshahrouz · 2024-05-31T21:02:15Z

vpr.txt

It seems that the router tries to route the design at a channel width much lower than the minimum channel width. Since routing_failure_predictor is off, it does not terminate the router even when each iteration takes ~500 seconds. Is this behaviour expected?

vaughnbetz · 2024-06-01T01:21:44Z

Yes, that can happen. With the routing predictor off, a minimum channel width search could take a long time. We could just take mcml out of the test -- probably doing a minimum channel width search with larger circuits and the predictor off is not a great idea.

…est3_odin With routing failure predictor turned off, finding the minimum channel width takes for this circuit takes so long that the CI test failed.

vaughnbetz · 2024-06-03T21:38:27Z

Update: @soheilshahrouz has removed mcml from the problematic (slow) test. Need to remove it from the golden results:

2024-06-03T19:56:18.0542628Z �[32;1m19:56:18�[0m | regression_tests/vtr_reg_nightly_test3_odin/vtr_reg_qor_chain_depop...[Pass]
2024-06-03T19:56:18.0545553Z �[32;1m19:56:18�[0m | Error: Required case k6_frac_N10_frac_chain_mem32K_40nm.xml/mcml.v missing from task results: /root/vtr-verilog-to-routing/vtr-verilog-to-routing/vtr_flow/tasks/regression_tests/vtr_reg_nightly_test3_odin/vtr_reg_qor_chain_predictor_off/run001/parse_results.txt
2024-06-03T19:56:18.0547610Z �[32;1m19:56:18�[0m |
2024-06-03T19:56:18.0548391Z �[32;1m19:56:18�[0m | Test 'vtr_reg_nightly_test3_odin' had -1 qor test failures
2024-06-03T19:56:18.0549325Z �[32;1m19:56:18�[0m |
2024-06-03T19:56:18.0550053Z �[32;1m19:56:18�[0m | Test 'vtr_reg_nightly_test3_odin' had 0 run failures

Also some minimum channel width increases (out of the allowed bounds) on a few small circuits, which is probably due to overwriting new golden results with the results from this PR. Checking that all those failures are in that class and updating golden should fix that.

vaughnbetz · 2024-06-04T01:29:37Z

Whoops, it looks like another merge I did nuked this. The error is that an enum was changed to an enum class.

/root/vtr-verilog-to-routing/vtr-verilog-to-routing/vpr/src/pack/cluster_util.cpp:1248:20: error: 'BLK_FAILED_FEASIBLE' was not declared in this scope; did you mean 'e_block_pack_status::BLK_FAILED_FEASIBLE

If you change that line to return e_block_pack_status::BLK_FAILED_FEASIBLE this PR should work again Kimia.

vaughnbetz · 2024-06-06T17:50:09Z

Looks like we're getting disk quota issues in CI. @AlexandreSinger : I think you were about to change the artifact collection settings? That might be getting time critical.

2024-06-06T17:14:03.2100495Z Requested labels: self-hosted, Linux, X64
2024-06-06T17:14:03.2100777Z Job defined at: verilog-to-routing/vtr-verilog-to-routing/.github/workflows/test.yml@refs/heads/round_comp_loc
2024-06-06T17:14:03.2100881Z Waiting for a runner to pick up this job...
2024-06-06T17:14:04.1713331Z Job is about to start running on the runner: gh-actions-runner-vtr_199 (repository)
2024-06-06T17:14:08.1524125Z �[32;1m17:14:08�[0m | Attempting to spawn a machine... (PID: 11569)
2024-06-06T17:14:08.1534275Z �[32;1m17:14:08�[0m | Instance name: gh-actions-runner-vtr-auto-spawned199
2024-06-06T17:14:08.1537539Z �[32;1m17:14:08�[0m | Instance type: n2-highmem-16
2024-06-06T17:14:08.1539870Z �[32;1m17:14:08�[0m | Disk type: pd-ssd
2024-06-06T17:14:08.1543555Z �[32;1m17:14:08�[0m | LABELS: [{'github_job_full': 'run_tests_vtr_reg_nightly_test3_16_', 'github_sha': 'efe606d49f7b7bbb45b42f6c829fdca63370eafb', 'github_run_id': '9405078419'}]
2024-06-06T17:14:08.1548843Z �[32;1m17:14:08�[0m | Preemptible: False
2024-06-06T17:14:10.9444418Z �[32;1m17:14:10�[0m | Attempting to spawn a machine in us-central1-a
2024-06-06T17:14:30.9285699Z �[32;1m17:14:30�[0m | Error occured while spawning the instance: {'code': 'QUOTA_EXCEEDED', 'message': "Quota 'DISKS_TOTAL_GB' exceeded. Limit: 102400.0 in region us-central1.", 'errorDetails': [{'quotaInfo': {'metricName': 'compute.googleapis.com/disks_total_storage', 'limitName': 'DISKS-TOTAL-GB-per-project-region', 'dimensions': {'region': 'us-central1'}, 'limit': 102400}}]}
2024-06-06T17:14:30.9609730Z �[32;1m17:14:30�[0m |
2024-06-06T17:14:32.0701444Z �[32;1m17:14:32�[0m |
2024-06-06T17:14:33.5684742Z �[32;1m17:14:33�[0m | Attempting to spawn a machine... (PID: 12463)
2024-06-06T17:14:33.5709456Z �[32;1m17:14:33�[0m | Instance name: gh-actions-runner-vtr-auto-spawned199
2024-06-06T17:14:33.5711162Z �[32;1m17:14:33�[0m | Instance type: n2-highmem-16
2024-06-06T17:14:33.5712307Z �[32;1m17:14:33�[0m | Disk type: pd-ssd
2024-06-06T17:14:33.5714142Z �[32;1m17:14:33�[0m | LABELS: [{'github_job_full': 'run_tests_vtr_reg_nightly_test3_16_', 'github_sha': 'efe606d49f7b7bbb45b42f6c829fdca63370eafb', 'github_run_id': '9405078419'}]
2024-06-06T17:14:33.5716472Z �[32;1m17:14:33�[0m | Preemptible: False
2024-06-06T17:14:36.1309915Z �[32;1m17:14:36�[0m | Attempting to spawn a machine in us-central1-a
2024-06-06T17:15:00.0817421Z �[32;1m17:15:00�[0m | Error occured while spawning the instance: {'code': 'QUOTA_EXCEEDED', 'message': "Quota 'DISKS_TOTAL_GB' exceeded. Limit: 102400.0 in region us-central1.", 'errorDetails': [{'quotaInfo': {'metricName': 'compute.googleapis.com/disks_total_storage', 'limitName': 'DISKS-TOTAL-GB-per-project-region', 'dimensions': {'region': 'us-central1'}, 'limit': 102400}}]}
2024-06-06T17:15:00.1133073Z �[32;1m17:15:00�[0m |
2024-06-06T17:15:01.3026225Z �[32;1m17:15:01�[0m |
2024-06-06T17:15:02.7894178Z �[32;1m17:15:02�[0m | Attempting to spawn a machine... (PID: 13558)
2024-06-06T17:15:02.7922919Z �[32;1m17:15:02�[0m | Instance name: gh-actions-runner-vtr-auto-spawned199
2024-06-06T17:15:02.7924375Z �[32;1m17:15:02�[0m | Instance type: n2-highmem-16
2024-06-06T17:15:02.7925335Z �[32;1m17:15:02�[0m | Disk type: pd-ssd
2024-06-06T17:15:02.7926921Z �[32;1m17:15:02�[0m | LABELS: [{'github_job_full': 'run_tests_vtr_reg_nightly_test3_16_', 'github_sha': 'efe606d49f7b7bbb45b42f6c829fdca63370eafb', 'github_run_id': '9405078419'}]
2024-06-06T17:15:02.7929127Z �[32;1m17:15:02�[0m | Preemptible: False
2024-06-06T17:15:05.2320704Z �[32;1m17:15:05�[0m | Attempting to spawn a machine in us-central1-a
2024-06-06T17:15:23.9091269Z �[32;1m17:15:23�[0m | Error occured while spawning the instance: {'code': 'QUOTA_EXCEEDED', 'message': "Quota 'DISKS_TOTAL_GB' exceeded. Limit: 102400.0 in region us-central1.", 'errorDetails': [{'quotaInfo': {'metricName': 'compute.googleapis.com/disks_total_storage', 'limitName': 'DISKS-TOTAL-GB-per-project-region', 'dimensions': {'region': 'us-central1'}, 'limit': 102400}}]}
2024-06-06T17:15:23.9413133Z �[32;1m17:15:23�[0m |
2024-06-06T17:15:24.9601691Z �[32;1m17:15:24�[0m |
2024-06-06T17:15:26.2297039Z �[32;1m17:15:26�[0m | Attempting to spawn a machine... (PID: 14339)
2024-06-06T17:15:26.2320467Z �[32;1m17:15:26�[0m | Instance name: gh-actions-runner-vtr-auto-spawned199
2024-06-06T17:15:26.2321782Z �[32;1m17:15:26�[0m | Instance type: n2-highmem-16
2024-06-06T17:15:26.2322904Z �[32;1m17:15:26�[0m | Disk type: pd-ssd
2024-06-06T17:15:26.2324603Z �[32;1m17:15:26�[0m | LABELS: [{'github_job_full': 'run_tests_vtr_reg_nightly_test3_16_', 'github_sha': 'efe606d49f7b7bbb45b42f6c829fdca63370eafb', 'github_run_id': '9405078419'}]
2024-06-06T17:15:26.2328944Z �[32;1m17:15:26�[0m | Preemptible: False
2024-06-06T17:15:29.1885332Z �[32;1m17:15:29�[0m | Attempting to spawn a machine in us-central1-a
2024-06-06T17:15:49.8489609Z �[32;1m17:15:49�[0m | Error occured while spawning the instance: {'code': 'QUOTA_EXCEEDED', 'message': "Quota 'DISKS_TOTAL_GB' exceeded. Limit: 102400.0 in region us-central1.", 'errorDetails': [{'quotaInfo': {'metricName': 'compute.googleapis.com/disks_total_storage', 'limitName': 'DISKS-TOTAL-GB-per-project-region', 'dimensions': {'region': 'us-central1'}, 'limit': 102400}}]}
2024-06-06T17:15:49.8850793Z �[32;1m17:15:49�[0m |
2024-06-06T17:15:51.0529951Z �[32;1m17:15:51�[0m |
2024-06-06T17:15:52.3631606Z �[32;1m17:15:52�[0m | Attempting to spawn a machine... (PID: 16229)
2024-06-06T17:15:52.3658954Z �[32;1m17:15:52�[0m | Instance name: gh-actions-runner-vtr-auto-spawned199
2024-06-06T17:15:52.3660375Z �[32;1m17:15:52�[0m | Instance type: n2-highmem-16
2024-06-06T17:15:52.3661452Z �[32;1m17:15:52�[0m | Disk type: pd-ssd
2024-06-06T17:15:52.3663496Z �[32;1m17:15:52�[0m | LABELS: [{'github_job_full': 'run_tests_vtr_reg_nightly_test3_16_', 'github_sha': 'efe606d49f7b7bbb45b42f6c829fdca63370eafb', 'github_run_id': '9405078419'}]
2024-06-06T17:15:52.3665604Z �[32;1m17:15:52�[0m | Preemptible: False
2024-06-06T17:15:54.6993528Z �[32;1m17:15:54�[0m | Attempting to spawn a machine in us-central1-a
2024-06-06T17:16:14.8833117Z �[32;1m17:16:14�[0m | Error occured while spawning the instance: {'code': 'QUOTA_EXCEEDED', 'message': "Quota 'DISKS_TOTAL_GB' exceeded. Limit: 102400.0 in region us-central1.", 'errorDetails': [{'quotaInfo': {'metricName': 'compute.googleapis.com/disks_total_storage', 'limitName': 'DISKS-TOTAL-GB-per-project-region', 'dimensions': {'region': 'us-central1'}, 'limit': 102400}}]}
2024-06-06T17:16:14.9192247Z �[32;1m17:16:14�[0m |
2024-06-06T17:16:16.0565190Z �[32;1m17:16:16�[0m |
2024-06-06T17:16:17.0661708Z ##[error]VM starter exited with non-zero exit code: 1

AlexandreSinger · 2024-06-06T17:56:20Z

@vaughnbetz Perhaps. However, I think the artifacts are stored on the GitHub servers not the self-hosted runners.

My theory is that we have too many CI runs in flight right now. We currently have 12 CI runs in flight, each one running 17 independent tests (running on different self-hosted runners), and we have 100 self-hosted runners. I do not think its a coincidence that we are starting to hit issues when the number of runners required goes above 100.

I think once we change the CI triggers it should help reduce the load.

…rea_per_tile for vtr_reg_nightly_test1_odin/arithmetic_tasks/multless_consts/fixed_k6_frac_2ripple_N8_22nm.xml/mult_007.v

soheilshahrouz · 2024-06-11T21:35:05Z

@vaughnbetz
I updated the golden results for CI tests that were failing for out of bound QoR values. Now, all CI tests pass.

echo_compressed_grids() only printed the first element in each compressed grid axis 3D support added a new dimension to each compressed grid axis. This new dimension was not indexed properly in echo_compressed_grids()

soheilshahrouz added 6 commits February 28, 2024 14:21

do not consider ignored nets in centroid

a01ad1c

round centroid point

b2bc727

round grid_loc_to_compressed_loc_approx()

eb99157

fix overshadowed declaration

106a52a

update golden restuls for failing strong tests

dea9915

Merge branch 'master' into round_comp_loc

d3daa98

github-actions bot added VPR VPR FPGA Placement & Routing Tool lang-cpp C/C++ code labels Mar 18, 2024

vaughnbetz and others added 5 commits April 4, 2024 12:41

Merge branch 'master' into round_comp_loc

2dc3760

updated golden

c1118ca

update golden

d3d8c32

try to fix vtr_reg_nightly_test1 and vtr_reg_nightly_test1_odin failures

677c4eb

Update golden_results in strong_target_pin_util

c958361

soheilshahrouz added 2 commits June 3, 2024 11:34

remove mmcl from vtr_reg_qor_chain_predictor_off in vtr_reg_nightly_t…

ba66b12

…est3_odin With routing failure predictor turned off, finding the minimum channel width takes for this circuit takes so long that the CI test failed.

Merge branch 'master' into round_comp_loc

898fdd5

remove mcml from the golden results

aa131f9

Merge branch 'master' into round_comp_loc

c3cf18c

Merge branch 'master' into round_comp_loc

efe606d

soheilshahrouz added 3 commits June 7, 2024 17:52

update golden results

89b6bf9

update route_chan_width in vtr_timing_update_diff

d6b7dff

update min_chan_width_routing_area_total and min_chan_width_routing_a…

55e2328

…rea_per_tile for vtr_reg_nightly_test1_odin/arithmetic_tasks/multless_consts/fixed_k6_frac_2ripple_N8_22nm.xml/mult_007.v

soheilshahrouz added 3 commits June 13, 2024 13:46

create a grid in test_compressed_grid

88b3970

fix echo_compressed_grids

cc2e0c7

echo_compressed_grids() only printed the first element in each compressed grid axis 3D support added a new dimension to each compressed grid axis. This new dimension was not indexed properly in echo_compressed_grids()

unit test for grid_loc_to_compressed_loc_approx()

c9e423f

github-actions bot added the libarchfpga Library for handling FPGA Architecture descriptions label Jun 14, 2024

vaughnbetz merged commit 871b28a into master Jun 18, 2024
102 checks passed

vaughnbetz deleted the round_comp_loc branch June 18, 2024 19:59

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Round compressed grid locations #2509

Round compressed grid locations #2509

soheilshahrouz commented Mar 18, 2024

soheilshahrouz commented Mar 18, 2024

vaughnbetz commented Mar 18, 2024

vaughnbetz commented Mar 18, 2024

soheilshahrouz commented May 8, 2024

vaughnbetz commented May 13, 2024

soheilshahrouz commented May 31, 2024

vaughnbetz commented May 31, 2024

soheilshahrouz commented May 31, 2024

soheilshahrouz commented May 31, 2024

soheilshahrouz commented May 31, 2024 •

edited

Loading

vaughnbetz commented Jun 1, 2024

vaughnbetz commented Jun 3, 2024

vaughnbetz commented Jun 4, 2024

vaughnbetz commented Jun 6, 2024

AlexandreSinger commented Jun 6, 2024 •

edited

Loading

soheilshahrouz commented Jun 11, 2024

Round compressed grid locations #2509

Round compressed grid locations #2509

Conversation

soheilshahrouz commented Mar 18, 2024

How Has This Been Tested?

soheilshahrouz commented Mar 18, 2024

vaughnbetz commented Mar 18, 2024

vaughnbetz commented Mar 18, 2024

soheilshahrouz commented May 8, 2024

vaughnbetz commented May 13, 2024

soheilshahrouz commented May 31, 2024

vaughnbetz commented May 31, 2024

soheilshahrouz commented May 31, 2024

soheilshahrouz commented May 31, 2024

soheilshahrouz commented May 31, 2024 • edited Loading

vaughnbetz commented Jun 1, 2024

vaughnbetz commented Jun 3, 2024

vaughnbetz commented Jun 4, 2024

vaughnbetz commented Jun 6, 2024

AlexandreSinger commented Jun 6, 2024 • edited Loading

soheilshahrouz commented Jun 11, 2024

soheilshahrouz commented May 31, 2024 •

edited

Loading

AlexandreSinger commented Jun 6, 2024 •

edited

Loading