Added nordshift attribute #3877

daglem · 2023-08-05T20:26:57Z

The nordshift attribute on a packed array prohibits the generation of shift type circuits for reads from that array, akin to nowrshmsk for lvalue indexing.

To facilitate this, the AST transformations for rvalue indexing are moved from genrtlil.cc to simplify.cc, bringing them in line with transformations for lvalue indexing.
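As a rough illustration of the two read strategies the description contrasts, here is a minimal software model, not Yosys code. The function names are hypothetical, and `std::nullopt` is an assumed stand-in for the `x` result of an out-of-range read; only a single-bit read of an 8-bit vector is modelled.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Shift-style read: shift the wanted bit down to position 0, then mask it.
// std::nullopt stands in for the 'x' value of an out-of-range read.
std::optional<bool> read_bit_shift(uint8_t data, unsigned index)
{
	if (index >= 8)
		return std::nullopt;
	return ((data >> index) & 1u) != 0;
}

// Mux-style read: one explicit case per legal index value, the kind of
// structure a case-based lowering would hand to later passes.
std::optional<bool> read_bit_mux(uint8_t data, unsigned index)
{
	switch (index) {
	case 0: return (data & 0x01) != 0;
	case 1: return (data & 0x02) != 0;
	case 2: return (data & 0x04) != 0;
	case 3: return (data & 0x08) != 0;
	case 4: return (data & 0x10) != 0;
	case 5: return (data & 0x20) != 0;
	case 6: return (data & 0x40) != 0;
	case 7: return (data & 0x80) != 0;
	default: return std::nullopt;
	}
}

// Check that both strategies agree for every data value and index 0..15.
bool strategies_agree()
{
	for (unsigned data = 0; data < 256; data++)
		for (unsigned index = 0; index < 16; index++)
			if (read_bit_shift((uint8_t)data, index) != read_bit_mux((uint8_t)data, index))
				return false;
	return true;
}
```

Both functions return the same value for every input; the difference the attribute targets is only in which circuit structure is generated.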

tests/techmap/shiftx2mux.ys

daglem · 2023-08-12T09:40:29Z

I did a rebase / squash to simplify any later comparison with genrtlil.cc

daglem · 2023-11-13T13:12:08Z

FIXME: tests/simple/partsel.v doesn't pass.

It is currently not possible to use AST_SHIFTX to generate the exact RTLIL $shiftx previously generated in genrtlil.cc. genrtlil.cc uses an intermediate "fake_ast" to trick binop2rtlil into generating RTLIL with different width for the result (Y) and sign for the shift value (B).

daglem · 2023-11-27T14:49:50Z

All tests now pass both with and without a forced use_case_method = true, except for, as expected for (* nordshift *), a few tests which count cells:

tests/arch/*/mux.ys
tests/arch/xilinx/xilinx_srl.ys
tests/svtypes/struct_dynamic_range.ys
tests/various/peepopt.ys

Also, for tests/techmap/shiftx2mux.ys to pass with (* nordshift *), it would have to run proc in order to handle the introduced CASE process.

daglem · 2023-11-27T20:39:57Z

Uncommenting reg_demux_noshift in the Makefile in shift_issue3.zip in #3875 and running the tests via make now results in:

   Chip area for module '\reg_demux_noshift': 18689.907600
   Chip area for module '\reg_demux_nowrshmsk_mdim_arr': 19213.891200
   Chip area for module '\reg_demux_nowrshmsk_typedef': 19115.271000
   Chip area for module '\reg_demux_pow2': 5513.961600
   Chip area for module '\reg_demux': 21857.661000

reg_demux_noshift yields further savings, and the other results are exactly the same as in #3875, i.e. no regressions.

frontends/ast/simplify.cc

povik · 2024-02-22T16:03:58Z

@daglem I had a hard time reviewing the code for the default case when the attribute is not in use, given that this PR touched on that too. I decided to rewrite that part into something that should be more obvious. Please see if it looks good to you.

povik · 2024-02-22T16:13:40Z

Not sure what the CI failure is about. The runners almost seem to be building from some other revision of the source, since if you take e.g.

../frontends/ast/simplify.cc:2320:49: error: too many arguments to function call, expected 0, have 1
                AST::AstNode *member_node = get_struct_member(this);
                                            ~~~~~~~~~~~~~~~~~ ^~~~

that's not what you find on line 2320 in simplify.cc.

povik · 2024-02-23T14:00:37Z

Pushed a rebase to see if that helps with CI...

povik · 2024-02-23T14:23:10Z

It turns out the CI runs on the state of the tree after the PR is merged in, not on the head commit of the PR itself. That's of course desirable but caught me by surprise still.

daglem · 2024-02-26T21:20:47Z

> @daglem I had a hard time reviewing the code for the default case when the attribute is not in use, given that this PR touched on that too. I decided to rewrite that part into something that should be more obvious. Please see if it looks good to you.

Either way the branch doesn't play well with tests/arrays03.sv. The RTLIL for that test now doesn't seem to make sense - I'll try to investigate what's going wrong.

daglem · 2024-02-27T15:36:32Z

frontends/ast/simplify.cc

+			// Decode the index based on wire dimensions
+			int idx_signed_nbits = shift_expr_width_hint + !shift_expr_sign_hint;
+			if (!id2ast->range_swapped) {
+				int raw_idx_nbits = 1 + std::max(idx_signed_nbits, ceil_log2(std::abs(wire_offset) + 1) + 1);


Is the + 1 in ceil_log2(std::abs(wire_offset) + 1) really needed here? It's not included in similar code below.
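For reference, a small sketch of the arithmetic in question. `ceil_log2_sketch` is a local stand-in written for this example (it is not the Yosys helper), assumed to return the smallest y with 2^y >= x; the two wrapper names are hypothetical as well.

```cpp
#include <cassert>
#include <cstdlib>

// Local stand-in for a ceil(log2(x)) helper: smallest y with 2^y >= x.
// Returns 0 for x <= 1. Written for this sketch, not copied from Yosys.
int ceil_log2_sketch(int x)
{
	int y = 0;
	while (y < 31 && (1 << y) < x)
		y++;
	return y;
}

// Bits needed to hold the magnitude |v| as an unsigned number.
int magnitude_bits(int v)
{
	return ceil_log2_sketch(std::abs(v) + 1);
}

// Same calculation without the inner "+ 1", for comparison.
int magnitude_bits_no_plus_one(int v)
{
	return ceil_log2_sketch(std::abs(v));
}
```

With these definitions, `magnitude_bits(4)` is 3 while the variant without the inner `+ 1` gives 2, which is too few bits for binary 100. The two results differ only when the magnitude is a power of two. Whether the extra sign and overflow bits in the surrounding expression already cover that case is a separate question.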

daglem · 2024-02-27T15:38:52Z

frontends/ast/simplify.cc

+					new AstNode(AST_TO_SIGNED,
+						new AstNode(AST_CAST_SIZE, node_int(raw_idx_nbits), shift_expr)
+					),
+				node_int(wire_offset));


Can you avoid the AST_SUB here if wire_offset == 0?
This ends up in RTLIL even when it's not needed.

daglem · 2024-02-29

@povik I tested only doing the offset calculations if id2ast->range_swapped || wire_offset, and that makes the test of unsigned offsets in partsel_test008 in tests/simple/partsel.v fail 😱
I now suspect that the implementation of $shiftx doesn't handle unsigned shifts correctly - would this be in techlibs/common/techmap.v?
I'm all for improving this code, which I translated more or less directly from genrtlil.cc, however I'd like it to be completely understandable - having to sign an unsigned shift amount doesn't feel quite right 😅

daglem · 2024-03-01

@povik The plot thickens. Skipping the offset calculations as mentioned above, the following code in tests/simple/shiftx.v passes make -f ../tools/autotest.mk shiftx.v. However, if you change the constant from 31'sd0 to 32'sd0, it fails!

module shiftx_test (
	input [1:0] din,
	input [1:0] uoffset,
	output [1:0] dout
);
	assign dout = din[31'sd0 - uoffset +: 2];
endmodule

daglem · 2024-03-01

Holy cow! It seems like iverilog is actually the culprit here! 🤦‍♂️

Simplifying the test even further and running ../tools/autotest.sh -S 1 shiftx.v yields the same result for shiftx.out/shiftx_out_syn1 (iverilog via Yosys techmap) for both 31'sd0 and 32'sd0, while the results in shiftx.out/shiftx_out_ref (directly processed by iverilog) are incompatible.

module shiftx_test (
	input [1:0] din,
	input uoffset,
	output [1:0] dout
);
	assign dout = din[31'sd0 - uoffset +: 2];
endmodule

../tools/cmp_tbdata shiftx_out_ref-32 shiftx_out_ref-31
Error in testbench output compare (line=10):
-#OUT# 011 x 1x 1000 1
+#OUT# 011 x xx 1000 1

daglem · 2024-03-02

Here is a (currently not passing) test I cooked up for iverilog, which incidentally only one of the commercial simulators on EDA Playground handles correctly 🙈
That simulator also warns that it is truncating MSBs if uoffset is made wider than 32 bits, which seems reasonable.

/*
 * partsel_outside
 * Check that an unsigned integer offset in an indexed part-select is not converted to a signed integer.
 * This would yield incorrect out of bound results for part-selects like arr[idx*8 - uoffset +: 4].
 */
module main;
	reg [1:0] arr = 1;
	int unsigned uoffset = '1;
	wire [1:0] outside = arr[uoffset +: 2];

	initial begin
		#1 if (outside !== 'x) begin
			$display("FAILED -- out of bounds value %b != xx", outside);
			$finish;
		end
		$display("PASSED");
		$finish;
	end
endmodule // main

povik · 2024-03-06

> Can you avoid the AST_SUB here if wire_offset == 0?
> This ends up in RTLIL even when it's not needed.

We can, but I am convinced that, to improve code quality in the frontend, we shouldn't do any specialized optimizations that can be delegated to general machinery: either to AST transformations (an AST_SUB of zero can be removed, if the sizing effect is kept) or to netlist passes (opt_expr should be able to remove the no-op subtraction later on).

povik · 2024-03-06

> I'm all for improving this code, which I translated more or less directly from genrtlil.cc, however I'd like it to be completely understandable - having to sign an unsigned shift amount doesn't feel quite right 😅

The rationale was the following: We want the result of the subtraction to be signed, since that's the general case, and for that to happen per the Verilog rules, we need both operands signed. Since the signed cast is after resizing, there shouldn't be any hazard of wraparound. Later optimizations can optimize away the sign bit and any extra higher bits, if they are superfluous, so this shouldn't affect QoR negatively.
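The resize-then-sign ordering described above can be sketched with plain integer arithmetic. This is a simplified model under assumed names (`as_signed` and `decode_index` are hypothetical helpers, not Yosys functions), with bit widths limited to 32 to keep the masking simple.

```cpp
#include <cassert>
#include <cstdint>

// Interpret the low `nbits` bits of `v` as a two's complement signed value.
int64_t as_signed(uint64_t v, int nbits)
{
	assert(nbits >= 1 && nbits <= 32);
	uint64_t mask = (1ull << nbits) - 1;
	v &= mask;
	int64_t value = (int64_t)v;
	if (v >> (nbits - 1)) // sign bit set
		value -= (int64_t)1 << nbits;
	return value;
}

// Subtract `offset` from an unsigned `idx_nbits`-bit index after
// zero-extending it to `raw_nbits` bits and reinterpreting it as signed.
int64_t decode_index(uint64_t idx, int idx_nbits, int raw_nbits, int64_t offset)
{
	assert(idx_nbits >= 1 && idx_nbits <= raw_nbits);
	uint64_t idx_mask = (1ull << idx_nbits) - 1;
	uint64_t widened = idx & idx_mask;           // resize first (zero extension)
	int64_t lhs = as_signed(widened, raw_nbits); // signed cast after resizing
	uint64_t diff = (uint64_t)(lhs - offset);    // truncated to raw_nbits below
	return as_signed(diff, raw_nbits);
}
```

For a 2-bit unsigned index holding 3 and an offset of 1, casting to signed at the original width would read the index as -1 (`as_signed(3, 2)`), whereas `decode_index(3, 2, 4, 1)` yields 2. The 4-bit width here follows the formula quoted earlier in the thread: 1 + max(2 + 1, ceil_log2(2 + 1) + 1) for an offset of magnitude 2.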

daglem · 2024-03-06

Duly noted, but there is a limit to what spinoff problems I can work on 😅
Currently at least the AST transformations only optimize AST_ADD/AST_SUB if both operands are constant. In the case of removing addition or subtraction of 0, one would possibly want to optimize beforehand anyway, since it is not necessary to set aside an extra bit for overflow in this case. One could conceivably introduce new AST nodes for addition and subtraction with self-determined widths guaranteed not to overflow, but I digress.

In any case, testing the simplest possible AST here revealed a bug in iverilog, as far as I can tell. I believe the same kind of bug is actually present in Yosys as well. If I understand correctly, if the index expressions here are >= 32 bits, the additions and subtractions may cause the expressions to wrap around, causing actual vector bits instead of x bits to be returned:

yosys/frontends/verilog/verilog_parser.y

Lines 955 to 965 in e9cd6ca

	'[' expr TOK_POS_INDEXED expr ']' {
		$$ = new AstNode(AST_RANGE);
		AstNode *expr = new AstNode(AST_SELFSZ, $2);
		$$->children.push_back(new AstNode(AST_SUB, new AstNode(AST_ADD, expr->clone(), $4), AstNode::mkconst_int(1, true)));
		$$->children.push_back(new AstNode(AST_ADD, expr, AstNode::mkconst_int(0, true)));
	} |
	'[' expr TOK_NEG_INDEXED expr ']' {
		$$ = new AstNode(AST_RANGE);
		AstNode *expr = new AstNode(AST_SELFSZ, $2);
		$$->children.push_back(new AstNode(AST_ADD, expr, AstNode::mkconst_int(0, true)));
		$$->children.push_back(new AstNode(AST_SUB, new AstNode(AST_ADD, expr->clone(), AstNode::mkconst_int(1, true)), $4));

This may not be a big deal for Yosys, however it's certainly not ideal for a simulator to return anything but x bits for out of bounds addresses.
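The wraparound concern can be made concrete with a small model. This is only a sketch of the arithmetic, assuming plain 32-bit unsigned evaluation of the bounds of arr[base +: width]; it is not how Yosys or iverilog evaluate the expression, and all names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

struct PartSelect {
	int64_t msb;
	int64_t lsb;
};

// Bounds of arr[base +: width] computed in 32-bit unsigned arithmetic,
// which wraps modulo 2^32.
PartSelect bounds_32bit(uint32_t base, uint32_t width)
{
	uint32_t msb = base + width - 1u; // may wrap around to a small value
	uint32_t lsb = base + 0u;
	return { (int64_t)msb, (int64_t)lsb };
}

// Same bounds computed with enough headroom that nothing wraps.
PartSelect bounds_wide(uint32_t base, uint32_t width)
{
	int64_t b = base;
	return { b + (int64_t)width - 1, b };
}

// True if bit position `pos` lies inside a vector declared [size-1:0].
bool in_range(int64_t pos, int64_t size)
{
	return pos >= 0 && pos < size;
}
```

With base = 0xFFFFFFFF and width = 2, as in the arr[uoffset +: 2] test above, the 32-bit upper bound wraps to 0 and appears to land on a real bit of a [1:0] vector, while the wide computation places both bounds far outside the vector.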

povik · 2024-03-06

> In the case of removing addition or subtraction of 0, one would possibly want to optimize beforehand anyway, since it is not necessary to set aside an extra bit for overflow in this case.

I operate under the assumption that if this bit isn't necessary, it will get optimized away at the netlist stage anyway. If not, I fully agree we need to handle it.

> In any case, testing the simplest possible AST here revealed a bug in iverilog, as far as I can tell.

Feel free to opt for any simple resolution of this PR, since the non-nordshift handling isn't the focus of it. If iverilog doesn't handle something correctly, we can disable that part of the test until it does. The fixed requirement is that the CI passes and that the tests are in a reasonable state; working around shortcomings of external tools should be fine.

daglem · 2024-03-06

OK. I've just made some minor simplifications/corrections to your code, keeping the expression widening and addition/subtraction even when it's not strictly needed. This just so happens to pass all iverilog tests both before and after the iverilog PR.

Only the extra widening seems to be left after optimization (not running synthesis), but I guess that doesn't matter much - the widths are already excessive to start with in many cases, since a minimum of 32 bits are used for any offset calculation. In any case excessive widening seems to be pruned after synthesis 👍

daglem · 2024-03-04T07:50:25Z

I converted this to draft while I attempt to make sense of things in iverilog.

daglem · 2024-03-05T20:38:27Z

I've made a pull request for iverilog - we'll see how that pans out: steveicarus/iverilog#1106

With those changes in place, the default AST_SHIFTX code can be simplified even further. I'll hold off committing until the iverilog PR is (hopefully) merged.

I'll look into whether the AST_CASE code can also be simplified now - I suspect the iverilog PR would have saved me from a lot of struggles there 🙈

daglem · 2024-03-07T08:42:21Z

@povik I did a final force push to correct my latest correction 😅

Without the latest commit, one could just as well have calculated raw_idx_nbits as 1 + std::max(idx_signed_nbits, 32), since node_int creates a 32 bit wide constant.

Hopefully the AST_SHIFTX code now does exactly what you wanted.

povik · 2024-03-07T10:34:01Z

> Without the latest commit, one could just as well have calculated raw_idx_nbits as 1 + std::max(idx_signed_nbits, 32), since node_int creates a 32 bit wide constant.

Yeah, I was aware the constant node is always 32 bits wide, but still put in the code to calculate raw_idx_nbits as the least number of bits for the arithmetic to never overflow. I saw there was no harm in it, and thought we could eventually replace node_int to avoid it overflowing on constants that are more than 32 bits wide.

I will try to get back to this PR to give it a final read-through.

nordshift on a net or variable yields generation of muxes instead of shift circuits for dynamic rvalue indexing, akin to nowrshmsk for lvalue indexing. To facilitate this, the AST transformations for rvalue indexing are moved from genrtlil.cc to simplify.cc, bringing them in line with transformations for lvalue indexing.


daglem requested a review from zachjs as a code owner August 5, 2023 20:26

daglem mentioned this pull request Aug 5, 2023

part-selects ($shift, $shiftx) create huge MUX-trees #3833

Closed

daglem force-pushed the nordshift branch from dbeba71 to f1ca67a Compare August 6, 2023 05:41

povik reviewed Aug 7, 2023

tests/techmap/shiftx2mux.ys (outdated)

daglem force-pushed the nordshift branch 2 times, most recently from e68b0d7 to 5574fa8 Compare August 7, 2023 12:20

daglem mentioned this pull request Aug 8, 2023

Optimization of nowrshmsk #3875

Merged

daglem force-pushed the nordshift branch from 9ab2dd9 to 52b8b88 Compare August 8, 2023 12:24

phsauter mentioned this pull request Aug 8, 2023

peepopt: Add shiftadd pattern #3883

Merged

daglem force-pushed the nordshift branch from e06ab0f to d5cdd86 Compare August 12, 2023 09:37

daglem force-pushed the nordshift branch from d5cdd86 to edf8d4e Compare November 13, 2023 13:09

daglem marked this pull request as draft November 13, 2023 13:12

daglem force-pushed the nordshift branch from edf8d4e to 5542c52 Compare November 27, 2023 14:34

daglem force-pushed the nordshift branch from 5542c52 to 05e36d8 Compare November 27, 2023 15:01

daglem force-pushed the nordshift branch 4 times, most recently from c4f6855 to a4b558f Compare December 9, 2023 20:59

daglem mentioned this pull request Dec 12, 2023

Respect the sign of the right operand of AST_SHIFT and AST_SHIFTX #4065

Merged

daglem force-pushed the nordshift branch from 74af59a to 352a855 Compare December 12, 2023 16:34

daglem force-pushed the nordshift branch from 352a855 to cbbcbb8 Compare January 10, 2024 20:17

daglem marked this pull request as ready for review January 10, 2024 20:18

povik self-requested a review January 15, 2024 15:14

povik self-assigned this Jan 15, 2024

povik reviewed Feb 12, 2024

frontends/ast/simplify.cc

povik force-pushed the nordshift branch from abdf66d to 90fd423 Compare February 23, 2024 14:00

daglem commented Feb 27, 2024

daglem marked this pull request as draft March 4, 2024 07:49

daglem marked this pull request as ready for review March 6, 2024 19:52

daglem force-pushed the nordshift branch from f6bbe47 to e1610ab Compare March 7, 2024 08:11

daglem force-pushed the nordshift branch 3 times, most recently from 39cf48a to e80dbfe Compare March 7, 2024 17:46

daglem mentioned this pull request Mar 10, 2024

Add support for unpacked structs #4180

Draft

daglem force-pushed the nordshift branch from 264993b to 5da55a8 Compare March 15, 2024 17:43

daglem and others added 9 commits September 18, 2024 21:51

Add torture test for (* nordshift *) and (* nowrshmsk *)

8f8a7b9

ast: Improve obviousness when rewriting dynamic indexing

608bc58

ast: Catch up with struct member changes

1597dfc

Make multidimensional packed arrays play nicely with new rvalue indexing

36257c3

Correct self-determined signedness for right operand of AST_SHIFT and AST_SHIFTX

7b2c299

AST_CAST_SIZE on the right operand caused an unsigned operand to be signed. This is corrected by handling the right operand like in AST_POW.

Add function for minimum number of bits required to store integer values

c67421e

Further simplify rewriting of dynamic indexing

5a63653

This also corrects the calculation of bit widths, using the new function min_bit_width.

Replace calls to ceil(log2(x)) with ceil_log2(x)

d6dd61d

daglem force-pushed the nordshift branch from 5da55a8 to d6dd61d Compare September 18, 2024 20:27

phsauter mentioned this pull request Sep 26, 2024

Efficient handling of patterns emitted by sv2v #4615

Open
