gowin: add mux techmap and whitebox #4004

Draft
wants to merge 1 commit into
base: main

adrianparvino

No description provided.

@povik
Member

povik commented Oct 17, 2023

Let's see if I understand the effect here vis-a-vis the gowin synthesis script.

$_MUX4_ and such can reach the techmap -map +/gowin/cells_map.v step if and only if they are in the input netlist. They won't be emitted by the generic techmap call that's earlier in the script since that's something the generic techmap doesn't use, and they won't be mopped up by abc (per default) since that's conditioned on the map_mux4/8/16 flags.

Do we have an idea of what to expect with respect to QoR? Isn't the optimized mux gate mapping here what ABC should come up with on its own anyway if it were smart enough (with -map_mux4 or such passed)? There could be scenarios where it would be better if a piece of the surrounding logic was absorbed into a LUT that's also implementing the mux, but we wouldn't allow for that with the fixed mapping.

@adrianparvino
Author

adrianparvino commented Oct 17, 2023

> Do we have an idea of what to expect with respect to QoR? Isn't the optimized mux gate mapping here what ABC should come up with on its own anyway if it were smart enough (e.g. with -map_mux4 passed)? There could be scenarios where it would be better if a piece of the surrounding logic was absorbed into a LUT that's also implementing the mux, but we wouldn't allow for that with the fixed mapping.

I actually feel slightly pessimistic about this PR specifically. As mentioned, ABC (or ABC9) can generate better results, provided that it is mux-aware. I'm thinking if it's instead better to teach ABC9 that LUT4 + LUT4 + MUX2_LUT5 exists, and is not simply a LUT5. My current thinking is that this PR is not perfect, and in fact pessimizes more often than not, but if you in fact need a MUX8, this is the only way to generate it, at least with an ABC9 flow.

@povik
Member

povik commented Oct 17, 2023

> I'm thinking if it's instead better to teach ABC9 that LUT4 + LUT4 + MUX2_LUT5 exists, and is not simply a LUT5.

Ah, so this uses the primitives in a way ABC doesn't consider, is that right?

> but if you in fact need a MUX8, this is the only way to generate it, at least with an ABC9 flow.

Not sure I understand you here.

@adrianparvino
Author

> I'm thinking if it's instead better to teach ABC9 that LUT4 + LUT4 + MUX2_LUT5 exists, and is not simply a LUT5.

> Ah, so this uses the primitives in a way ABC doesn't consider, is that right?

I haven't tried with ABC, but with ABC9, a MUX4 ends up implemented as a LUT6.

> but if you in fact need a MUX8, this is the only way to generate it, at least with an ABC9 flow.

> Not sure I understand you here.

// 8:1 mux test case (mux8test.v): y follows the bit of x selected by sel.
module _MUX8(input [7:0] x, input [2:0] sel, output reg y);
  always @* begin
    y = x[sel];
  end
endmodule

Take this code, for example. Yosys apparently implements it by building a MUX4 out of 4x LUT1, 1x MUX2_LUT5, and 1x MUX2_LUT6; it then does something with the other inputs and feeds everything into a final LUT4 to produce the result.

From what I understand, signals would propagate through LUT1, MUX2_LUT5, MUX2_LUT6, and LUT4. With this MUX8, they would go through LUT3, MUX2_LUT5, and MUX2_LUT6, where the first LUT also implements a mux, i.e. the whole thing is converted into a mux tree.
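For illustration, the mux tree described above can be sketched behaviourally. This is only a sketch under assumptions: the module and signal names (`mux8_tree`, `l1`, `l2`) are hypothetical, the assignment of select bits to levels is a guess, and no Gowin primitives are instantiated. The comments only indicate which level each primitive would plausibly occupy.

```verilog
// Behavioural sketch of an 8:1 mux as a three-level mux tree.
// Names are hypothetical and not taken from the PR or cells_map.v.
module mux8_tree(input [7:0] x, input [2:0] sel, output y);
  // Level 1: four 2:1 muxes on sel[0]. Each one needs two data inputs
  // plus one select input, so it fits in a single LUT3.
  wire [3:0] l1;
  assign l1[0] = sel[0] ? x[1] : x[0];
  assign l1[1] = sel[0] ? x[3] : x[2];
  assign l1[2] = sel[0] ? x[5] : x[4];
  assign l1[3] = sel[0] ? x[7] : x[6];

  // Level 2: two 2:1 muxes on sel[1] (the role MUX2_LUT5 would play).
  wire [1:0] l2;
  assign l2[0] = sel[1] ? l1[1] : l1[0];
  assign l2[1] = sel[1] ? l1[3] : l1[2];

  // Level 3: one 2:1 mux on sel[2] (the role MUX2_LUT6 would play).
  assign y = sel[2] ? l2[1] : l2[0];
endmodule

// Self-checking testbench: compares the tree against x[sel] exhaustively.
module mux8_tree_tb;
  reg [7:0] x;
  reg [2:0] sel;
  wire y;
  integer i, errors;

  mux8_tree dut(.x(x), .sel(sel), .y(y));

  initial begin
    errors = 0;
    // 256 data values times 8 select values = 2048 combinations.
    for (i = 0; i < 2048; i = i + 1) begin
      {x, sel} = i;  // low 11 bits of i drive the inputs
      #1;
      if (y !== x[sel]) errors = errors + 1;
    end
    $display("mismatches: %0d", errors);
    $finish;
  end
endmodule
```

Running the testbench in any Verilog simulator reports how many of the 2048 input combinations disagree with the plain `x[sel]` description.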

The commands I used to generate both are:
read_verilog mux8test.v; proc; opt -fine -full; techmap t:$shiftx; opt -fine -full; muxcover -mux4 -mux8; synth_gowin -json mux8test.json
and
read_verilog mux8test.v; proc; opt -fine -full; synth_gowin -json mux8test.json

This is probably not scientific, but these are the timings according to nextpnr:

Info: Critical path report for cross-domain path '<async>' -> '<async>':
Info: curr total
Info:  0.0  0.0  Source x_IBUF_I_3$iob.O
Info:  3.3  3.3    Net x_IBUF_I_O[4] (46,11) -> (23,10)
Info:                Sink y_OBUF_O_I_MUX2_LUT6_O_I1_MUX2_LUT5_O_I0_LUT3_F_LC.C
Info:  0.8  4.1  Source y_OBUF_O_I_MUX2_LUT6_O_I1_MUX2_LUT5_O_I0_LUT3_F_LC.F
Info:  0.3  4.4    Net y_OBUF_O_I_MUX2_LUT6_O_I1_MUX2_LUT5_O_I0 (23,10) -> (23,10)
Info:                Sink y_OBUF_O_I_MUX2_LUT6_O_I1_MUX2_LUT5_O_LC.I0
Info:                Defined in:
Info:                  /nix/store/i9awr6gc0zw8bspbj4ymrw2ifv4fxab5-yosys-0.34/bin/../share/yosys/gowin/cells_map.v:197.7-197.9
Info:  0.2  4.6  Source y_OBUF_O_I_MUX2_LUT6_O_I1_MUX2_LUT5_O_LC.OF
Info:  0.3  5.0    Net y_OBUF_O_I_MUX2_LUT6_O_I1 (23,10) -> (23,10)
Info:                Sink y_OBUF_O_I_MUX2_LUT6_O_LC.I1
Info:                Defined in:
Info:                  /nix/store/i9awr6gc0zw8bspbj4ymrw2ifv4fxab5-yosys-0.34/bin/../share/yosys/gowin/cells_map.v:201.7-201.11
Info:  0.4  5.3  Source y_OBUF_O_I_MUX2_LUT6_O_LC.OF
Info:  3.3  8.6    Net y_OBUF_O_I (23,10) -> (0,10)
Info:                Sink y_OBUF_O$iob.I
Info: 1.4 ns logic, 7.2 ns routing

Info: Critical path report for cross-domain path '<async>' -> '<async>':
Info: curr total
Info:  0.0  0.0  Source sel_IBUF_I_O_IBUF_O_3$iob.O
Info:  0.9  0.9    Net sel_IBUF_I_O[3] (22,0) -> (23,1)
Info:                Sink sel_IBUF_I_1_O_MUX2_LUT6_O_I0_MUX2_LUT5_O_I0_LUT1_F_LC.A
Info:                Defined in:
Info:                  /nix/store/i9awr6gc0zw8bspbj4ymrw2ifv4fxab5-yosys-0.34/bin/../share/yosys/gowin/cells_map.v:130.20-130.21
Info:  1.0  1.9  Source sel_IBUF_I_1_O_MUX2_LUT6_O_I0_MUX2_LUT5_O_I0_LUT1_F_LC.F
Info:  0.3  2.2    Net sel_IBUF_I_1_O_MUX2_LUT6_O_I0_MUX2_LUT5_O_I0 (23,1) -> (23,1)
Info:                Sink sel_IBUF_I_1_O_MUX2_LUT6_O_I0_MUX2_LUT5_O_LC.I0
Info:                Defined in:
Info:                  /nix/store/i9awr6gc0zw8bspbj4ymrw2ifv4fxab5-yosys-0.34/bin/../share/yosys/gowin/cells_map.v:158.41-158.66
Info:                  /nix/store/i9awr6gc0zw8bspbj4ymrw2ifv4fxab5-yosys-0.34/bin/../share/yosys/gowin/cells_map.v:151.9-151.11
Info:  0.2  2.4  Source sel_IBUF_I_1_O_MUX2_LUT6_O_I0_MUX2_LUT5_O_LC.OF
Info:  0.3  2.8    Net sel_IBUF_I_1_O_MUX2_LUT6_O_I0 (23,1) -> (23,1)
Info:                Sink sel_IBUF_I_1_O_MUX2_LUT6_O_LC.I0
Info:                Defined in:
Info:                  /nix/store/i9awr6gc0zw8bspbj4ymrw2ifv4fxab5-yosys-0.34/bin/../share/yosys/gowin/cells_map.v:157.9-157.11
Info:  0.4  3.1  Source sel_IBUF_I_1_O_MUX2_LUT6_O_LC.OF
Info:  2.3  5.4    Net sel_IBUF_I_1_O[2] (23,1) -> (23,10)
Info:                Sink y_OBUF_O_I_LUT4_F_LC.C
Info:                Defined in:
Info:                  /nix/store/i9awr6gc0zw8bspbj4ymrw2ifv4fxab5-yosys-0.34/bin/../share/yosys/gowin/cells_map.v:130.20-130.21
Info:  0.8  6.3  Source y_OBUF_O_I_LUT4_F_LC.F
Info:  1.8  8.0    Net y_OBUF_O_I (23,10) -> (23,28)
Info:                Sink y_OBUF_O$iob.I
Info: 2.4 ns logic, 5.6 ns routing

As you can see, however, while the MUX8 is better at being a mux (1.4 ns of logic delay versus 2.4 ns in the reports above), it might still pessimize larger designs, since that LUT3 cannot absorb additional logic and only serves as a MUX2.
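To make the last point concrete, here is a minimal sketch of why a LUT3 used as a MUX2 has no room left for other logic. It assumes a generic 3-input LUT model in which the inputs index an 8-bit table with `i2` as the most significant index bit. `lut3_model` and `lut3_as_mux2` are hypothetical names, and this is not the Gowin simulation model.

```verilog
// Generic 3-input LUT model: {i2, i1, i0} indexes an 8-bit truth table.
// Simplified stand-in for illustration only.
module lut3_model #(parameter [7:0] INIT = 8'h00)
                   (input i0, input i1, input i2, output f);
  wire [7:0] table_bits = INIT;
  assign f = table_bits[{i2, i1, i0}];
endmodule

// A 2:1 mux built from the LUT model above.
module lut3_as_mux2(input a, input b, input s, output y);
  // INIT = 8'hCA encodes f = i2 ? i1 : i0 under the indexing above.
  // All three LUT inputs are consumed (two data, one select), so no
  // input is left over for absorbing neighbouring logic.
  lut3_model #(.INIT(8'hCA)) u(.i0(a), .i1(b), .i2(s), .f(y));
endmodule

// Self-checking testbench over all 8 input combinations.
module lut3_as_mux2_tb;
  reg a, b, s;
  wire y;
  integer i, errors;

  lut3_as_mux2 dut(.a(a), .b(b), .s(s), .y(y));

  initial begin
    errors = 0;
    for (i = 0; i < 8; i = i + 1) begin
      {s, b, a} = i;  // low 3 bits of i drive the inputs
      #1;
      if (y !== (s ? b : a)) errors = errors + 1;
    end
    $display("mismatches: %0d", errors);
    $finish;
  end
endmodule
```

By contrast, a mux implemented inside a wider LUT by ABC9 could share its spare inputs with surrounding logic, which is the trade-off discussed in this thread.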
