docs: Document $macc #4316

Merged
merged 3 commits into from
Apr 8, 2024
Changes from 1 commit
48 changes: 47 additions & 1 deletion docs/source/yosys_internals/formats/cell_library.rst
@@ -619,6 +619,52 @@ Finite state machines

Add a brief description of the ``$fsm`` cell type.

Coarse arithmetics
~~~~~~~~~~~~~~~~~~~~~

The ``$macc`` cell type represents a multiply and accumulate block, for summing any number of negated and unnegated signals and arithmetic products of pairs of signals. Cell port A concatenates pairs of signals to be multiplied together. When the second signal in a pair is zero length, a constant 1 is used instead as the second factor. Cell port B concatenates 1-bit-wide signals to also be summed, such as "carry in" in adders.

The cell's ``CONFIG`` parameter determines the layout of cell port ``A``.
The term "port" is used with two different meanings for this cell. To disambiguate:
Member

Let's call the "multiplier ports" factors or some other term. Overloading the term "port", especially in a document explaining the cell library, will make it hard on the reader.

Member

Ah "multiplier port" seems to be a "factor pair" rather than a factor

Collaborator Author

> Overloading the term "port", especially in a document explaining the cell library, will make it hard on the reader.

I agree, but the problem is, the term is already overloaded in the internal naming of variables and fields. I was following that. I can refactor as followup or as part of this PR

Member

> I can refactor as followup or as part of this PR

If you mean updating the code for a different naming of the internal variables, I would say there's no need. We can afford some naming mismatch of this kind. Especially if we may supersede $macc with $macc_v2 soon, which means we will touch all of that code anyway.

Collaborator Author

I have removed the multiplier port concept from .rst and left it in simlibs since the "port" variables are actually used there

A cell port is, for example, the A input (it is constructed in C++ as ``cell->setPort(ID::A, ...)``).
Multiplier ports are pairs of multiplier inputs ("factors").
If the second signal in such a pair is zero length, no multiplication is necessary, and the first signal is just added to the sum.

In this pseudocode, ``u(foo)`` means an unsigned int that's foo bits long.
The CONFIG parameter carries the following information:

.. code-block::
    :force:

    struct CONFIG {
        u4 num_bits;
        struct port_field {
            bool is_signed;
            bool is_subtract;
            u(num_bits) factor1_len;
            u(num_bits) factor2_len;
        }[num_ports];
    };
Member

@whitequark whitequark Apr 3, 2024

@widlarizer Thank you for documenting this!

@ everyone-else The semantics for this is completely insane, right? I'm not alone in wanting to see someone widlarize it? Why does it have dependently typed bitfields? The biggest factor[12]_len it can express is u16. Why not just use u16 at all times, so that you do not need dependently typed bitfields?

Collaborator Author

I consider it an extreme case of data oriented design, probably to the extent of premature optimization. The complexity makes it probably liable for bugs if, say, somebody wrote HLS using Yosys that tries to generate $macc outside of yosys itself. Special casing single bit arguments, special casing zero length second multiplication arguments instead of using constants - I'm interested what motivated this

Member

> The biggest factor[12]_len it can express is u16. Why not just use u16 at all times, so that you do not need dependently typed bitfields?

It's a bit worse, the maximum is u15. We could restrict the definition of $macc going forward, we could say that they are only valid with 15 for num_bits.

It's more complex than it needs to be, but I am not convinced that alone justifies changing it right now. If we had other changes we would want to make in the cell's definition, then sure, let's go for $macc_v2 that's as simple as we can make it.

Member

I change my position. We can make this much simpler. We can express all of this through a sequence of products on signals, where those signals can be constant as need be. We can afford to make all of the operands the same width, though that may be going too far. I am in favor of defining macc_v2.

Collaborator Author

Are we externally bound to keep macc unchanged? Replacing all uses of current macc with simpler macc seems easy

Member

> Are we externally bound to keep macc unchanged?

I can't imagine anyone outside of YosysHQ is using this mess...

Member

I put "defining $macc_v2" as an agenda item for the next dev JF.
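
To make the layout under discussion concrete, here is a minimal C++ sketch that unpacks a ``CONFIG`` bit vector into the fields named in the pseudocode. It is not Yosys code: `PortField`, `MaccConfig`, `read_bits` and `decode_config` are hypothetical names, and the sketch assumes that the fields are packed starting at bit 0 in the order the pseudocode lists them. The only parts taken directly from the diff are `num_bits = CONFIG[3:0]` (with 0 treated as 1) and `num_ports = (CONFIG_WIDTH-4) / (2 + 2*num_bits)`.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical types for illustration only; they are not part of Yosys.
struct PortField {
    bool is_signed;
    bool is_subtract;
    std::uint32_t factor1_len;
    std::uint32_t factor2_len;
};

struct MaccConfig {
    std::uint32_t num_bits;
    std::vector<PortField> ports;
};

// Read `width` bits starting at index `pos`; index 0 is the least significant bit.
static std::uint32_t read_bits(const std::vector<bool>& bits, std::size_t pos, std::size_t width) {
    std::uint32_t value = 0;
    for (std::size_t i = 0; i < width; ++i)
        if (bits.at(pos + i))
            value |= (std::uint32_t{1} << i);
    return value;
}

// Assumption: fields are packed from bit 0 upwards in pseudocode order.
MaccConfig decode_config(const std::vector<bool>& config) {
    if (config.size() < 4)
        throw std::invalid_argument("CONFIG needs at least 4 bits");

    MaccConfig out;
    out.num_bits = read_bits(config, 0, 4);
    if (out.num_bits == 0)
        out.num_bits = 1;  // mirrors the localparam in simlib.v

    // Each port field is two flag bits plus two length fields.
    const std::size_t field_width = 2 + 2 * static_cast<std::size_t>(out.num_bits);
    const std::size_t num_ports = (config.size() - 4) / field_width;

    for (std::size_t p = 0; p < num_ports; ++p) {
        const std::size_t base = 4 + p * field_width;
        PortField f;
        f.is_signed = config.at(base);
        f.is_subtract = config.at(base + 1);
        f.factor1_len = read_bits(config, base + 2, out.num_bits);
        f.factor2_len = read_bits(config, base + 2 + out.num_bits, out.num_bits);
        out.ports.push_back(f);
    }
    return out;
}
```

The width of every length field depends on the first four bits, which is the "dependently typed bitfield" the thread objects to: nothing after bit 3 can be located until `num_bits` has been read.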


The A cell port carries the following information:

.. code-block::
    :force:

    struct A {
        u(CONFIG.port_field[0].factor1_len) port0factor1;
        u(CONFIG.port_field[0].factor2_len) port0factor2;
        u(CONFIG.port_field[1].factor1_len) port1factor1;
        u(CONFIG.port_field[1].factor2_len) port1factor2;
        ...
    };

No factor1 may have a zero length.
A factor2 having a zero length implies factor2 is replaced with a constant 1.
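
The two rules above can be sketched in C++ as a routine that walks cell port A and slices out each factor pair. This is an illustration rather than the Yosys implementation: `FactorPair` and `slice_a` are hypothetical names, the factors are assumed to be packed from bit 0 upwards in the order of the pseudocode, values are treated as unsigned, and each factor is limited to 64 bits to keep the example short.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical type for illustration only; it is not part of Yosys.
struct FactorPair {
    std::uint64_t factor1;
    std::uint64_t factor2;
};

// `lens` holds (factor1_len, factor2_len) for each multiplier port, as decoded from CONFIG.
// Assumption: port0factor1 occupies the lowest bits of A, followed by port0factor2, and so on.
std::vector<FactorPair> slice_a(const std::vector<bool>& a,
                                const std::vector<std::pair<std::size_t, std::size_t>>& lens) {
    std::vector<FactorPair> out;
    std::size_t cursor = 0;

    // Consume `len` bits at the cursor and return them as an unsigned value.
    auto take = [&](std::size_t len) {
        if (len > 64)
            throw std::invalid_argument("this sketch handles factors of at most 64 bits");
        std::uint64_t value = 0;
        for (std::size_t i = 0; i < len; ++i)
            if (a.at(cursor + i))
                value |= (std::uint64_t{1} << i);
        cursor += len;
        return value;
    };

    for (const auto& len : lens) {
        if (len.first == 0)
            throw std::invalid_argument("factor1 may not have zero length");
        const std::uint64_t f1 = take(len.first);
        // A zero-length factor2 is replaced with the constant 1.
        const std::uint64_t f2 = (len.second == 0) ? 1 : take(len.second);
        out.push_back({f1, f2});
    }
    return out;
}
```

For example, with lengths (3, 2) and (2, 0), a 7-bit A is split into a 3-bit and a 2-bit factor for port 0 and a single 2-bit factor for port 1, whose second factor becomes 1.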

Additionally, B is an array of 1-bit-wide unsigned integers to also be summed up.
Finally, we have:

.. code-block::
    :force:

    Y = port0factor1 * port0factor2 + port1factor1 * port1factor2 + ...
        + B[0] + B[1] + ...
Member

Yes! A formula

Collaborator Author

I fixed it, even if +- is a confusing operator
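
As a rough illustration of the formula, the following C++ sketch evaluates the sum for already-extracted factors. It rests on several assumptions that the diff does not spell out: `Term` and `eval_macc` are hypothetical names, `is_subtract` is taken to mean that the product is subtracted instead of added, all values are treated as unsigned, and the result is truncated to `y_width` bits (at most 64 here).

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical type for illustration only; it is not part of Yosys.
struct Term {
    std::uint64_t factor1;
    std::uint64_t factor2;  // use 1 when the second factor has zero length
    bool is_subtract;       // assumption: subtract the product instead of adding it
};

// Sum of products plus the single-bit B inputs, truncated to y_width bits.
std::uint64_t eval_macc(const std::vector<Term>& terms,
                        const std::vector<bool>& b,
                        unsigned y_width) {
    if (y_width == 0 || y_width > 64)
        throw std::invalid_argument("this sketch supports 1 to 64 output bits");

    std::uint64_t y = 0;  // unsigned arithmetic wraps modulo 2^64
    for (const Term& t : terms) {
        const std::uint64_t product = t.factor1 * t.factor2;
        y = t.is_subtract ? y - product : y + product;
    }
    for (bool bit : b)
        y += bit ? 1 : 0;  // each B bit contributes 0 or 1

    const std::uint64_t mask =
        (y_width == 64) ? ~std::uint64_t{0} : ((std::uint64_t{1} << y_width) - 1);
    return y & mask;
}
```

With terms 5 * 2 and 3 * 1 and B bits 1, 0, 1, this yields 10 + 3 + 2 = 15. Signed operands would additionally need sign extension before multiplying, which the sketch leaves out.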


Specify rules
~~~~~~~~~~~~~

@@ -1152,4 +1198,4 @@ file via ABC using the abc pass.

.. todo:: Add information about ``$lut`` and ``$sop`` cells.

-.. todo:: Add information about ``$alu``, ``$macc``, ``$fa``, and ``$lcu`` cells.
.. todo:: Add information about ``$alu``, ``$fa``, and ``$lcu`` cells.
51 changes: 47 additions & 4 deletions techlibs/common/simlib.v
@@ -902,18 +902,29 @@ endgenerate
endmodule

// --------------------------------------------------------

// |---v---|---v---|---v---|---v---|---v---|---v---|---v---|---v---|---v---|---v---|
//-
//- $macc (A, B, Y)
//-
//- Multiply and accumulate.
//- A building block for summing any number of negated and unnegated signals and arithmetic products of pairs of signals. Cell port A concatenates pairs of signals to be multiplied together. When the second signal in a pair is zero length, a constant 1 is used instead as the second factor. Cell port B concatenates 1-bit-wide signals to also be summed, such as "carry in" in adders.
//- Typically created by the `alumacc` pass, which transforms $add and $mul into $macc cells.
module \$macc (A, B, Y);

parameter A_WIDTH = 0;
parameter B_WIDTH = 0;
parameter Y_WIDTH = 0;
// CONFIG determines the layout of A, as explained below
parameter CONFIG = 4'b0000;
parameter CONFIG_WIDTH = 4;

-input [A_WIDTH-1:0] A;
-input [B_WIDTH-1:0] B;
-output reg [Y_WIDTH-1:0] Y;
// The term "port" is used with two different meanings for this cell. To disambiguate:
// A cell port is for example the A input (it is constructed in C++ as cell->setPort(ID::A, ...))
// Multiplier ports are pairs of multiplier inputs ("factors").
// If the second signal in such a pair is zero length, no multiplication is necessary, and the first signal is just added to the sum.
input [A_WIDTH-1:0] A; // Cell port A is the concatenation of all arithmetic ports
input [B_WIDTH-1:0] B; // Cell port B is the concatenation of single-bit unsigned signals to be also added to the sum
output reg [Y_WIDTH-1:0] Y; // Output sum

// Xilinx XSIM does not like $clog2() below..
function integer my_clog2;
@@ -929,10 +940,42 @@ function integer my_clog2;
end
endfunction

// Number of bits in each factor length field of CONFIG (one length field per factor in cell port A)
localparam integer num_bits = CONFIG[3:0] > 0 ? CONFIG[3:0] : 1;
// Number of multiplier ports
localparam integer num_ports = (CONFIG_WIDTH-4) / (2 + 2*num_bits);
// Minimum bit width of an induction variable to iterate over all bits of cell port A
localparam integer num_abits = my_clog2(A_WIDTH) > 0 ? my_clog2(A_WIDTH) : 1;

// In this pseudocode, u(foo) means an unsigned int that's foo bits long.
// The CONFIG parameter carries the following information:
//    struct CONFIG {
//        u4 num_bits;
//        struct port_field {
//            bool is_signed;
//            bool is_subtract;
//            u(num_bits) factor1_len;
//            u(num_bits) factor2_len;
//        }[num_ports];
//    };

// The A cell port carries the following information:
//    struct A {
//        u(CONFIG.port_field[0].factor1_len) port0factor1;
//        u(CONFIG.port_field[0].factor2_len) port0factor2;
//        u(CONFIG.port_field[1].factor1_len) port1factor1;
//        u(CONFIG.port_field[1].factor2_len) port1factor2;
//        ...
//    };
// and num_abits is my_clog2(A_WIDTH), or 1 if that would be 0.
// No factor1 may have a zero length.
// A factor2 having a zero length implies factor2 is replaced with a constant 1.

// Additionally, B is an array of 1-bit-wide unsigned integers to also be summed up.
// Finally, we have:
//    Y = port0factor1 * port0factor2 + port1factor1 * port1factor2 + ...
//        + B[0] + B[1] + ...

function [2*num_ports*num_abits-1:0] get_port_offsets;
input [CONFIG_WIDTH-1:0] cfg;
integer i, cursor;