Matrix Multiplication - Single Core Design

In this design, a single AI Engine compute core performs a matrix-matrix multiplication. By default, the input matrices use the int16 data type, the output matrix uses int32, and the dimensions are M×K×N = 256×256×256. The kernel operates on chunks of 64×32×64 (m×k×n), so it is invoked multiple times to complete the full result.
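To make the chunking concrete, here is a minimal NumPy sketch of a tiled multiplication with the default sizes. It is a host-side illustration only, not the AI Engine kernel: the function name `tiled_matmul`, the loop order, and the assumption that each kernel invocation handles exactly one m×k×n chunk are all choices made for this example.

```python
import numpy as np

# Default problem and chunk sizes quoted in the text above.
M, K, N = 256, 256, 256
m, k, n = 64, 32, 64


def tiled_matmul(A, B):
    """Multiply A (M x K) by B (K x N) one m x k x n chunk at a time.

    Returns the int32 result and the number of chunk-level products,
    which stands in for the number of kernel invocations (an assumption).
    """
    C = np.zeros((A.shape[0], B.shape[1]), dtype=np.int32)
    calls = 0
    for i in range(0, A.shape[0], m):          # rows of C
        for j in range(0, B.shape[1], n):      # columns of C
            for p in range(0, A.shape[1], k):  # reduction dimension
                # Widen int16 inputs so products accumulate in int32.
                a = A[i:i + m, p:p + k].astype(np.int32)
                b = B[p:p + k, j:j + n].astype(np.int32)
                C[i:i + m, j:j + n] += a @ b
                calls += 1
    return C, calls


rng = np.random.default_rng(0)
A = rng.integers(-8, 8, size=(M, K), dtype=np.int16)
B = rng.integers(-8, 8, size=(K, N), dtype=np.int16)
C, calls = tiled_matmul(A, B)
print(calls)  # (256/64) * (256/64) * (256/32) = 128
```

Under these assumptions the default problem decomposes into 4 × 4 × 8 = 128 chunk-level products, each accumulating into a 64×64 tile of the output.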

This design is a simplification of the whole-array design. Instead of utilizing all available AI Engine compute cores in parallel, this design performs all computation on a single core. To understand this design better, please refer to the discussion of the whole-array design and the differences outlined below.

Differences from the Whole-Array Design

This design supports tracing; see below.
Only a single core performs computations. As such, we only need a single ObjectFIFO for each of the transfers between the levels (shim → memory, memory → compute, and back). The ObjectFIFOs between the shim and memory levels are named inA, inB and outC; those between the memory and compute levels are named memA, memB and memC.

Notes on the `single_core_alt.py` Implementation

As in the whole-array design, the single_core.py file describes the data movement of the design. This single-core example also comes with an alternative implementation, which can be found in single_core_alt.py. If you set use_alt=1 as an environment variable at compile time, this alternative implementation is used in place of single_core.py.

Functionally, single_core.py and single_core_alt.py are intended to be identical. However, single_core_alt.py is implemented using a new syntax for runtime buffer descriptor configuration on the shim. Specifically, single_core_alt.py uses the aiex.dma_configure_task_for, aiex.dma_start_task and aiex.dma_await_task operations instead of aiex.dma_memcpy_nd.

Notes on the `single_core_iron.py` Implementation

single_core_iron.py implements this design using a higher-level version of IRON. If you set use_iron=1 as an environment variable at compile time, this implementation is used in place of single_core.py.

Functionally, this design is intended to be identical to the other two. However, single_core_iron.py currently does not support tracing.

Building and Running the Design

You need C++23 for bfloat16_t support. It can be found in g++-13: https://lindevs.com/install-g-on-ubuntu

To compile and run the design:

make
make single_core.exe
make run

To compile and run the alternative design:

env use_alt=1 make
env use_alt=1 make single_core.exe
env use_alt=1 make run

To compile and run the higher-level IRON design:

env use_iron=1 make
env use_iron=1 make single_core.exe
env use_iron=1 make run
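The three command sequences above differ only in the environment variable they set. A small Python sketch of a wrapper that captures this pattern follows; the names `build_plan` and `run_plan` and the variant labels `"default"`, `"alt"` and `"iron"` are hypothetical helpers for this example and are not part of the repository.

```python
import os
import subprocess

# Environment variable that selects each variant, as documented above.
# The variant labels on the left are made up for this sketch.
VARIANT_ENV = {"default": None, "alt": "use_alt", "iron": "use_iron"}

# The three make invocations listed above: no target, then two named targets.
MAKE_TARGETS = [None, "single_core.exe", "run"]


def build_plan(variant="default"):
    """Return (extra_env, commands) for one variant without running anything."""
    if variant not in VARIANT_ENV:
        raise ValueError(f"unknown variant: {variant!r}")
    var = VARIANT_ENV[variant]
    extra_env = {var: "1"} if var else {}
    commands = [["make"] + ([target] if target else []) for target in MAKE_TARGETS]
    return extra_env, commands


def run_plan(variant="default", cwd="."):
    """Run the make commands in order, stopping at the first failure."""
    extra_env, commands = build_plan(variant)
    env = {**os.environ, **extra_env}
    for cmd in commands:
        subprocess.run(cmd, cwd=cwd, env=env, check=True)
```

Calling `run_plan("iron")` from the design directory would then be equivalent to the three `env use_iron=1 make ...` commands, assuming `make` and the required toolchain are available on the path.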

Tracing

To get tracing output, set enable_tracing=True in single_core.py and ENABLE_TRACING=true in test.cpp. Tracing is also supported in single_core_alt.py.

By default, traces will be written out to trace.txt; another output file can be specified using the --trace (or -t) flag to the host code.
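The flag behaviour described above (a default of trace.txt, overridable with --trace or -t) can be mirrored in a few lines of Python. This is only an illustration of the documented interface: the real host code is C++ in test.cpp, its option parsing is not shown here, and the names `make_parser` and `trace_path` are invented for this sketch.

```python
import argparse


def make_parser():
    """Build a parser that mimics the documented trace-file option."""
    parser = argparse.ArgumentParser(description="Trace option sketch")
    parser.add_argument(
        "-t", "--trace",
        dest="trace_file",
        default="trace.txt",  # documented default output file
        help="file to write trace output to",
    )
    return parser


def trace_path(argv):
    """Return the trace file name selected by the given argument list."""
    return make_parser().parse_args(argv).trace_file


print(trace_path([]))                    # default file name
print(trace_path(["-t", "my_trace.txt"]))  # explicit override
```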
