Synapse: Systolic CNN Accelerator’s Mapper-Simulator Environment

Systolic arrays are among the most popular compute substrates in deep learning (DL) accelerators today: they run dense matrix multiplications with very high efficiency by reusing operands through local data shifts. One such effort, from the RISE lab at IIT Madras, is ShaktiMAAN, an open-source, systolic-array-based DNN accelerator for inference on edge devices.

The complexity of this accelerator poses a variety of challenges in:

Hardware verification
Bottleneck analysis using performance modelling
Design space trade-offs
Efficient mapping strategy
Compiler optimizations

To tackle these challenges, I built Synapse (SYstolic CNN Accelerator’s MaPper-Simulator Environment), a versatile Python-based mapper-simulator environment. This work, done under the guidance of Prof. Pratyush Kumar, was submitted as my Bachelor's thesis at IIT Madras.

Key Contributions:

A mapper that generates an instruction trace for any given workload and set of knob values on a target architecture.
A functional simulator and cost model for ShaktiMAAN.
A reinforcement learning agent that interacts with the mapper-simulator environment and searches the design space for optimal hardware (array and buffer sizes) and software (network folds, loop order) co-design knobs.

Dependencies

DRAMSim2
SWIG
OpenAI Gym 0.7+
PyTorch 1.11+

Using Synapse

Mapper

Instructions for ShaktiMAAN and the simulator are generated by systolic/mapper.py, which resolves all dependencies between instructions.
As in the cost-model file model.py, instantiate a NetworkMapping object and pass it the DL network topology (topologies/1.csv) and the systolic array configuration (configs/1.cfg) to generate instructions.
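
As a rough illustration of what mapping involves, the sketch below folds a single matrix multiplication onto a fixed-size array and emits a small instruction trace with explicit dependencies. It is not the code in systolic/mapper.py: the Instr class, the map_gemm function, the LOAD/GEMM/STORE opcodes, and the weight-stationary folding rule are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from math import ceil


@dataclass
class Instr:
    """One hypothetical trace entry; not the actual ShaktiMAAN ISA."""
    idx: int
    op: str                                   # "LOAD", "GEMM" or "STORE"
    fold: tuple                               # (row_fold, col_fold) served
    deps: list = field(default_factory=list)  # indices that must finish first


def map_gemm(m, k, n, rows, cols):
    """Tile an (m x k) @ (k x n) product onto a rows x cols array.

    Assumption: the (k x n) weight matrix is held stationary, so k is
    split into ceil(k / rows) row folds and n into ceil(n / cols)
    column folds. m only affects how long each GEMM runs, not the
    number of folds.
    """
    trace = []
    for rf in range(ceil(k / rows)):
        for cf in range(ceil(n / cols)):
            load = Instr(len(trace), "LOAD", (rf, cf))
            trace.append(load)
            # The GEMM for this fold cannot start before its weights load.
            gemm = Instr(len(trace), "GEMM", (rf, cf), deps=[load.idx])
            trace.append(gemm)
            # The STORE waits for the GEMM that produces its data.
            store = Instr(len(trace), "STORE", (rf, cf), deps=[gemm.idx])
            trace.append(store)
    return trace


trace = map_gemm(m=64, k=100, n=40, rows=32, cols=32)
for ins in trace[:3]:
    print(ins)
```

With a 32 x 32 array, k = 100 needs 4 row folds and n = 40 needs 2 column folds, so the trace holds 8 folds of three instructions each. Changing the array size or the loop nesting order changes the trace, which is the kind of knob the rest of the environment explores.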

Simulator

systolic/simulator.py is an event-driven, analytical, dataflow-accurate model of ShaktiMAAN. It consumes the instructions generated by the mapper and advances in an event-driven fashion using timestamps.
It also calculates instruction-wise and global utilization efficiency.
Finally, it verifies that the output it generates matches the expected MatMul output.
As in the cost-model file model.py, instantiate a Simulator object and pass it the systolic array configuration and the generated instructions (from the mapper, or from any other source that follows the ISA) to simulate. It writes layer-wise, instruction-wise, and global statistics to the outputs/ directory.
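
To make the timestamp-based, event-driven idea concrete, here is a minimal sketch that schedules dependent instructions on named hardware units and reports per-unit utilization. It is not systolic/simulator.py: the simulate function, the dictionary instruction format, the unit names "load" and "gemm", and the cycle counts are hypothetical, and the greedy in-order issue policy is a simplifying assumption.

```python
import heapq


def simulate(instrs):
    """Event-driven run of instrs (dicts with 'unit', 'cycles', 'deps').

    Returns (finish_times, total_cycles, busy_cycles_per_unit).
    """
    n = len(instrs)
    pending = [len(ins["deps"]) for ins in instrs]  # unmet dependencies
    children = [[] for _ in range(n)]
    for idx, ins in enumerate(instrs):
        for dep in ins["deps"]:
            children[dep].append(idx)

    ready_at = [0] * n   # time at which all dependencies have finished
    unit_free = {}       # unit name -> time it next becomes idle
    busy = {}            # unit name -> total busy cycles
    finish = [None] * n
    events = []          # min-heap of (finish_timestamp, instruction index)

    def issue(idx):
        ins = instrs[idx]
        start = max(ready_at[idx], unit_free.get(ins["unit"], 0))
        end = start + ins["cycles"]
        unit_free[ins["unit"]] = end
        busy[ins["unit"]] = busy.get(ins["unit"], 0) + ins["cycles"]
        heapq.heappush(events, (end, idx))

    for idx in range(n):
        if pending[idx] == 0:
            issue(idx)

    now = 0
    while events:
        # Jump straight to the next completion instead of ticking cycles.
        now, idx = heapq.heappop(events)
        finish[idx] = now
        for child in children[idx]:
            ready_at[child] = max(ready_at[child], now)
            pending[child] -= 1
            if pending[child] == 0:
                issue(child)
    return finish, now, busy


instrs = [
    {"unit": "load", "cycles": 10, "deps": []},   # load tile 0
    {"unit": "gemm", "cycles": 20, "deps": [0]},  # compute on tile 0
    {"unit": "load", "cycles": 10, "deps": []},   # load tile 1
    {"unit": "gemm", "cycles": 20, "deps": [2]},  # compute on tile 1
]
finish, total, busy = simulate(instrs)
utilization = {unit: cycles / total for unit, cycles in busy.items()}
print(finish, total, utilization)
```

In this toy trace the second load overlaps with the first compute, so the compute unit idles only during the first load; global utilization is then busy cycles divided by the total run time.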

Reinforcement Learning Agent

A single Jupyter notebook, ppo_main.py.ipynb, defines the RL agent, the update rule, and the learning algorithm (PPO), then trains and evaluates the model to find optimal knobs such as buffer size and loop order. It can be ported to run on public clouds such as GCP or AWS, or on Google Colab.
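
For readers unfamiliar with PPO, the sketch below shows its standard clipped surrogate objective in plain Python. It does not reproduce the notebook: the ppo_clip_loss function, its argument names, and the clip range eps=0.2 are assumptions for illustration, and the notebook itself would typically express the same computation with PyTorch tensors.

```python
import math


def ppo_clip_loss(new_logp, old_logp, advantages, eps=0.2):
    """Clipped PPO policy loss (to be minimized) over a batch of actions.

    new_logp / old_logp are log-probabilities of the taken actions under
    the current and the data-collecting policy; advantages estimate how
    much better each action was than expected.
    """
    terms = []
    for nl, ol, adv in zip(new_logp, old_logp, advantages):
        ratio = math.exp(nl - ol)                       # pi_new / pi_old
        clipped = max(1.0 - eps, min(1.0 + eps, ratio))
        # Take the pessimistic value so large policy jumps earn no extra credit.
        terms.append(min(ratio * adv, clipped * adv))
    return -sum(terms) / len(terms)


# Unchanged policy: ratio is 1, so the loss is minus the mean advantage.
same = ppo_clip_loss([-1.0, -2.0], [-1.0, -2.0], [1.0, 3.0])

# Action made twice as likely with positive advantage: gain capped at 1 + eps.
capped = ppo_clip_loss([math.log(2.0)], [0.0], [1.0])
print(same, capped)
```

In a knob-search setting, an action would correspond to a choice such as a buffer size or loop order, and the advantage would be derived from the reward that the mapper-simulator environment returns for that choice.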
