Releases: sophgo/tpu-mlir

v1.13

01 Dec 04:53
add a16 matmul multi_core

Change-Id: I10a9097ee52e324555f4a505ce18d7fe9b665803
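The a16 (presumably 16-bit activation) MatMul change splits the computation across TPU cores. As a rough sketch of the partitioning idea only — not the backend's actual scheduling — a matmul can be split along the output rows, with per-core partial results concatenated:

```python
import numpy as np

def matmul_multicore(a, b, num_cores=4):
    # Split the rows of A across "cores"; each core computes a slice
    # of the output, and the partial results are concatenated.
    # Illustrative sketch only, not the TPU backend's scheduling.
    chunks = np.array_split(a, num_cores, axis=0)
    partials = [c @ b for c in chunks]
    return np.concatenate(partials, axis=0)

a = np.random.rand(8, 6).astype(np.float32)
b = np.random.rand(6, 5).astype(np.float32)
assert np.allclose(matmul_multicore(a, b), a @ b, atol=1e-5)
```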

v1.13-beta.0

22 Nov 13:12
[doc] layergroup opt intro

Change-Id: I0797b73e4d020e9556da29d1c1a743b8c80a83ad

v1.12

05 Nov 07:30

Features

  • Support for backend operators implemented using PPL.
  • TPUv7-runtime CModel integrated with TPU-MLIR for BM1690 model CModel inference.
  • Optimized inference speed for BM1690 Stable Diffusion 3.0 at 512 resolution to 0.72 img/s (MAC utilization: 41.9%).
  • Support for training graph compilation of ResNet50-v1 through FxGraphConverter.

Bug Fixes

  • Performance: Fixed a performance regression in SegNet.
  • Functionality: Resolved the compilation comparison issue for BM1688 DeepLabv3P.

Known Issues

  • Performance: Slight performance degradation observed in BM1690 YOLOv5-6 with 4-batch INT8 on eight cores.

v1.12-beta.0

25 Oct 10:18
combine slice and concat into a new RoPE pattern, ConcatToRope

Change-Id: Ib15b12fe97117b96c6fe7267c96c3f714aac6ec4
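The ConcatToRope rewrite targets the slice-and-concat "rotate half" pattern common in rotary position embedding (RoPE) implementations. A minimal numpy sketch of the pattern being matched (illustrative only; the actual pattern matcher operates on MLIR ops):

```python
import numpy as np

def rotate_half(x):
    # Two Slice ops plus one Concat op: the kind of pattern a
    # ConcatToRope rewrite can fuse into a single operation.
    half = x.shape[-1] // 2
    x1, x2 = x[..., :half], x[..., half:]
    return np.concatenate((-x2, x1), axis=-1)

x = np.array([0., 1., 2., 3.])
assert np.allclose(rotate_half(x), [-2., -3., 0., 1.])
```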

v1.11

27 Sep 05:30
[python] distinguish data path model-zoo from regression

Change-Id: I98fa0df1524f0b38d91cda02ab5d49876f7caee8
(cherry picked from commit fa082d0b29df8a82af77839df86349aabab86949)

v1.11-beta.0

18 Sep 09:02
[soc_dump] add doc

Change-Id: Icaf313113415a9bf0ad9c75abdcb609d661c815b

TPU-MLIR v1.10 Release

15 Aug 05:02

Release Notes

Enhancements:

  • Added CUDA support for various operations like conv2d, MatMul, dwconv, pool2d, and more.
  • Improved performance for operations like MeanStdScale and softmax.
  • Enhanced multi-core batch MatMul and added bm168x support with CUDA.
  • Refined CUDA code style and adjusted interfaces for various operations.

Bug Fixes:

  • Fixed MatMul issues, calibration failures, conv padding problems, and various performance regressions.
  • Addressed bugs in model transformations, calibration, and various pattern issues.
  • Resolved bugs in model backends such as SSD, ViT, DETR, and YOLOv5.

New Features:

  • Added support for new models like resnet50, mobilenet_v2, shufflenet_v2, and yolox_s/alphapose_res50.
  • Introduced new operations like RequantIntAxisOp and Depth2Space with CUDA support.
  • Implemented new functionalities for better model inference and compilation.

Documentation Updates:

  • Updated weight.md, calibration sections, and user interface details.
  • Improved documentation for quick start, developer manual, and various tpulang interfaces.
  • Enhanced documentation for model transformation parameters and tensor data arrangements.

Miscellaneous:

  • Added new npz tools, modelzoo regression, and support for bmodel encryption.
  • Fixed issues with various model performance, shape inference, and CUDA backend optimizations.
  • Restored performance for models such as YOLOv5s-6 and BM1690 Swin (multi-core).
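The npz tools mentioned above compare tensor archives dumped at different stages. A minimal sketch of that idea — load two archives and check each array within a tolerance — using hypothetical paths and keys (this is not tpu-mlir's actual npz tool):

```python
import os
import tempfile
import numpy as np

def compare_npz(ref_path, test_path, rtol=1e-5, atol=1e-5):
    # Illustrative sketch only, not tpu-mlir's implementation:
    # check each tensor in the reference archive against the
    # test archive, collecting any mismatches.
    ref, test = np.load(ref_path), np.load(test_path)
    mismatched = []
    for key in ref.files:
        if key not in test.files:
            mismatched.append((key, "missing"))
        elif not np.allclose(ref[key], test[key], rtol=rtol, atol=atol):
            mismatched.append((key, "values differ"))
    return mismatched

# Usage: two archives whose tensors agree within tolerance.
d = tempfile.mkdtemp()
ref_path, test_path = os.path.join(d, "ref.npz"), os.path.join(d, "test.npz")
x = np.linspace(0, 1, 12, dtype=np.float32).reshape(3, 4)
np.savez(ref_path, output=x)
np.savez(test_path, output=x + 1e-7)
assert compare_npz(ref_path, test_path) == []
```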

TPU-MLIR v1.9 Release

15 Jul 14:40

Release Notes

Enhancements:

  • Implemented output order preservation in converters like ONNX, Caffe, Torch, and TFLite.
  • Added support for resnet50-v2 bm1690 f8 regression.
  • Improved ILP layer-group MLIR file sequences for ResNet50 training.
  • Updated chip libraries and PerfAI for A2 profiling.
  • Added a new dump mode, "COMB", and refined abs/relu conversions.
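One standard abs-to-relu rewrite, shown only to illustrate the kind of conversion being refined (the release does not specify the exact pattern), is abs(x) = relu(x) + relu(-x):

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0)

def abs_via_relu(x):
    # abs(x) == relu(x) + relu(-x): at most one term is nonzero
    # for any x, so the sum reproduces the absolute value.
    return relu(x) + relu(-x)

x = np.array([-2.0, -0.5, 0.0, 3.0])
assert np.allclose(abs_via_relu(x), np.abs(x))
```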

Bug Fixes:

  • Fixed issues with preprocess when source layout differs from target layout.
  • Addressed bugs in various operations like softmax, concat, and weight reorder in conv2d.
  • Resolved bugs in model training, model transformation, and various pattern issues.
  • Fixed bugs related to CUDA inference, matmul with bias, and multi-output calibration.
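Layout-mismatch bugs of the kind fixed above typically involve converting between NHWC and NCHW during preprocessing. A trivial sketch of the transpose involved (illustrative, not the fixed code):

```python
import numpy as np

def nhwc_to_nchw(x):
    # Move the channel axis from last (NHWC) to second (NCHW),
    # as needed when the source layout differs from the target layout.
    return np.transpose(x, (0, 3, 1, 2))

img = np.zeros((1, 224, 224, 3), dtype=np.float32)  # NHWC input
assert nhwc_to_nchw(img).shape == (1, 3, 224, 224)
```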

New Features:

  • Added support for multi-graph in TPULang.
  • Introduced new options in TPULang for inference and model deployment.
  • Implemented various optimizations and enhancements for dynamic operations and model transformations.

Documentation Updates:

  • Refined documentation for quick start quantization and user interface sections.
  • Updated backend information, docker image download methods, and model deployment details in the documentation.

Miscellaneous:

  • Improved performance for models such as ViT and YOLOv5s, and for BM1690 targets.
  • Introduced new functionalities like embedding multi-device slice and groupnorm train operations.
  • Added support for adaptive_avgpool inference and multiple Einsum modes.
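Einsum lowering has to handle many equation "modes". A few common ones, expressed with numpy's einsum (an illustrative sample, not the compiler's supported list):

```python
import numpy as np

a = np.random.rand(2, 3, 4)
b = np.random.rand(2, 4, 5)
m = np.random.rand(3, 4)

batch_mm = np.einsum("bik,bkj->bij", a, b)  # batched matmul
transposed = np.einsum("ij->ji", m)         # transpose
row_sum = np.einsum("ij->i", m)             # row-wise reduction

assert np.allclose(batch_mm, a @ b)
assert np.allclose(transposed, m.T)
assert np.allclose(row_sum, m.sum(axis=1))
```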

TPU-MLIR v1.8.1

12 Jul 09:27

Full Changelog: v1.8...v1.8.1

TPU-MLIR v1.8 Release

29 May 11:15

Highlights:

  • Enhancements:

    • Added support for dynamic shape inference in various operations.
    • Optimized core operations for better performance on specific models.
    • Improved backend support for multiple models like BM1684X, BM1688, BM1690, SG2380, etc.
    • Introduced new operations and patterns for more efficient model processing.
    • Updated documentation for better clarity and user guidance.
  • Bug Fixes:

    • Resolved issues related to input/output handling, kernel configurations, and model-specific bugs.
    • Fixed bugs in dynamic compilation, core parallel processing, and various backend operations.
    • Addressed errors in specific model post-processing steps like YOLOv5, EfficientNet, etc.
  • Performance Improvements:

    • Optimized cycle calculations for multi-core models.
    • Enhanced bandwidth usage statistics for better resource management.
    • Accelerated compilation processes for training models using a new layer-group scheme.
  • New Features:

    • Introduced new operations like attention quant block, prelu op, and various dynamic compile features.
    • Added support for additional operations, weight location, and dynamic compile enhancements.

Documentation Updates:

  • Updated developer manuals, quick start guides, and model-specific documentation for better understanding.

Miscellaneous:

  • Streamlined workflows for faster commit checks and improved debugging processes.
  • Added new test cases for regression testing and script-based model evaluations.
  • Fine-tuned backend operations for improved model performance and accuracy.