v1.12
Features
- Support for backend operators implemented using PPL.
- Integrated the TPUv7-runtime CModel with TPU-MLIR to support CModel inference of BM1690 models.
- Optimized inference speed for BM1690 Stable Diffusion 3.0 at 512 resolution to 0.72 img/s (MAC utilization: 41.9%).
- Support for training graph compilation of ResNet50-v1 through FxGraphConverter.
Bug Fixes
- Performance: Fixed performance degradation in SegNet.
- Functionality: Resolved the compilation comparison issue for BM1688 DeepLabv3P.
Known Issues
- Performance: Slight performance degradation observed in BM1690 YOLOv5-6 with INT8 at batch size 4 on eight cores.