
v0.4.4

Pre-release
@diegofiori released this 20 Oct 12:09
a43b1e9

nebullvm 0.4.4 Release Notes

This release of Nebullvm provides new optimizers and various improvements in code stability.

New Features

  • Update the notebooks to use the new API.
  • Improve test coverage.
  • Add pruning and quantization with Intel Neural Compressor.
  • Model latency is now computed over all the input data rather than only the first sample.
  • Dynamic shape support for OpenVINO has been updated to use the new method available from version 2.
  • The optimized model is now discarded if its results differ from those of the original model (metric_drop_ths=0).
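
The latency and discard rules in the list above can be illustrated with a minimal sketch. It is not nebullvm's implementation: the helper names (average_latency, max_abs_difference, keep_optimized) are hypothetical, models are stood in for by plain Python callables that return a number, and the metric is assumed to be the largest absolute output difference. Only the parameter name metric_drop_ths comes from the release notes.

```python
import time
from statistics import mean


def average_latency(model, samples, warmup=1):
    """Return the mean per-call latency in seconds over every sample.

    Hypothetical helper: times each sample instead of only the first one.
    """
    # Warm-up calls are executed but not timed.
    for x in samples[:warmup]:
        model(x)
    timings = []
    for x in samples:
        start = time.perf_counter()
        model(x)
        timings.append(time.perf_counter() - start)
    return mean(timings)


def max_abs_difference(original, optimized, samples):
    """Largest absolute output difference between the two models (assumed metric)."""
    return max(abs(original(x) - optimized(x)) for x in samples)


def keep_optimized(original, optimized, samples, metric_drop_ths=0.0):
    """Keep the optimized model only if the difference stays within the threshold.

    With metric_drop_ths=0 any deviation from the original discards it.
    """
    return max_abs_difference(original, optimized, samples) <= metric_drop_ths
```

With a threshold of 0, only an optimized model that reproduces the original outputs exactly on every sample would be kept; raising metric_drop_ths trades accuracy for a wider choice of optimizations.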

Bug Fixes

  • Fix an issue in ONNX quantization; it is now much faster than before.
  • Fix a TensorRT bug in static quantization with the ONNX interface.
  • Fixes and improvements to the TorchScript compiler: it now also supports trace and torch.fx for tracing the model.
  • Fix a bug on macOS related to ONNX and int8 quantization.
  • Fix a bug in SparseML that prevented it from working on Colab.
  • Bug fixes in the DeepSparse compiler.
  • Fixes and improvements to the internal ONNX model handling.
  • Fix an issue in the TensorFlow backend.
  • Fixes for Torch and ONNX TensorRT with transformers.
  • Fix a bug in TensorRT static quantization when using a newer version of Polygraphy.
  • Fix a bug with Hugging Face models when passing the tokenizer to the optimize_model function.
  • Fix a bug when running quantization with only a few data samples.

Contributors