Skip to content

Latest commit

 

History

History
75 lines (53 loc) · 2.54 KB

CHANGELOG.md

File metadata and controls

75 lines (53 loc) · 2.54 KB

Changelog

Version 2.1.1 (2022-10-27)

  • BREAKING(pipeline): rewrite speaker diarization pipeline
  • feat(pipeline): add option to optimize for DER variant
  • feat(clustering): add support for NeMo speaker embedding
  • feat(clustering): add FINCH clustering
  • feat(clustering): add min_cluster_size hparams to AgglomerativeClustering
  • feat(hub): add support for private/gated models
  • setup(hub): switch to latest hugginface_hub API
  • fix(pipeline): fix support for missing reference in Resegmentation pipeline
  • fix(clustering) fix corner case where HMM.fit finds too little states

Version 2.0.1 (2022-07-20)

  • BREAKING: complete rewrite
  • feat: much better performance
  • feat: Python-first API
  • feat: pretrained pipelines (and models) on Huggingface model hub
  • feat: multi-GPU training with pytorch-lightning
  • feat: data augmentation with torch-audiomentations
  • feat: Prodigy recipe for model-assisted audio annotation

Version 1.1.2 (2021-01-28)

  • fix: make sure master branch is used to load pretrained models (#599)

Version 1.1 (2020-11-08)

  • last release before complete rewriting

Version 1.0.1 (2018--07-19)

  • fix: fix regression in Precomputed.call (#110, #105)

Version 1.0 (2018-07-03)

  • chore: switch from keras to pytorch (with tensorboard support)
  • improve: faster & better traning (AutoLR, advanced learning rate schedulers, improved batch generators)
  • feat: add tunable speaker diarization pipeline (with its own tutorial)
  • chore: drop support for Python 2 (use Python 3.6 or later)

Version 0.3.1 (2017-07-06)

  • feat: add python 3 support
  • chore: rewrite neural speaker embedding using autograd
  • feat: add new embedding architectures
  • feat: add new embedding losses
  • chore: switch to Keras 2
  • doc: add tutorial for (MFCC) feature extraction
  • doc: add tutorial for (LSTM-based) speech activity detection
  • doc: add tutorial for (LSTM-based) speaker change detection
  • doc: add tutorial for (TristouNet) neural speaker embedding

Version 0.2.1 (2017-03-28)

  • feat: add LSTM-based speech activity detection
  • feat: add LSTM-based speaker change detection
  • improve: refactor LSTM-based speaker embedding
  • feat: add librosa basic support
  • feat: add SMORMS3 optimizer

Version 0.1.4 (2016-09-26)

  • feat: add 'covariance_type' option to BIC segmentation

Version 0.1.3 (2016-09-23)

  • chore: rename sequence generator in preparation of the release of TristouNet reproducible research package.

Version 0.1.2 (2016-09-22)

  • first public version