Demo is on January 29, 2019 (originally January 22) in our ROBUST-MLSP group meeting.
Outline
Motivation/Why?
Functionalities
Contributors
Difference between this package and existing packages
Pythonic library
Lazy evaluation
Demo
Easy to use interface
Feature extraction
Data preprocessing (Add multi-threading to audiopipe.py)
HPC
PyTorch-compatible dataset
Performance compared to librosa
Optimization (MFCC computation, etc.)
Roadmap
Contributing
Motivation/What is pyaudlib?
Pyaudlib is a speech processing library in Python with an emphasis on deep learning.
Popular speech/audio processing libraries have no deep learning support:
librosa
voicebox
...
Generic deep learning libraries have good image processing support, but not for audio:
PyTorch
TensorFlow
...
pyaudlib (name subject to change) provides a collection of utilities for developing speech-related applications using both signal processing and deep learning.
Functionalities
pyaudlib offers the following high-level features:
Speech signal processing utilities with ready-to-use applications
Feature extraction frontend
Speech enhancement
Speech activity detection
Deep learning architectures for speech processing tasks in PyTorch
SNRNN (and its variant) for speech enhancement*
Attention network + CTC objective for speech recognition*
PyTorch-compatible interface (similar to torchvision) for batch processing
Audlib Demo
Dataset class specific to speech tasks
audiopipe open -i path/to/audio.wav read logspec plot
*Under development.
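The combination of a dataset class and a log-spectrogram step can be illustrated with a minimal sketch. This is not pyaudlib's actual API: `log_spectrogram`, `LazyAudioDataset`, and `sine_loader` are hypothetical names, and the frame length, hop, and FFT size are assumed values. The class only implements `__len__` and `__getitem__`, which is the protocol PyTorch expects of a map-style dataset, and it computes features on access rather than up front. Synthetic sine waves stand in for audio files so the sketch runs without any data on disk.

```python
import numpy as np


def log_spectrogram(signal, frame_len=400, hop=160, n_fft=512, floor=1e-10):
    """Log-magnitude spectrogram, shape (n_frames, n_fft // 2 + 1)."""
    signal = np.asarray(signal, dtype=np.float64)
    if signal.ndim != 1 or len(signal) < frame_len:
        raise ValueError("signal must be 1-D and at least one frame long")
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Slice the signal into overlapping, windowed frames.
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    # Zero-pad each frame to n_fft and keep the non-negative frequencies.
    magnitude = np.abs(np.fft.rfft(frames, n=n_fft, axis=1))
    # Floor the magnitude so the logarithm never sees zero.
    return np.log(np.maximum(magnitude, floor))


class LazyAudioDataset:
    """Hypothetical map-style dataset: items are loaded only when indexed."""

    def __init__(self, items, loader, transform=None):
        self.items = list(items)
        self.loader = loader        # callable: item -> 1-D signal
        self.transform = transform  # optional callable applied to the signal

    def __len__(self):
        return len(self.items)

    def __getitem__(self, index):
        signal = self.loader(self.items[index])
        return self.transform(signal) if self.transform else signal


def sine_loader(freq_hz, sr=16000, duration=0.5):
    """Stand-in for reading a file: synthesize a sine wave at freq_hz."""
    t = np.arange(int(sr * duration)) / sr
    return np.sin(2 * np.pi * freq_hz * t)


dataset = LazyAudioDataset([250.0, 1000.0], loader=sine_loader,
                           transform=log_spectrogram)
features = dataset[1]  # computed here, not at construction time
```

Because nothing is computed until an item is indexed, a wrapper such as `torch.utils.data.DataLoader` could in principle batch these items in worker processes; that pairing is an assumption of this sketch and is not shown.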
Difference between pyaudlib and existing libraries
Correctness
Unit testing is done on all signal processing functions
User inputs are checked for correctness
*This is the official example given by librosa.
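The two practices listed under Correctness, checking user input and unit testing, can be sketched as follows. This is written independently of pyaudlib and of the librosa example mentioned above: `frame_signal`, `rejects`, and `test_frame_signal` are hypothetical names, and the specific checks are assumptions about what "checked for correctness" might cover.

```python
import numpy as np


def frame_signal(signal, frame_len, hop):
    """Split a 1-D signal into overlapping frames, rejecting invalid input."""
    signal = np.asarray(signal, dtype=np.float64)
    if signal.ndim != 1:
        raise ValueError(f"expected a 1-D signal, got {signal.ndim} dimensions")
    for name, value in (("frame_len", frame_len), ("hop", hop)):
        if not isinstance(value, int) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    if hop > frame_len:
        raise ValueError("hop larger than frame_len would skip samples")
    if len(signal) < frame_len:
        raise ValueError("signal is shorter than one frame")
    n_frames = 1 + (len(signal) - frame_len) // hop
    return np.stack([signal[i * hop:i * hop + frame_len]
                     for i in range(n_frames)])


def rejects(**kwargs):
    """Return True if frame_signal raises ValueError for these arguments."""
    try:
        frame_signal(np.arange(10), **kwargs)
    except ValueError:
        return True
    return False


def test_frame_signal():
    frames = frame_signal(np.arange(10), frame_len=4, hop=2)
    assert frames.shape == (4, 4)
    assert frames[1].tolist() == [2.0, 3.0, 4.0, 5.0]
    assert rejects(frame_len=0, hop=2)    # non-positive frame length
    assert rejects(frame_len=4, hop=8)    # hop would skip samples
    assert rejects(frame_len=16, hop=2)   # signal shorter than one frame


test_frame_signal()
```

Failing loudly on a bad hop or frame length, instead of silently returning an oddly shaped array, is the design choice this sketch is meant to show.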
Efficiency
We have seen three patterns for pre-processing audio data before feeding them into a NN:
Simplicity
Continuous development (for developers)
*Will be removed in the future.
Roadmap
Top-priority stack (before March):
Mid-priority stack (before April):
Suppression of Slowly-varying components and the Falling edge of the power envelope (SSF)
Other ideas:
Contributing
Current contributors (have pushed to the repo at least once):
Raymond Xia
Mahmoud Alismail
Shangwu Yao
Joining the development team, reporting issues, and requesting features are all welcome!