diff --git a/README.md b/README.md
index d50f27d9e..796200ac9 100644
--- a/README.md
+++ b/README.md
@@ -3,11 +3,10 @@ ([简体中文](./README_zh.md)|English)
 # FunASR: A Fundamental End-to-End Speech Recognition Toolkit
-
-
-
-
-
+
+
+[![PyPI](https://img.shields.io/pypi/v/funasr)](https://pypi.org/project/funasr/)
+
 FunASR hopes to build a bridge between academic research and industrial applications on speech recognition. By supporting the training & finetuning of the industrial-grade speech recognition model, researchers and developers can conduct research and production of speech recognition models more conveniently, and promote the development of speech recognition ecology. ASR for Fun!
@@ -28,6 +27,7 @@
 ## What's new:
+- 2024/01/30:funasr-1.0 has been released ([docs](https://github.com/alibaba-damo-academy/FunASR/discussions/1319))
 - 2024/01/25: Offline File Transcription Service 4.2, Offline File Transcription Service of English 1.3 released,optimized the VAD (Voice Activity Detection) data processing method, significantly reducing peak memory usage, memory leak optimization; Real-time Transcription Service 1.7 released,optimizatized the client-side;([docs](runtime/readme.md))
 - 2024/01/09: The Funasr SDK for Windows version 2.0 has been released, featuring support for The offline file transcription service (CPU) of Mandarin 4.1, The offline file transcription service (CPU) of English 1.2, The real-time transcription service (CPU) of Mandarin 1.6. For more details, please refer to the official documentation or release notes([FunASR-Runtime-Windows](https://www.modelscope.cn/models/damo/funasr-runtime-win-cpu-x64/summary))
 - 2024/01/03: File Transcription Service 4.0 released, Added support for 8k models, optimized timestamp mismatch issues and added sentence-level timestamps, improved the effectiveness of English word FST hotwords, supported automated configuration of thread parameters, and fixed known crash issues as well as memory leak problems, refer to ([docs](runtime/readme.md#file-transcription-service-mandarin-cpu)).
@@ -48,7 +48,19 @@
 ## Installation
-Please ref to [installation docs](https://alibaba-damo-academy.github.io/FunASR/en/installation/installation.html)
+```shell
+pip3 install -U funasr
+```
+Or install from source code
+``` sh
+git clone https://github.com/alibaba/FunASR.git && cd FunASR
+pip3 install -e ./
+```
+Install modelscope for the pretrained models (Optional)
+
+```shell
+pip3 install -U modelscope
+```
 
 ## Model Zoo
 FunASR has open-sourced a large number of pre-trained models on industrial data. You are free to use, copy, modify, and share FunASR models under the [Model License Agreement](./MODEL_LICENSE). Below are some representative models, for more models please refer to the [Model Zoo]().
diff --git a/README_zh.md b/README_zh.md
index 61b725fe6..9d7c8d699 100644
--- a/README_zh.md
+++ b/README_zh.md
@@ -3,11 +3,9 @@ (简体中文|[English](./README.md))
 # FunASR: A Fundamental End-to-End Speech Recognition Toolkit
-
-
-
-
-
+
+[![PyPI](https://img.shields.io/pypi/v/funasr)](https://pypi.org/project/funasr/)
+
 FunASR希望在语音识别的学术研究和工业应用之间架起一座桥梁。通过发布工业级语音识别模型的训练和微调，研究人员和开发人员可以更方便地进行语音识别模型的研究和生产，并推动语音识别生态的发展。让语音识别更有趣！
@@ -31,6 +29,7 @@ FunASR希望在语音识别的学术研究和工业应用之间架起一座桥
 ## 最新动态
+- 2024/01/30：funasr-1.0发布，更新说明[文档](https://github.com/alibaba-damo-academy/FunASR/discussions/1319)
 - 2024/01/25: 中文离线文件转写服务 4.2、英文离线文件转写服务 1.3，优化vad数据处理方式，大幅降低峰值内存占用，内存泄漏优化；中文实时语音听写服务 1.7 发布，客户端优化；详细信息参阅([部署文档](runtime/readme_cn.md))
 - 2024/01/09: funasr社区软件包windows 2.0版本发布，支持软件包中文离线文件转写4.1、英文离线文件转写1.2、中文实时听写服务1.6的最新功能，详细信息参阅([FunASR社区软件包windows版本](https://www.modelscope.cn/models/damo/funasr-runtime-win-cpu-x64/summary))
 - 2024/01/03: 中文离线文件转写服务 4.0 发布，新增支持8k模型、优化时间戳不匹配问题及增加句子级别时间戳、优化英文单词fst热词效果、支持自动化配置线程参数，同时修复已知的crash问题及内存泄漏问题，详细信息参阅([部署文档](runtime/readme_cn.md#中文离线文件转写服务cpu版本))
@@ -49,7 +48,20 @@ FunASR希望在语音识别的学术研究和工业应用之间架起一座桥
 ## 安装教程
-FunASR安装教程请阅读([Installation](https://alibaba-damo-academy.github.io/FunASR/en/installation/installation.html))
+
+```shell
+pip3 install -U funasr
+```
+或者从源代码安装
+``` sh
+git clone https://github.com/alibaba/FunASR.git && cd FunASR
+pip3 install -e ./
+```
+如果需要使用工业预训练模型，安装modelscope(可选)
+
+```shell
+pip3 install -U modelscope
+```
 
 ## 模型仓库
diff --git a/funasr/version.txt b/funasr/version.txt
index 21e8796a0..ee90284c2 100644
--- a/funasr/version.txt
+++ b/funasr/version.txt
@@ -1 +1 @@
-1.0.3
+1.0.4
diff --git a/funasr/quick_start.md b/runtime/quick_start.md
similarity index 83%
rename from funasr/quick_start.md
rename to runtime/quick_start.md
index 44ce5c4a0..4f251c751 100644
--- a/funasr/quick_start.md
+++ b/runtime/quick_start.md
@@ -132,33 +132,3 @@ python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --au
 
 For more examples, please refer to [docs](https://github.com/alibaba-damo-academy/FunASR/blob/main/runtime/docs/SDK_advanced_guide_offline.md)
 
-
-## Industrial Model Egs
-
-If you want to use the pre-trained industrial models in ModelScope for inference or fine-tuning training, you can refer to the following command:
-
-```python
-from modelscope.pipelines import pipeline
-from modelscope.utils.constant import Tasks
-
-inference_pipeline = pipeline(
-    task=Tasks.auto_speech_recognition,
-    model='damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
-)
-
-rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
-print(rec_result)
-# {'text': '欢迎大家来体验达摩院推出的语音识别模型'}
-```
-
-More examples could be found in [docs](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_pipeline/quick_start.html)
-
-## Academic model egs
-
-If you want to train from scratch, usually for academic models, you can start training and inference with the following command:
-
-```shell
-cd egs/aishell/paraformer
-. ./run.sh --CUDA_VISIBLE_DEVICES="0,1" --gpu_num=2
-```
-More examples could be found in [docs](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_pipeline/quick_start.html)
diff --git a/funasr/quick_start_zh.md b/runtime/quick_start_zh.md
similarity index 83%
rename from funasr/quick_start_zh.md
rename to runtime/quick_start_zh.md
index 2fe756aaf..c1f645374 100644
--- a/funasr/quick_start_zh.md
+++ b/runtime/quick_start_zh.md
@@ -130,35 +130,3 @@ python3 funasr_wss_client.py --host "127.0.0.1" --port 10095 --mode offline --au
 
 
 
-### 工业模型egs
-
-如果您希望使用ModelScope中预训练好的工业模型，进行推理或者微调训练，您可以参考下面指令:
-
-
-```python
-from modelscope.pipelines import pipeline
-from modelscope.utils.constant import Tasks
-
-inference_pipeline = pipeline(
-    task=Tasks.auto_speech_recognition,
-    model='damo/speech_paraformer-large-vad-punc_asr_nat-zh-cn-16k-common-vocab8404-pytorch',
-)
-
-rec_result = inference_pipeline(audio_in='https://isv-data.oss-cn-hangzhou.aliyuncs.com/ics/MaaS/ASR/test_audio/asr_example_zh.wav')
-print(rec_result)
-# {'text': '欢迎大家来体验达摩院推出的语音识别模型'}
-```
-
-更多例子可以参考([点击此处](https://alibaba-damo-academy.github.io/FunASR/en/modelscope_pipeline/quick_start.html))
-
-
-### 学术模型egs
-
-如果您希望从头开始训练，通常为学术模型，您可以通过下面的指令启动训练与推理:
-
-```shell
-cd egs/aishell/paraformer
-. ./run.sh --CUDA_VISIBLE_DEVICES="0,1" --gpu_num=2
-```
-
-更多例子可以参考([点击此处](https://alibaba-damo-academy.github.io/FunASR/en/academic_recipe/asr_recipe.html))
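
For readers following the new pip-based installation instructions introduced in this patch, the sketch below shows one way to smoke-test the freshly installed package. It is a minimal example written against the funasr-1.0 `AutoModel` interface announced in the linked release discussion; the model alias `paraformer-zh`, the file name `asr_example_zh.wav`, and the `funasr.__version__` check are illustrative assumptions rather than part of this patch, and downloading the pretrained model relies on the optional `modelscope` install.

```python
# Minimal post-install sketch (assumes funasr>=1.0 plus the optional modelscope package).
import funasr
from funasr import AutoModel

# The package version is read from funasr/version.txt, so it should report 1.0.4 after this change.
print(funasr.__version__)

# "paraformer-zh" is a pretrained-model alias assumed to resolve via ModelScope;
# replace it with any model name from the Model Zoo section of the README.
model = AutoModel(model="paraformer-zh")

# "asr_example_zh.wav" is a placeholder for a local 16 kHz mono WAV file.
result = model.generate(input="asr_example_zh.wav")
print(result[0]["text"])  # recognized transcript
```

If the import or the version check fails, re-running `pip3 install -U funasr` (or the editable install from source shown in the README hunks above) is the first thing to verify.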