-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathapi doc
49 lines (34 loc) · 1.99 KB
/
api doc
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
# This file will be some notes on the api func in details.
# ark defaults to bytes,and 'ark ,t' refers to text
# Create MFCC feature files or stream.
# Usage: compute-mfcc-feats [options...] <wav-rspecifier> <feats-wspecifier>
compute-mfcc-feats --verbose=2 --config=$mfcc.conf "scp:echo utterance-id1 /home/zhang/Projects/AI2/VoiceAppFile/VoiceAns/2/0311203006_PAPER-934712-QT-742440_QT-742440ans.wav|" ark[,t]:$mfcc.feat.file (or -)
# $1 is the mfcc-features-file or stream('-'), '-' means output data stream in the shell, the stream is bytes by default
copy-feats ark:$1 ark:-
# $1 is the mfcc-features-file or stream('-'), '-' means output data stream in the shell, the stream is text
copy-feats ark:$1 ark,t:-
# $2 is a file to save byte stream data.('ark,t' means saving text stream)
copy-feats ark:$1 ark:$2
# copy mfcc-feats-file to $ark(bytes or text) and $scp
copy-feats --compress=true ark:$1 ark,[t,]scp:$ark,$scp
compute-cmvn-stats scp:$f2.scp ark,scp:$test-cmvn.ark,$test-cmvn.scp
apply-cmvn scp:./test-cmvn.scp scp:./QT-742440ans.wav.scp ark,t:-
'''
./online2-wav-nnet3-latgen-faster --do-endpointing=false --online=false --feature-type=fbank
--fbank-config=../../egs/cvte/s5/conf/fbank.conf
--max-active=7000 --beam=15.0 --lattice-beam=6.0 --acoustic-scale=1.0
--word-symbol-table=../../egs/cvte/s5/exp/chain/tdnn/graph/words.txt
../../egs/cvte/s5/exp/chain/tdnn/final.mdl
../../egs/cvte/s5/exp/chain/tdnn/graph/HCLG.fst
'ark:echo utter1 utter1|' 'scp:echo utter1 ../../egs/cvte/s5/data/wav/00030/2017_03_07_16.57.22_1175.wav|' ark:/dev/null
online2-wav-nnet3-latgen-faster:是调用来进行识别的命令
后面跟着的那些是参数设置和输入的文件和模型
words.txt词表
final.mdl训练好的声学模型
HCLG.fst
H:HMM,输出符表示上下文相关的音素
C:表上下文关系,输出是音素,输入是上下文相关的N个音素组成的窗
L:lexicon发音词典,输出是词,输入是音素
G:语法
最后那个WAV是你要识别的语音。
'''