This issue focuses on debugging airbus_attention_vtlp_CTC.py, which first appears in commit 1857440. @ShangwuYao: record the major changes you make in the next section, and anything else to be done in the to-do section.
Major Changes in Commit 05eb1ec
By @raymondxyy.
Moved airbus_attention_vtlp_CTC.py from audlib.nn to egs/asr.
Moved common DNN functions to audlib.nn.nn. Check "Refactoring nn Module" issue for more info.
Refactored data-related functions into egs/asr/dataset.py. The script only needs a successful installation of audlib and a downloaded WSJ dataset (you can download it from LDC's catalog using @raymondxyy's account) to work properly. Make sure this runs fine before you proceed.
Refactored transforms and collate functions to egs/asr/transforms.py.
Removed WSJ class that loads all training/validation/test data in one shot. Replaced it with WSJ0 and ASRWSJ0 (WSJ1 is also available) in audlib.data.wsj. See egs/asr/dataset.py.
Removed this code block for vocabulary handling in main:
The string-to-vocab-to-label-to-vocab-to-string pipeline can be done using CharacterMap or PhonemeMap in audlib.asr.util. See dataset.py for an example of how this is done.
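To illustrate that round trip, here is a minimal sketch. Note this does not use audlib's actual CharacterMap/PhonemeMap API (not verified here); it only mirrors the string-to-label-to-string pipeline described above.

```python
class SimpleCharMap:
    """Map characters to integer labels and back (illustrative only)."""

    def __init__(self, chars):
        self.char2label = {c: i for i, c in enumerate(chars)}
        self.label2char = {i: c for c, i in self.char2label.items()}

    def encode(self, text):
        """String -> list of integer labels."""
        return [self.char2label[c] for c in text]

    def decode(self, labels):
        """List of integer labels -> string."""
        return "".join(self.label2char[i] for i in labels)


cmap = SimpleCharMap("abcdefghijklmnopqrstuvwxyz '")
labels = cmap.encode("hello world")
assert cmap.decode(labels) == "hello world"  # lossless round trip
```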
Removed myDataset because it only prepends or appends special characters to training transcripts. This step is now done as part of the transform in transforms.FinalTransform.
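A minimal sketch of what such a boundary-token transform looks like; the class name and token strings here are assumptions for illustration, not the actual transforms.FinalTransform implementation.

```python
class AddBoundaryTokens:
    """Prepend/append boundary tokens to a transcript (illustrative sketch)."""

    def __init__(self, sos="<sos>", eos="<eos>"):
        # Token strings are assumed, not audlib's actual values.
        self.sos = sos
        self.eos = eos

    def __call__(self, transcript):
        return f"{self.sos}{transcript}{self.eos}"


transform = AddBoundaryTokens()
assert transform("hello") == "<sos>hello<eos>"
```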
CUDA availability is checked at the beginning of the main functions, and tensors are transferred to device as part of the new collate function:
All code that transfers tensors to CUDA outside the main functions is therefore commented out. It doesn't make much sense for data to keep switching devices after they enter the NNs, unless some package requires processing on a particular device. This will be part of the TODOs.
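The pattern can be sketched with a generic collate function (names and signature are illustrative, not audlib's actual code): pad once, transfer once, and let downstream modules leave the tensors where they are.

```python
import torch


def pad_collate(batch, device="cpu"):
    """Pad variable-length feature tensors and move the whole batch to one device."""
    feats, labels = zip(*batch)
    lengths = torch.tensor([f.shape[0] for f in feats])
    # Zero-pad along time so all sequences share the longest length.
    padded = torch.nn.utils.rnn.pad_sequence(feats, batch_first=True)
    labels = torch.tensor(labels)
    # Single device transfer here; NN modules should not move data afterwards.
    return padded.to(device), lengths.to(device), labels.to(device)


batch = [(torch.randn(5, 3), 0), (torch.randn(7, 3), 1)]
padded, lengths, labels = pad_collate(batch)  # padded has shape (2, 7, 3)
```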
Vocabulary histogram is now available as part of the dataset, so there is no need to compute it manually in main. See VOCAB_HIST in dataset.py.
Disabled all to_variable code, as it is deprecated according to PyTorch.
Removed the dependency on python-levenshtein. The Levenshtein (edit) distance function is implemented in audlib.asr.utils.levenshtein.
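Such a vocabulary histogram can be computed along these lines (a sketch, not the actual VOCAB_HIST code):

```python
from collections import Counter


def vocab_histogram(transcripts):
    """Count character occurrences across a list of transcripts."""
    hist = Counter()
    for text in transcripts:
        hist.update(text)  # Counter treats a string as an iterable of chars
    return hist


hist = vocab_histogram(["abba", "bad"])
assert hist["a"] == 3 and hist["b"] == 3 and hist["d"] == 1
```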
TODOs
BUGs
Get grid_search to work with the new interface.
Get beamsearch to work with the new interface.
On CPU, an error is thrown in one of the built-in RNN functions:
File "/home/xyy/anaconda3/lib/python3.6/site-packages/torch/nn/modules/rnn.py", line 182, in forward
self.num_layers, self.dropout, self.training, self.bidirectional)
RuntimeError: got an incorrect number of RNN parameters
Should be an easy fix if it's just a different interface.
On GPU, there is an error for incompatible types (one torch.cuda.FloatTensor, the other torch.FloatTensor). Should also be an easy fix with nn.Module.to(device).
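The fix amounts to moving the module's parameters and the input batch to one device up front, for example:

```python
import torch
import torch.nn as nn

# Pick the device once at the start of main.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(4, 2).to(device)  # parameters now live on `device`
x = torch.randn(3, 4).to(device)    # inputs moved to the same device
y = model(x)                        # no cuda/cpu FloatTensor mismatch
assert y.device == x.device
```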
REVIEWs
Look for REVIEW tags in transforms.my_collate_fn. Make sure the new interface does the same thing as the old one.
In getInputChar in airbus_attention_vtlp_CTC.py, make sure the new call to torch.bernoulli is okay.
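For reference, a typical torch.bernoulli call of the kind getInputChar might make is shown below; the probability and the mask interpretation (e.g. a per-step teacher-forcing mask) are illustrative, not the script's actual values.

```python
import torch

torch.manual_seed(0)  # for reproducibility of the sample
p = 0.7  # assumed probability, not the script's value
# torch.bernoulli draws 0./1. samples elementwise from the given probabilities.
mask = torch.bernoulli(torch.full((5,), p))
assert mask.shape == (5,)
assert set(mask.tolist()) <= {0.0, 1.0}
```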
Document everything, @ShangwuYao. Add docstrings (e.g. NumPy standard) and give more descriptive function names (e.g. not my_collate_fn).
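As a concrete target for the docstring style, here is a NumPy-standard docstring on a hypothetical helper; the function itself is illustrative, not part of the codebase.

```python
import numpy as np


def pad_batch(batch, padding_value=0.0):
    """Pad a batch of variable-length sequences to equal length.

    Parameters
    ----------
    batch : list of numpy.ndarray
        Sequences of shape ``(time, features)``.
    padding_value : float, optional
        Fill value for the padded region. Default is 0.0.

    Returns
    -------
    numpy.ndarray
        Array of shape ``(len(batch), max_time, features)``.
    """
    max_time = max(seq.shape[0] for seq in batch)
    out = np.full((len(batch), max_time, batch[0].shape[1]), padding_value)
    for i, seq in enumerate(batch):
        out[i, : seq.shape[0]] = seq
    return out


padded = pad_batch([np.ones((2, 3)), np.ones((4, 3))])
assert padded.shape == (2, 4, 3)
```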
Move generic functions to audlib.nn. ASR-specific NNs can be put into audlib.nn.asr.
Resolve the dependency on sklearn.ParameterSampler. We can implement it ourselves if it's not complicated.
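If we implement it ourselves, a minimal stand-in covering discrete value lists could look like the sketch below (unlike sklearn's ParameterSampler, it does not support scipy distributions, which may be all this script needs).

```python
import random


def sample_parameters(param_grid, n_iter, seed=None):
    """Sample random hyperparameter settings from lists of candidate values."""
    rng = random.Random(seed)  # local RNG so sampling is reproducible
    return [
        {name: rng.choice(values) for name, values in param_grid.items()}
        for _ in range(n_iter)
    ]


grid = {"lr": [1e-3, 1e-4], "hidden": [256, 512]}  # illustrative grid
settings = sample_parameters(grid, n_iter=3, seed=0)
assert len(settings) == 3
assert all(s["lr"] in grid["lr"] and s["hidden"] in grid["hidden"] for s in settings)
```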
Consider refactoring model_data_optim.
my_collate_fn probably needs a rewrite for clarity. The pattern