Audio stream / frame-based processing #584
-
Hello, I'm trying to implement frame-by-frame processing of a trained I've tried the Digging more into this, I noticed that the LSTM layer's Maybe other aspects of the encoders/decoders need to be re-implemented as well for frame-based processing, or am I missing something obvious? Thanks!
-
Hello, you are right: to use a network for streaming applications, it has to be causal (the network doesn't need information from the future to predict the present) and stateful (the network keeps a kind of memory of the previous chunks in order to process the current one). Once this is done, there is no difference between processing a file all at once and processing it chunk by chunk.
Note that, to keep this consistency between overlapping parts etc., the size of the chunks you process must be chosen according to the network parameters. Sorry if this isn't really clear, but it is hard to explain with words alone. My advice is to take a sheet of paper and do the operations by hand to see what data to buffer, where to pad, etc.
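To make the pen-and-paper exercise concrete, here is a minimal pure-Python sketch of the two ideas. It is not DCCRNet or Asteroid code: `causal_conv_full`, `StreamingCausalConv` and `StreamingLeakyIntegrator` are hypothetical names made up for this illustration, and the leaky integrator is only a toy stand-in for an LSTM's hidden state. The assumptions are a strided 1-D convolution made causal by left zero-padding, and chunks whose length is a multiple of the stride.

```python
from typing import List, Sequence


def causal_conv_full(x: Sequence[float], kernel: Sequence[float], stride: int) -> List[float]:
    """Strided 1-D convolution over the whole signal, left-padded so it is causal."""
    k = len(kernel)
    padded = [0.0] * (k - 1) + list(x)  # pad on the left only: no future samples
    return [
        sum(w * padded[start + i] for i, w in enumerate(kernel))
        for start in range(0, len(x), stride)
    ]


class StreamingCausalConv:
    """Same convolution, chunk by chunk, buffering the last k-1 input samples."""

    def __init__(self, kernel: Sequence[float], stride: int) -> None:
        self.kernel = list(kernel)
        self.stride = stride
        # The initial buffer plays the role of the left zero-padding.
        self.buffer = [0.0] * (len(self.kernel) - 1)

    def process(self, chunk: Sequence[float]) -> List[float]:
        # Chunk size must match the layer parameters, here the stride.
        if len(chunk) % self.stride != 0:
            raise ValueError("chunk length must be a multiple of the stride")
        padded = self.buffer + list(chunk)
        out = [
            sum(w * padded[start + i] for i, w in enumerate(self.kernel))
            for start in range(0, len(chunk), self.stride)
        ]
        keep = len(self.kernel) - 1
        self.buffer = padded[len(padded) - keep:]  # history for the next chunk
        return out


class StreamingLeakyIntegrator:
    """Toy recurrent layer h[t] = decay * h[t-1] + x[t]; the state survives between calls."""

    def __init__(self, decay: float) -> None:
        self.decay = decay
        self.state = 0.0

    def process(self, frames: Sequence[float]) -> List[float]:
        out = []
        for value in frames:
            self.state = self.decay * self.state + value
            out.append(self.state)
        return out


signal = [float(n % 7) for n in range(32)]
kernel = [0.5, -1.0, 2.0, 0.25]
stride = 2

# Reference: the whole file processed in one go.
full = StreamingLeakyIntegrator(0.5).process(causal_conv_full(signal, kernel, stride))

# Streaming: chunks of 8 samples, with buffer and state carried across chunks.
conv = StreamingCausalConv(kernel, stride)
rnn = StreamingLeakyIntegrator(0.5)
streamed: List[float] = []
for start in range(0, len(signal), 8):
    streamed += rnn.process(conv.process(signal[start:start + 8]))

print(streamed == full)
```

The two outputs match because each layer only looks at past samples and carries what it needs (input history for the convolution, the running state for the recurrence) from one chunk to the next. Resetting either object between chunks would break the equality.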
-
@EliasLum, I received a notification that you posted something? If you resolved your issue, it's better to add the solution here as well rather than removing the comment altogether 😉
Hello,
You are right: to use a network for streaming applications, it has to be causal (the network doesn't need information from the future to predict the present) and stateful (the network keeps a kind of memory of the previous chunks in order to process the current one). Once this is done, there is no difference between processing a file all at once and processing it chunk by chunk.
So, in order to process chunks efficiently with DCCRNet, you would have to make the encoders, masker and decoders causal and stateful. To do so: