Audio stream / frame-based processing #584

Answered by JorisCos
georgezachos asked this question in Q&A

Hello,

You are right: to use a network for streaming applications, it has to be causal (it doesn't need information from the future to predict the present) and stateful (it keeps a memory of the previous chunks in order to process the current one). Once both hold, processing a file all at once and processing it chunk by chunk give the same output.
So, to process chunks efficiently with DCCRNet, you would have to make the encoder, masker, and decoder causal and stateful.
To do so:

  • Make the 2D convolutions that feed on the encoded representation causal by padding on the left side.
  • Make it stateful by keeping a buffer with the data needed for the nex…
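
The two points above can be illustrated with a minimal NumPy sketch. It is not DCCRNet or Asteroid code: `causal_conv2d`, `StreamingCausalConv2d`, the kernel shape, and the `(freq, time)` layout are hypothetical names and assumptions chosen only for illustration, and an odd kernel size along frequency is assumed. The idea is that left-padding the time axis with `kt - 1` zeros makes the convolution causal, and keeping the last `kt - 1` input frames in a buffer lets chunked processing reproduce the all-at-once output.

```python
import numpy as np


def _valid_corr(padded, kernel):
    """Plain 'valid' 2D cross-correlation of padded (freq, time) with kernel (kf, kt)."""
    kf, kt = kernel.shape
    n_f = padded.shape[0] - kf + 1
    n_t = padded.shape[1] - kt + 1
    out = np.zeros((n_f, n_t))
    for i in range(kf):
        for j in range(kt):
            out += kernel[i, j] * padded[i:i + n_f, j:j + n_t]
    return out


def causal_conv2d(x, kernel):
    """Whole-file version: pad time on the left only, frequency on both sides.

    Output frame t depends only on input frames t - kt + 1 .. t.
    Assumes kf is odd so the frequency size is preserved.
    """
    kf, kt = kernel.shape
    padded = np.pad(x, ((kf // 2, kf // 2), (kt - 1, 0)))
    return _valid_corr(padded, kernel)


class StreamingCausalConv2d:
    """Chunk-by-chunk version: a buffer replaces the left padding."""

    def __init__(self, kernel, n_freq):
        self.kernel = kernel
        kt = kernel.shape[1]
        # Zeros at start-up play the role of the initial left padding.
        self.buffer = np.zeros((n_freq, kt - 1))

    def __call__(self, chunk):
        kf, kt = self.kernel.shape
        x = np.concatenate([self.buffer, chunk], axis=1)
        # Keep the last kt - 1 input frames for the next call.
        self.buffer = x[:, x.shape[1] - (kt - 1):]
        padded = np.pad(x, ((kf // 2, kf // 2), (0, 0)))
        return _valid_corr(padded, self.kernel)


# Usage: process 20 frames at once, then in 4 chunks of 5 frames.
rng = np.random.default_rng(0)
x = rng.standard_normal((8, 20))       # (freq bins, time frames)
kernel = rng.standard_normal((3, 2))   # (kf, kt), arbitrary values
full = causal_conv2d(x, kernel)
stream = StreamingCausalConv2d(kernel, n_freq=8)
chunked = np.concatenate([stream(x[:, s:s + 5]) for s in range(0, 20, 5)], axis=1)
print(np.allclose(full, chunked))
```

In a real model the same pattern would have to be applied to every layer that looks across time, each with its own buffer.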

Replies: 2 comments 2 replies

Answer selected by georgezachos