Implement p_mode="per_example" in Compose() #90
Comments
A follow-up question: what would be the best practice for applying a sequence of augmentations to the examples in a batch while varying the randomized parameters per example?
Hi keunwoochoi :) Thanks for the appreciation. Sorry for the confusion. Please let me try to explain.
For example:
mode="per_channel" means that each channel gets augmented independently (with different parameters).
mode="per_example" means that every piece of audio (which can be multichannel or mono) gets augmented independently - this is what one typically wants.
mode="per_batch" means that all the audio snippets in a batch get augmented in the same way.

p_mode refers to the behavior of "p", the probability of applying the transform:
p_mode="per_batch" together with e.g. p=0.5 means that a transform will be applied to only 50% of the batches on average. I.e. ~50% of the time you call it, it will be a no-op (it will do nothing).
p_mode="per_example" together with p=0.5 means that a transform will be applied to 50% of the examples (audio snippets) in a batch on average. The others will be left untouched.
p_mode="per_channel" together with p=0.5 means that the transform will be applied to 50% of the channels on average.

We can think of I haven't defined
Maybe I should remove p_mode in Compose to make it less confusing? I'm not sure if I'll ever implement p_mode!="per_batch" in
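To make the three p_mode settings concrete, here is a minimal standalone sketch in plain Python. It is not the library's implementation: the function name draw_apply_mask and the grid-of-booleans representation are assumptions made purely for illustration. It only shows how many coin flips each p_mode implies and which parts of the batch share a flip.

```python
import random


def draw_apply_mask(batch_size, num_channels, p, p_mode, rng):
    """Return a batch_size x num_channels grid of booleans.

    True means "apply the transform to this channel of this example".
    Hypothetical helper for illustration, not part of torch-audiomentations.
    """
    if p_mode == "per_batch":
        # One coin flip decides for the whole batch.
        apply = rng.random() < p
        return [[apply] * num_channels for _ in range(batch_size)]
    if p_mode == "per_example":
        # One coin flip per example, shared by all of its channels.
        return [[rng.random() < p] * num_channels for _ in range(batch_size)]
    if p_mode == "per_channel":
        # One coin flip per channel of every example.
        return [
            [rng.random() < p for _ in range(num_channels)]
            for _ in range(batch_size)
        ]
    raise ValueError(f"Unknown p_mode: {p_mode!r}")


rng = random.Random(0)
for p_mode in ("per_batch", "per_example", "per_channel"):
    print(p_mode, draw_apply_mask(4, 2, 0.5, p_mode, rng))
```

With per_batch every cell in the grid is identical, with per_example every row is constant, and with per_channel each cell is drawn on its own.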
I'm not sure what the best practice is. I guess that depends on the application. But you could do something like what is mentioned in readme:
In this case, 50 % of the examples (AKA audio snippets) will get gained and 50 % of the examples will get polarity-inverted, on average. The two probabilities are independent. The gain values will be different for every example that gets gained. I would advise you to play around with it. If you want, you can give feedback and/or contributions to the project to make it better, in the spirit of open source, community-driven projects 😄
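The behavior described above (two transforms in sequence, each with its own independent per-example probability and per-example random parameters) can be simulated without the library. The sketch below is plain Python on nested lists and is only an illustration: the helper names, the batch[example][channel][sample] layout, and the -15 dB to +5 dB gain range are all assumptions, not the library's code.

```python
import random


def gain(example, rng):
    """Scale one example by a random gain drawn for this example only."""
    gain_db = rng.uniform(-15.0, 5.0)  # assumed range, for illustration only
    factor = 10.0 ** (gain_db / 20.0)  # convert dB to a linear amplitude factor
    return [[s * factor for s in channel] for channel in example]


def polarity_inversion(example, rng):
    """Flip the sign of every sample; this transform has no random parameters."""
    return [[-s for s in channel] for channel in example]


def apply_per_example(batch, transforms, rng):
    """Apply each (transform, p) pair to each example independently."""
    out = []
    for example in batch:
        for transform, p in transforms:
            # Separate coin flip per example and per transform.
            if rng.random() < p:
                example = transform(example, rng)
        out.append(example)
    return out


rng = random.Random(0)
batch = [[[0.1, -0.2, 0.3]] for _ in range(4)]  # 4 mono examples
augmented = apply_per_example(
    batch, [(gain, 0.5), (polarity_inversion, 0.5)], rng
)
for before, after in zip(batch, augmented):
    print(before, "->", after)
```

Because the coin flips are independent, an example can end up gained only, inverted only, both, or untouched.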
By the way, there is a demo script that applies various transforms in all three modes (per_batch, per_example and per_channel) and writes the results to wav. Listening to these output audio files can help understand what is going on. Here's the script: https://github.com/asteroid-team/torch-audiomentations/blob/master/scripts/demo.py
Thanks for all the answers! Knowing the difference between
I think the function is definitely useful! Maybe all we need is
(I drew the image at www.draw.io. You can open this file there https://www.dropbox.com/s/taapi8jaskts6yx/torch-audiomentation?dl=0)
Nice visualization :) Should we add it to readme for now? Feel free to make a pull request. I have not started setting up proper documentation yet.
p_mode="per_example" is the most relevant in most cases
Yes, those three on the bottom should say p_mode="per_example" to be correctly aligned with the illustrations 👍
Agree that Related to that, I think
You're probably right :) Maybe I thought about it briefly when I initially coded it and thought "this is possible, but I'll leave it as a TODO for later".
per_example supported or not?
Thumbs up for implementing p_mode = "per_example" from me, would be very helpful. Thanks for an excellent package!
I'm glad you like it :) If you want to make a contribution, that would be welcome
Hi, thanks for this great software!
Is per_example supported currently or not? With the ValueError raised in Compose (https://github.com/asteroid-team/torch-audiomentations/blob/master/torch_audiomentations/core/composition.py#L30), I assume it is not supported in Compose. But the readme says it is supported - does it mean that it's supported in individual transforms but not in Compose?
Maybe it's worth using it in the example code in readme :)
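For readers wondering what the distinction would look like, here is a rough sketch of a Compose-level p with p_mode="per_example", next to the per_batch case. This is a hypothetical plain-Python function written for illustration; it is not the library's Compose, and the semantics shown (one coin flip per example deciding whether that example goes through the whole chain) are an assumption about what such a feature could mean.

```python
import random


def compose(batch, transforms, p, p_mode, rng):
    """Run `transforms` in order, gated by a Compose-level probability `p`.

    Hypothetical sketch, not the torch-audiomentations implementation.
    """
    if p_mode == "per_batch":
        # One coin flip: every example goes through the chain, or none does.
        gates = [rng.random() < p] * len(batch)
    elif p_mode == "per_example":
        # One coin flip per example: each example enters the chain or is skipped.
        gates = [rng.random() < p for _ in batch]
    else:
        raise ValueError(f"Unsupported p_mode: {p_mode!r}")

    out = []
    for example, gate in zip(batch, gates):
        if gate:
            for transform in transforms:
                example = transform(example)
        out.append(example)
    return out


def invert(example):
    """Toy transform: flip the sign of every sample."""
    return [-s for s in example]


def halve(example):
    """Toy transform: scale every sample by 0.5."""
    return [0.5 * s for s in example]


rng = random.Random(0)
batch = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
print(compose(batch, [invert, halve], p=0.5, p_mode="per_example", rng=rng))
```

This gate sits on top of whatever p and p_mode the individual transforms use, which is why the two levels can be supported independently of each other.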