
Why using separable conv in C1 and C2 instead of normal conv1d? #2

Open
mohsen-goodarzi opened this issue Sep 1, 2022 · 3 comments

Comments

@mohsen-goodarzi

Thank you for sharing your great work.

I noticed you used sepconv_bn in C1 and C2 instead of conv_bn_act.
Is this on purpose? Does it give better results?

self.c1 = sepconv_bn(n_mels, 256, kernel_size=33, stride=2)

@Kirili4ik
Owner

Kirili4ik commented Sep 1, 2022

Hi,
Separable convolutions are a trick described in the QuartzNet paper. In short, they use fewer parameters while achieving roughly the same results, which makes the model smaller and faster for on-device inference.
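To make the savings concrete, here is a minimal sketch of the parameter counts for a standard Conv1d versus a depthwise-separable Conv1d, using the C1 layer from the thread (kernel_size=33, 256 output channels). Note that n_mels=64 is an assumed value for illustration, and bias terms are omitted for simplicity:

```python
def conv1d_params(c_in, c_out, k):
    """Weight count of a standard Conv1d (bias omitted)."""
    return c_in * c_out * k

def sepconv1d_params(c_in, c_out, k):
    """Depthwise conv (one k-wide filter per input channel)
    followed by a pointwise (1x1) conv mixing channels."""
    return c_in * k + c_in * c_out

# C1 from the thread: kernel_size=33, 256 output channels.
# n_mels=64 is an assumption for this illustration.
n_mels, c_out, k = 64, 256, 33
regular = conv1d_params(n_mels, c_out, k)       # 540672
separable = sepconv1d_params(n_mels, c_out, k)  # 18496
print(regular, separable, round(regular / separable, 1))
```

For this layer the separable version needs roughly 29x fewer weights, which is where the "smaller and faster" claim comes from; the gap grows with kernel size and channel count.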

@mohsen-goodarzi
Author

I see. 👍
I thought they only used separable convs in the B blocks.
Thanks for the fast reply.

@Kirili4ik
Owner

As far as I remember, the paper can be unclear about which blocks use sepconvs. But we tried to fully reproduce the paper, and the model's total parameter count is known. If I remember correctly, using sepconvs everywhere gave us the same number of parameters as described in the paper, and it worked.
