
Why using separable conv in C1 and C2 instead of normal conv1d? #2

Open
mohsen-goodarzi opened this issue Sep 1, 2022 · 3 comments

Comments

@mohsen-goodarzi

Thank you for sharing your great work.

I noticed you used sepconv_bn in C1 and C2 instead of conv_bn_act.
Is this on purpose? Does it give better results?

self.c1 = sepconv_bn(n_mels, 256, kernel_size=33, stride=2)

@Kirili4ik
Owner

Kirili4ik commented Sep 1, 2022

Hi,
Separable convolutions are a trick described in the QuartzNet paper. In short, they use fewer parameters while achieving roughly the same results, which makes the model smaller and faster for on-device inference.
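To make the savings concrete, here is a minimal sketch of the parameter counts for a standard Conv1d versus a depthwise-separable Conv1d, using the C1 layer from the thread (kernel_size=33, 256 output channels). Note that n_mels=64 is an assumed value for illustration, and bias terms are omitted for simplicity:

```python
def conv1d_params(c_in, c_out, k):
    """Weight count of a standard Conv1d (bias omitted)."""
    return c_in * c_out * k

def sepconv1d_params(c_in, c_out, k):
    """Depthwise conv (one k-wide filter per input channel)
    followed by a pointwise (1x1) conv mixing channels."""
    return c_in * k + c_in * c_out

# C1 from the thread: kernel_size=33, 256 output channels.
# n_mels=64 is an assumption for this illustration.
n_mels, c_out, k = 64, 256, 33
regular = conv1d_params(n_mels, c_out, k)       # 540672
separable = sepconv1d_params(n_mels, c_out, k)  # 18496
print(regular, separable, round(regular / separable, 1))
```

For this layer the separable version needs roughly 29x fewer weights, which is where the "smaller and faster" claim comes from; the gap grows with kernel size and channel count.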

@mohsen-goodarzi
Author

I see. 👍
I thought they only used separable convs in the B blocks.
Thanks for the fast reply.

@Kirili4ik
Owner

As far as I remember, the paper can be unclear about which blocks use sepconvs. But we tried to fully reproduce the paper, and the model's total parameter count is known. If I remember correctly, using sepconvs everywhere gave us the same number of parameters as described in the paper, and it worked.
