
Regarding the "some Natural Speech Features Of Microsoft" code #92

Open
chusevip8 opened this issue Jun 17, 2023 · 2 comments
Comments

@chusevip8

Hi author, regarding "some Natural Speech Features Of Microsoft" — which part of the code implements this optimization? I couldn't find it; please point me to it.

@MaxMax2016
Copy link
Collaborator


# forward: map the posterior sample z into the prior space
z_p = self.flow(z, y_mask, g=g)
# backward: sample from the prior and run it through the inverse flow
z_r = m_p + torch.randn_like(m_p) * torch.exp(logs_p)
z_r = self.flow(z_r, y_mask, g=g, reverse=True)
return o, l_length, attn, ids_slice, x_mask, y_mask, (z, z_p, z_r, m_p, logs_p, m_q, logs_q)

            # forward KL: posterior sample in prior space vs. the prior
            loss_kl = kl_loss(z_p, logs_q, m_p, logs_p, z_mask) * hps.train.c_kl
            # backward KL: prior sample mapped back through the inverse flow vs. the posterior
            if z_r is None:
                loss_kl_r = 0
            else:
                loss_kl_r = kl_loss(z_r, logs_p, m_q, logs_q, z_mask) * hps.train.c_kl
            loss_fm = feature_loss(fmap_r, fmap_g)
            loss_gen, losses_gen = generator_loss(y_d_hat_g)
            loss_gen_all = loss_gen + loss_fm + loss_mel + loss_dur + loss_kl + loss_kl_r
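For reference, each call to `kl_loss` above measures the KL divergence between two diagonal Gaussians, summed over masked frames. The repo's version works on sampled tensors; per element it corresponds to the closed-form KL between two scalar Gaussians parameterized by mean and log standard deviation. A minimal scalar sketch of that closed form (the `gaussian_kl` name is mine, not the repo's function):

```python
import math

def gaussian_kl(m_q, logs_q, m_p, logs_p):
    """Closed-form KL( N(m_q, e^{logs_q}) || N(m_p, e^{logs_p}) ) for scalars.

    Parameters are means and log standard deviations, matching the
    (m, logs) convention used in the snippets above.
    """
    var_q = math.exp(2.0 * logs_q)
    return (logs_p - logs_q - 0.5
            + 0.5 * (var_q + (m_q - m_p) ** 2) * math.exp(-2.0 * logs_p))

# identical distributions have zero divergence
print(gaussian_kl(0.0, 0.0, 0.0, 0.0))  # 0.0
# shifting the mean by 1 with unit variance gives KL = 0.5
print(gaussian_kl(1.0, 0.0, 0.0, 0.0))  # 0.5
```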

@nshmyrev
Copy link

nshmyrev commented Jan 2, 2024

Hey @MaxMax2016, thanks for the code. I've tried playing with the current implementation a bit, and honestly it doesn't really work as intended. Here are the reasons:

  1. It needs a weight on the loss (usually much smaller than 1.0) and also a Gaussian noise weight, similar to the one used at inference (noise_scale):
z_r = m_p + torch.randn_like(m_p) * torch.exp(logs_p) * noise_scale
  2. Because speech is time-variable, the KL loss needs Soft-DTW; otherwise it pushes the model to make speech overly uniform. The paper mentions that.

Without Soft-DTW, once this loss is applied, automated-evaluation CER goes down, Mel loss goes up significantly, and the Fréchet score also goes up significantly. This is because the speech no longer follows the target audio.

A more advanced implementation of the backward loss is here: heatz123/naturalspeech#12, but it is also not straightforward to make work.
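To illustrate the alignment-tolerant distance being discussed: Soft-DTW replaces the hard minimum in dynamic time warping with a smoothed (differentiable) minimum, so a frame-wise divergence can be matched across slightly shifted timings instead of forcing a rigid frame-to-frame alignment. A minimal pure-Python sketch over a precomputed cost matrix (the `soft_dtw`/`softmin` names and the list-of-lists interface are my own, not from the paper's or the linked PR's code):

```python
import math

def softmin(a, b, c, gamma):
    # smoothed minimum: -gamma * log( sum_i exp(-x_i / gamma) )
    # computed with a max-shift for numerical stability
    vals = [-a / gamma, -b / gamma, -c / gamma]
    m = max(vals)
    return -gamma * (m + math.log(sum(math.exp(v - m) for v in vals)))

def soft_dtw(cost, gamma=0.1):
    """Soft-DTW score for a cost matrix.

    cost[i][j] is the pairwise distance between frame i of one sequence
    and frame j of the other (e.g. a per-frame KL term). Smaller gamma
    approaches hard DTW; larger gamma smooths the alignment more.
    """
    n, m = len(cost), len(cost[0])
    INF = float("inf")
    # R[i][j] accumulates the soft-minimal alignment cost up to (i, j)
    R = [[INF] * (m + 1) for _ in range(n + 1)]
    R[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            R[i][j] = cost[i - 1][j - 1] + softmin(
                R[i][j - 1], R[i - 1][j], R[i - 1][j - 1], gamma
            )
    return R[n][m]

# a single perfectly matching frame costs nothing
print(soft_dtw([[0.0]]))  # 0.0
```

The key point for the discussion above is that this objective tolerates local timing differences between `z_r` and the target-side statistics, so minimizing it does not push the model toward unnaturally uniform durations the way a strict frame-aligned KL does.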
