elephantmipt changed the title from "Reproduction of E2E training procedure" to "E2E training procedure" on Aug 17, 2023
Hi,

Thank you for sharing the code from your insightful paper!

I'm attempting to train the model in the end-to-end (e2e) setup, and I've run into a question about the embeddings. As I understand it, you're using the `TextDataset_NoCache` class for the dataset, which contains the model's embeddings (`Diffusion-LM/improved-diffusion/improved_diffusion/text_datasets.py`, lines 815 to 828 at commit 759889d).
In the training script, you're passing `model=None` to the `load_data_text` function (`Diffusion-LM/improved-diffusion/scripts/train.py`, lines 81 to 105 at commit 759889d).
I assume that the embeddings are initialized at:
https://github.com/XiangLi1999/Diffusion-LM/blob/main/improved-diffusion/improved_diffusion/text_datasets.py
However, in the e2e setup, it seems logical that one would want to use the continuously updated embeddings from the model. Looking through the training loop, I couldn't find any indication that the embeddings are updated from the model after each gradient step. Could you please shed light on how it's feasible to train embeddings end-to-end when the embeddings are housed within the dataset class?
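For concreteness, here is a minimal sketch of the distinction I mean. The names here are hypothetical, not the repo's actual classes: variant A caches embedding vectors at dataset-construction time (a frozen snapshot), while variant B stores token ids and re-embeds them inside each training step, so gradient updates to the embedding table are always reflected.

```python
import torch
import torch.nn as nn

# Hypothetical sketch -- not the repo's actual code.
vocab_size, dim = 100, 16
embedding = nn.Embedding(vocab_size, dim)  # table trained end-to-end
token_ids = torch.randint(0, vocab_size, (4, 8))

# Variant A: the dataset stores precomputed embedding vectors.
# Later gradient steps on `embedding` do not change these tensors.
cached = embedding(token_ids).detach()  # frozen snapshot

# Variant B: the dataset stores token ids; the model embeds them
# inside each step, so the current weights are always used.
def training_step(ids):
    return embedding(ids)  # re-embedded with up-to-date weights

# Simulate a gradient update to the embedding table.
with torch.no_grad():
    embedding.weight.add_(1.0)

fresh = training_step(token_ids)
# `cached` still reflects the old weights; `fresh` reflects the update.
print(torch.allclose(cached, fresh))  # False after the update
```

This is why I'd expect an e2e setup to either hand the live model to the dataset or re-embed token ids per step, rather than read embeddings cached in the dataset class.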
Thank you for your time and clarification!