What changes would we need to make if we used our own dataset? #1
Hello, thanks for the appreciation. I apologize, that should have been part of the README; I have updated it now.
Can you tell me the im_path value you used in the config? The error basically means that the code wasn't able to find any png files in the location it was searching.
Got it. Create a subfolder 'images' inside the train directory and put all the training png files in there. Leave the config as it is, pointing to "ultrasound256CH1/train".
Yes, I tried; unfortunately it does not work.
Can you print the directory and path the code is searching at https://github.com/explainingai-code/DDPM-Pytorch/blob/main/dataset/mnist_dataset.py#L40 and share that?
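A throwaway debug snippet along these lines would show exactly where the code is looking (the im_path value and the images/*.png pattern are illustrative, based on the advice earlier in this thread; substitute whatever your config actually uses):

```python
import glob
import os

# Hypothetical debug print: show the absolute directory being searched and
# how many png files glob actually finds there.
im_path = 'ultrasound256CH1/train'  # substitute your config's im_path value
search = os.path.join(im_path, 'images', '*.png')
print('searching:', os.path.abspath(search))
print('png files found:', len(glob.glob(search)))
```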
You are training on CPU as of now, right?
Hi sir, I kept the batch size at 10 and just want to run for 40 epochs, and the total number of images is only 828. Could you please tell me why the model requires such heavy computational power (memory), and how I can handle this issue? RuntimeError: CUDA out of memory. Tried to allocate 640.00 GiB (GPU 0; 14.75 GiB total capacity; 2.16 GiB already allocated; 11.63 GiB free; 2.17 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF.
It's because the images are 256x256 and by default the model config downsamples only twice, so the network still operates on fairly large feature maps. Try enabling downsampling in the remaining blocks as well (or train at a smaller im_size). I think that should reduce the model size considerably and should allow you to train.
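For example, one hedged way to make that change programmatically; the key names (model_params, down_sample) are assumptions about the repo's YAML config layout, not its exact schema:

```python
import yaml  # PyYAML

# Illustrative tweak: downsample in every block so feature maps shrink faster.
# The key names below are assumptions, not guaranteed to match the repo.
with open('config/default.yaml') as f:
    config = yaml.safe_load(f)

config['model_params']['down_sample'] = [True, True, True]

with open('config/default.yaml', 'w') as f:
    yaml.safe_dump(config, f)
```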
Thanks, the model works very well; I trained it on my datasets.
Yes, I wanted this repo to be an intro to diffusion, which is why I didn't want to add those and instead left this as a bare-minimum diffusion repo. I do plan to create a stable diffusion repo, which should have some of these incorporated. Once that is done, I will try to bring the parts you mentioned here as well (if I can do that without adding too much complexity to the current implementation).
Are you using the same code, or have you made some modifications? Your list at the end of dataset initialization is a list of numpy.ndarray objects (according to the error), which cannot be the case, because the dataset class only fetches the filenames during initialization.
The shapes of two images that your dataset returns are different (3x3264x2448 and 3x2448x3264).
How would I fix that?
I don't think the 3xWxH layout is an issue, because the error says your image shapes are 3xWxH, so that's fine. But I think your path does not contain images that are all the same size: some images are 3264x2448 and some are 2448x3264.
Yes, I think you're right. So the solution would be to downsample all of them to 64x64?
Yes, center square crop to 2448x2448 and then resize to 64x64.
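A small torchvision sketch of that preprocessing (the file name is illustrative):

```python
from PIL import Image
from torchvision import transforms

# Center square crop first, then resize to the training resolution.
preprocess = transforms.Compose([
    transforms.CenterCrop(2448),
    transforms.Resize((64, 64)),
    transforms.ToTensor(),  # 3xHxW float tensor in [0, 1]
])

im = preprocess(Image.open('example.jpg').convert('RGB'))  # illustrative file
print(im.shape)  # torch.Size([3, 64, 64])
```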
Around 600 images.
A couple of things. Move the image reading to the dataset's __getitem__ method, just like the code in the repo. The load_images method should simply collect the filenames and nothing else. You can do the cropping and resizing in __getitem__ as well.
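A minimal sketch of that split, loosely mirroring the repo's dataset class; the class name, extension handling, and the [-1, 1] scaling are assumptions based on this thread, not the exact repo code:

```python
import glob
import os

from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms


class CustomImageDataset(Dataset):
    def __init__(self, im_path, im_ext='jpg'):
        # "load_images" equivalent: collect filenames only, no decoding here.
        self.images = sorted(glob.glob(os.path.join(im_path, '*.' + im_ext)))
        self.transform = transforms.Compose([
            transforms.CenterCrop(2448),   # square crop first
            transforms.Resize((64, 64)),   # then downsample
            transforms.ToTensor(),
        ])

    def __len__(self):
        return len(self.images)

    def __getitem__(self, index):
        # All the heavy work (read, crop, resize) happens lazily, per item.
        im = Image.open(self.images[index]).convert('RGB')
        im = self.transform(im)
        return 2 * im - 1  # scale to [-1, 1], matching typical DDPM training
```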
Okay, so first of all I should leave the loader function as it is, just modifying it for the jpg images, and secondly I should do the image formatting in the __getitem__ function.
How do you suggest I fix this? torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 4.75 GiB. GPU 0 has a total capacity of 14.75 GiB of which 57.06 MiB is free. Process 15150 has 14.69 GiB memory in use. Of the allocated memory 11.21 GiB is allocated by PyTorch, and 3.35 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
That's something you would have to experiment with and tune, but I am assuming you are limited by compute anyway and a batch size of 4 is the max you can go?
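As an aside, the allocator hint from the error message above can be applied by setting an environment variable before CUDA is initialized; one minimal way to do that from Python (setting it in the shell before launching also works):

```python
import os

# From the error message: set the allocator hint before the first CUDA
# allocation, i.e. before torch initializes CUDA.
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'expandable_segments:True'

import torch  # noqa: E402  (imported after the env var is set)
```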
Yes, 4 is the highest I can go. I will obviously need help tuning the model parameters for better results, so I will keep you updated. Thank you so much for the help in the meantime.
Is there any way I can save my progress while training? What I want to do is, say, train up to 130 epochs, stop my training, and then continue training from epoch 130 again.
I would also like to mention that, as of now, my training time per epoch is 40s, so training 100 epochs took me a little over an hour. I remember in your YouTube video you trained your model for around 3 hours over 60 epochs, so my dataset might also be a limiting factor here (257 images after the split).
-> Is there any way I can save my progress while training? What I want to do is, say, train up to 130 epochs, stop my training, and then continue training from epoch 130 again.

So long as you keep the saved .pth file, yes: the training script saves a checkpoint after every epoch, and if that checkpoint exists at the expected path it is loaded before training starts, so you can stop and resume from where you left off.
Are the checkpoints saved in the .pth file? So I would have to transfer that file back before starting training, then.
Yes, download the .pth file after one iteration of training, and put it back in the necessary path before starting the second iteration.
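Putting the two comments above together, a minimal save/resume sketch; the model, checkpoint path, and epoch count below are placeholders, not the repo's exact code:

```python
import os

import torch
import torch.nn as nn

# Placeholders: swap in the repo's UNet, checkpoint path, and epoch count.
model = nn.Linear(4, 4)
ckpt_path = os.path.join('default', 'ddpm_ckpt.pth')
num_epochs = 130

# Resume: if a checkpoint from an earlier run exists, load it before training.
if os.path.exists(ckpt_path):
    model.load_state_dict(torch.load(ckpt_path))

os.makedirs(os.path.dirname(ckpt_path), exist_ok=True)
for epoch in range(num_epochs):
    # ... run one epoch of training here ...
    torch.save(model.state_dict(), ckpt_path)  # overwrite after every epoch
```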
Okay, I will work on the suggestions. In your opinion, what is the better alternative?
I would say a larger dataset is beneficial, and with cropping you can further increase the number of distinct images the diffusion model gets to see during training.
These are my images after crop and resize in the data loader function; are these okay? I think they have lost a lot of quality. Also, I do have a larger dataset of around 45k images, but the images in it are too inconsistent in every way: size, quality.
For the 45K images, my guess is that the center crop and resize to 64x64 should handle the inconsistencies in size (I don't know how inconsistent they are in quality). But if you have 45K images, I would say why not try with that, just to see how good an output quality you get from DDPM.
Yes, that is the goal, right? But training on those images will take a considerable amount of time, and if the output is still noisy, all the time spent will have been wasted. Also, are the black outputs normal? Currently I am saving my progress and increasing epochs on the 257 images, but at 130 epochs the final outputs were mostly black. Don't they indicate that the final noise-free images will be black, or is this part of the process?
Yes, diffusion models require a decent amount of data to train, so unless you throw in more compute power, I don't see any other way to reduce the time.
Great. I don't think that assumption is correct; once your model has converged (it looks like that point may be somewhere around 1000 epochs), it will not produce these noisy images at all. And with 45K images, after around 200 epochs you will most likely be able to see decent outputs (obviously you will have to experiment to verify that). For higher resolution, you can try pre-trained super-resolution models (just searching the web should give you some models, and you can test them to see how good a result you get).
Hello, it's me again.
Hello :) Yes, the diffusion model can be conditioned on class (using class embeddings) or text (using cross-attention), but this repo does not have that; this one only allows you to generate images unconditionally.
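For anyone curious, class conditioning is often implemented by adding a learned class embedding to the timestep embedding; a hedged sketch of the idea (none of this exists in this repo, and all names are made up):

```python
import torch
import torch.nn as nn

# Illustrative only: condition the denoiser by adding a learned class
# embedding to the timestep embedding before it enters the UNet blocks.
num_classes, t_emb_dim = 10, 128
class_emb = nn.Embedding(num_classes, t_emb_dim)

t_emb = torch.randn(4, t_emb_dim)             # stand-in timestep embedding
labels = torch.randint(0, num_classes, (4,))  # class labels for the batch
t_emb = t_emb + class_emb(labels)             # conditioned embedding
```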
@dangdinh17 Can you tell me what error you are facing? Out of memory?
Yeah, can you try 64x64? In the config, set im_size to 64, and in the dataset class's __getitem__ method, resize the image to 64x64.
I have tried 64x64 and it worked, but I want to train at 128x128 for my study. Let me introduce my study; can you give me some essential suggestions, please?
If your images are from a dataset (or match a dataset) that has a super-resolution model available, then you can train DDPM on 64x64 images and use that super-resolution model checkpoint to get 128x128 images. Or you can train a super-resolution model yourself.
Oh yes, I see, thank you so much.
I have another question: if my data involves motion blur or exposure blur, does it work if I only use the original code? Or must I train the model with added degradations, like motion blur and light blur, rather than only Gaussian noise?
@dangdinh17, apologies for the late reply, but I have responded to your issue on the Stable Diffusion repo. Do take a look at the repo mentioned in that reply (explainingai-code/StableDiffusion-PyTorch#21 (comment)); I think that implementation does exactly what you need.
@dangdinh17 Hey, just don't allow the network to compute attention at resolution 128, only at lower resolutions. This should solve your memory issues.
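A hedged illustration of that idea: gate the self-attention layer per block so it only runs once the feature maps are small. The class and layer names here are made up for this sketch, not the repo's:

```python
import torch.nn as nn

# Illustrative down block whose self-attention can be disabled at high
# resolutions (e.g. 128x128) to save memory.
class DownBlock(nn.Module):
    def __init__(self, channels, use_attn):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.use_attn = use_attn
        if use_attn:
            self.norm = nn.GroupNorm(8, channels)
            self.attn = nn.MultiheadAttention(channels, num_heads=4,
                                              batch_first=True)

    def forward(self, x):
        x = self.conv(x)
        if self.use_attn:
            b, c, h, w = x.shape
            # Flatten spatial dims to a sequence of h*w tokens of size c.
            seq = self.norm(x).reshape(b, c, h * w).transpose(1, 2)
            attn_out, _ = self.attn(seq, seq, seq)
            x = x + attn_out.transpose(1, 2).reshape(b, c, h, w)
        return x

# Skip attention at 128x128, enable it only at the lower resolutions.
blocks = [DownBlock(64, use_attn=False),  # 128x128: no attention
          DownBlock(64, use_attn=True),   # 64x64
          DownBlock(64, use_attn=True)]   # 32x32
```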
Thanks for the awesome explanation. Could you tell me which changes we need to make before training the model on our own data?