I am trying to train an unconditional diffusion model on grayscale images using your pipeline. Running training with the default parameters, I found that the inferred images contained colour (specifically green). I don't know where it learnt these colours from, but I suspect the issue lies in the initial processing of the image set:
images = [augmentations(image.convert("RGB")) for image in examples["image"]]
so I created a fork of this repo and changed this line to:
images = [augmentations(image.convert("L")) for image in examples["image"]]
I also updated the model configuration (UNet2DModel) to work with single-channel inputs and outputs by setting in_channels=1 and out_channels=1 when initialising the model.
Am I on the right track, or does the solution lie elsewhere? I also noticed that the quality of the inferred images is very poor, not on par with the training set. Which parameters can I adjust to improve this? Ultimately I am interested in a diffusion model that focuses more on the textural composition of images than on colour.