I am trying to train an unconditional diffusion model on grayscale images using your pipeline. Running training with the default parameters, I found that the inferred images contained colour (specifically green). I don't know where it learnt these colours from, but I suspect the issue lies in the initial processing of the image set:
images = [augmentations(image.convert("RGB")) for image in examples["image"]]
so I created a fork of this repo and changed this line to:
images = [augmentations(image.convert("L")) for image in examples["image"]]
I also updated the model configuration (UNet2DModel) to work with single-channel inputs and outputs by setting in_channels=1 and out_channels=1 when initialising the model.
Am I on the right track, or does the solution lie elsewhere? I also noticed that the quality of the inferred images is very poor, not on par with the training set. Which parameters can I adjust to improve this? Ultimately I am interested in a diffusion model that focuses more on the textural composition of images than on colour.