How to condition based on multiple features? #14

Open

vinodrajendran001 opened this issue May 14, 2024 · 2 comments

@vinodrajendran001
I would like to condition the model on multiple features. In my case I have a lot of columns, say A, B, C and D; some of the columns are categorical and some are numerical. I want to implement stable diffusion conditioned on all the columns together. Please advise what modifications I need to make.

Thanks.

@explainingai-code
Owner

Hello @vinodrajendran001,

I think, based on your requirements, you can use class embeddings for the categorical features and timestep-style embeddings for the numerical ones.

Let's go through the categorical conditions first. Assume you have two categorical variables, A and B, each with 3 classes (A1/A2/A3 and B1/B2/B3).
You can use two separate class embeddings, one for each variable, here (https://github.com/explainingai-code/StableDiffusion-PyTorch/blob/main/models/unet_cond_base.py#L61), say class_a_emb and class_b_emb, and then, based on your input, add the conditioning embedding values to the timestep embedding here (https://github.com/explainingai-code/StableDiffusion-PyTorch/blob/main/models/unet_cond_base.py#L155). So for this case, rather than adding just class_emb, you would add both the a_emb and b_emb values.
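A minimal sketch of that first option, with hypothetical names (class_a_emb, class_b_emb, a_idx, b_idx) and an embedding dimension that matches the UNet's timestep embedding; the repo's actual variable names and layer setup may differ:

```python
import torch
import torch.nn as nn

t_emb_dim = 512  # must match the UNet's timestep embedding dimension

# One embedding table per categorical variable (3 classes each)
class_a_emb = nn.Embedding(3, t_emb_dim)
class_b_emb = nn.Embedding(3, t_emb_dim)

def add_categorical_cond(t_emb, a_idx, b_idx):
    # t_emb: (B, t_emb_dim) timestep embedding
    # a_idx, b_idx: (B,) LongTensors with class indices for A and B
    return t_emb + class_a_emb(a_idx) + class_b_emb(b_idx)
```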

You could also combine the classes (assuming every data point has values for both variables). Then you have just one class embedding, but its entries correspond to the combinations (A1B1, A1B2, A1B3, ..., A3B1, A3B2, A3B3). This makes the changes simpler (in fact I think everything will work out of the box and no modification is needed), but it assumes you have values for both variables for all training data, and during generation you would not be able to generate, say, an image conditioned on A1 alone.
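A sketch of the combined-class variant, again with hypothetical names; the only extra piece is mapping the pair of class indices to a single combined index:

```python
import torch
import torch.nn as nn

t_emb_dim = 512
num_a, num_b = 3, 3  # classes per categorical variable

# Single embedding table over all 3 x 3 = 9 combinations (A1B1 ... A3B3)
combined_class_emb = nn.Embedding(num_a * num_b, t_emb_dim)

def combined_index(a_idx, b_idx):
    # Map (a, b) class-index pairs to a single index in [0, num_a * num_b)
    return a_idx * num_b + b_idx

# t_emb = t_emb + combined_class_emb(combined_index(a_idx, b_idx))
```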

For numerical conditions, you can convert the values to positional embeddings using (https://github.com/explainingai-code/StableDiffusion-PyTorch/blob/main/models/blocks.py#L5) and then use something like the timestep embedding projection (https://github.com/explainingai-code/StableDiffusion-PyTorch/blob/main/models/unet_cond_base.py#L148C9-L149C35), with a separate t_proj-style layer for the numerical conditioning field rather than reusing the timestep one. Then, just like the class embedding, add the result to the timestep embedding. Though if your numerical field values do not cover the entire range from min to max, it might make more sense to bin them and convert them to classes as well.
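A self-contained sketch of the numerical path. The sinusoidal helper below is written out inline so the example runs on its own; the repo's own embedding helper in blocks.py may have a different signature, and num_proj is a hypothetical extra projection, separate from the existing t_proj:

```python
import torch
import torch.nn as nn

t_emb_dim = 512  # must be even for the sin/cos split below

def sinusoidal_embedding(values, emb_dim):
    # values: (B,) float tensor holding the numerical condition
    half = emb_dim // 2
    freqs = torch.exp(
        -torch.log(torch.tensor(10000.0)) * torch.arange(half, dtype=torch.float32) / half
    )
    args = values.float()[:, None] * freqs[None, :]
    return torch.cat([torch.sin(args), torch.cos(args)], dim=-1)  # (B, emb_dim)

# Separate projection MLP for the numerical condition (timesteps keep their own t_proj)
num_proj = nn.Sequential(
    nn.Linear(t_emb_dim, t_emb_dim),
    nn.SiLU(),
    nn.Linear(t_emb_dim, t_emb_dim),
)

# numerical_emb = num_proj(sinusoidal_embedding(numeric_vals, t_emb_dim))
```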

So, assuming one additional numerical condition and the two categorical conditions above, you would ultimately do this:

time_step_emb = time_step_emb + class_emb + numerical_emb

This combined time_step_emb is then passed to the down blocks and up blocks here (https://github.com/explainingai-code/StableDiffusion-PyTorch/blob/main/models/unet_cond_base.py#L167).
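Putting the pieces together, the conditioning part of the forward pass would look roughly like this (hypothetical names, reusing the sketches above, not the repo's exact code):

```python
# time_step_emb: timestep embedding after the existing t_proj, shape (B, t_emb_dim)
# a_idx, b_idx: (B,) class indices; numeric_vals: (B,) float condition values
class_emb = class_a_emb(a_idx) + class_b_emb(b_idx)
numerical_emb = num_proj(sinusoidal_embedding(numeric_vals, t_emb_dim))
time_step_emb = time_step_emb + class_emb + numerical_emb

# time_step_emb then flows into the down/mid/up blocks exactly as the plain
# timestep embedding does today.
```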

@vinodrajendran001
Author

That's a good idea. Let me try to translate your inputs into code and experiment.

Thanks :-)
