
Adds sdxl's VAE decoder implementation #653

Open. IanNod wants to merge 4 commits into main.
Conversation

IanNod (Contributor) commented Dec 6, 2024

No description provided.

```python
    subfolder="vae",
)

def decode(self, inp):
```
Contributor:

You have a scaling factor parameter. You should use that instead of the constant if possible. Otherwise, maybe document why it is this value?

IanNod (Contributor, Author):

This is the diffusers reference code, so it does not use our HParams at all. They also do not do the scaling inside the VAE decode but in their pipeline, as input, here: https://github.com/huggingface/diffusers/blob/18f9b990884883533491fc87f303e7305dc27d75/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py#L591. I added it to our implementation because forcing shortfin to handle it in the pipeline would introduce extra host/device round trips, which would impact performance.

I will add comments to clarify here.

monorimet (Contributor) commented Dec 6, 2024:

You can probably replace `img = 1 / 0.13025 * inp` with `img = img / self.vae.config.scaling_factor` (source).

This is done differently in flux/sd3, where we have a shift factor:

`img = img / self.ae.config.scaling_factor + self.ae.config.shift_factor`
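For illustration, a minimal sketch of the suggested change, assuming a diffusers-style AutoencoderKL whose config carries `scaling_factor`; the wrapper class here is hypothetical:

```python
import torch

class VaeDecodeWrapper(torch.nn.Module):
    # Hypothetical wrapper; `vae` is assumed to be a diffusers AutoencoderKL.
    def __init__(self, vae):
        super().__init__()
        self.vae = vae

    def decode(self, inp: torch.Tensor) -> torch.Tensor:
        # Undo the latent scaling inside decode rather than in the pipeline,
        # avoiding extra host/device round trips as discussed above. For
        # SDXL, vae.config.scaling_factor == 0.13025, so this matches the
        # hard-coded `1 / 0.13025 * inp` while tracking the config.
        img = inp / self.vae.config.scaling_factor
        return self.vae.decode(img, return_dict=False)[0]
```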



```python
def get_random_inputs(dtype, device, bs: int = 2):
    torch.random.manual_seed(42)
```
Contributor:

If someone calls this function multiple times with the same batch size, I think they get the exact same tensor, which does not sound like desirable behavior. I would leave seed setting to callers, like main/test scripts.

IanNod (Contributor, Author):

Good catch. Left over from some debugging a while ago. Will remove.
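A sketch of the agreed-upon fix, with seeding left to the caller (the tensor shape below is an assumption for illustration):

```python
import torch

def get_random_inputs(dtype, device, bs: int = 2):
    # No manual_seed here; reproducibility is the caller's responsibility.
    # Shape is illustrative only: a 4-channel 128x128 latent per batch entry.
    return torch.rand(bs, 4, 128, 128, dtype=dtype, device=device)

# In a main/test script that needs determinism:
torch.random.manual_seed(42)
inputs = get_random_inputs(dtype=torch.float32, device="cpu", bs=2)
```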

```diff
@@ -60,6 +60,7 @@
     "unflatten",
     "unshard",
     "unsqueeze",
+    "squeeze",
```
Contributor:

Let's keep this alphabetized.


```python
# there is always at least one resnet
if resnet_time_scale_shift == "spatial":
    # TODO
```
Contributor:

Could you fill out the TODO?
Also, it seems like you might want your else branch to be an elif where you verify you're getting what you expect.

IanNod (Contributor, Author):

Each layer always has a resnet, specifically either ResnetBlockCondNorm2d for a spatial time scale shift or ResnetBlock2D otherwise, so no extra check is needed.
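A simplified sketch of the dispatch being described, assuming a recent diffusers where both block classes live in `diffusers.models.resnet` (illustrative, not the PR's exact code):

```python
from diffusers.models.resnet import ResnetBlock2D, ResnetBlockCondNorm2D

def select_resnet_cls(resnet_time_scale_shift: str):
    # There is always at least one resnet per layer; the time-scale-shift
    # mode only selects which resnet class gets instantiated.
    if resnet_time_scale_shift == "spatial":
        return ResnetBlockCondNorm2D  # conditional (spatial) norm variant
    return ResnetBlock2D  # standard resnet block
```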

sharktank/sharktank/models/vae/layers.py (comment resolved)
```python
else:
    inputs = get_random_inputs(dtype=dtype, device=device, bs=args.bs)

if args.export:
```
Contributor:

When this branch is reached, it looks like the VAE is not run, only exported. I would not expect such a thing of a file named run_vae.py. Could you at least add a TODO to move the export code to another file, or put in a print statement like "VAE exported. Skipping execution"?
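A sketch of the lighter-weight suggestion (the helper names here are hypothetical):

```python
if args.export:
    # TODO: move export code into its own script; run_vae.py should run the model.
    export_vae(model, inputs, args.output_path)  # hypothetical export helper
    print("VAE exported. Skipping execution.")
else:
    results = run_vae_eager(model, inputs)  # hypothetical execution helper
```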

```diff
@@ -393,6 +393,11 @@ def unsqueeze(self, dim: int) -> "AnyTensor":
 
         return unsqueeze(self, dim)
 
+    def squeeze(self, dim: Optional[int] = None) -> "AnyTensor":
```
Contributor:

Likewise, let's keep this alphabetized.

```diff
@@ -0,0 +1,261 @@
+# Copyright 2024 Advanced Micro Devices, Inc.
```
Contributor:

Just realized: this should probably be broken up and put in sharktank/sharktank/layers/.

IanNod (Contributor, Author):

Yeah, it should. I kept it separate following the punet layers, and was planning to clean up both in a follow-up PR.
