Hi author, your code is great, but when I use your SMT module for training I always run out of memory at the attn = (q @ k.transpose(-2, -1)) * self.scale statement when computing the attention, and even setting the batch size to 1 is not enough. Could you give me some ideas on how to modify it, please? I'm only using the stage 3 structure.
Hi, here are two simple things you can try (a rough sketch of both is below):
(1) reduce the number of channels (e.g. 256 -> 128)
(2) reduce the number of blocks (e.g. 12 -> 6)
Also, you should check whether the feature-map resolution at stage 3 is too high for your own task.
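A minimal sketch of these two knobs, assuming a hypothetical stage 3 built from generic attention blocks with dim and depth arguments (the real SMT config names may differ):

```python
import torch
import torch.nn as nn

class AttnBlock(nn.Module):
    """Plain multi-head self-attention block over flattened tokens (generic, not the exact SMT block)."""
    def __init__(self, dim, num_heads=8):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, x):            # x: (B, N, C)
        h = self.norm(x)
        out, _ = self.attn(h, h, h)  # attention map memory grows with N*N
        return x + out

def build_stage3(dim=256, depth=12):
    # (1) fewer channels: dim 256 -> 128
    # (2) fewer blocks:   depth 12 -> 6
    return nn.Sequential(*[AttnBlock(dim) for _ in range(depth)])

stage3_small = build_stage3(dim=128, depth=6)
```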
Thank you for your answer, it is indeed a resolution problem. My input image resolution is 128×128, so when computing the attention N = H*W = 16384, which is far too big. I would like to ask why your attention calculation has to reshape the input x from (B, C, H, W) to (B, N, C)? This takes up so much memory when computing the attention.
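For reference, a back-of-the-envelope sketch (assuming fp32 and a generic ViT-style attention with 8 heads, not the exact SMT code) of why the (B, N, C) flattening becomes expensive at this resolution:

```python
import torch

B, C, H, W = 1, 256, 128, 128
num_heads = 8
N = H * W                              # 16384 tokens after flattening

x = torch.randn(B, C, H, W)
tokens = x.flatten(2).transpose(1, 2)  # (B, N, C): the reshape asked about

# q @ k^T produces an attention map of shape (B, num_heads, N, N)
attn_elems = B * num_heads * N * N
print(f"attention map: {attn_elems * 4 / 1024**3:.1f} GiB in fp32")
# -> ~8 GiB for a single 128x128 image, before softmax and backward buffers
```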