[Bug] FormatShape bug for optical flow #2630

makecent · 2023-08-08T02:39:48Z

Branch

main branch (1.x version, such as v1.0.0, or dev-1.x branch)

Prerequisite

I have searched Issues and Discussions but cannot get the expected help.
I have read the documentation but cannot get the expected help.
The bug has not been fixed in the latest version.

Environment

N/A

Describe the bug

In 1.x, normalization has moved from the pipeline to the data preprocessor. As a result, a processing step for optical flow that used to live in the old Normalize transform is now missing.

In the older version, the Normalize transform stacks flow_x and flow_y:

mmaction2/mmaction/datasets/pipelines/augmentations.py

Lines 1207 to 1232 in 02d5d9b

    if modality == 'Flow':
        num_imgs = len(results['imgs'])
        assert num_imgs % 2 == 0
        assert self.mean.shape[0] == 2
        assert self.std.shape[0] == 2
        n = num_imgs // 2
        h, w = results['imgs'][0].shape
        x_flow = np.empty((n, h, w), dtype=np.float32)
        y_flow = np.empty((n, h, w), dtype=np.float32)
        for i in range(n):
            x_flow[i] = results['imgs'][2 * i]
            y_flow[i] = results['imgs'][2 * i + 1]
        x_flow = (x_flow - self.mean[0]) / self.std[0]
        y_flow = (y_flow - self.mean[1]) / self.std[1]
        if self.adjust_magnitude:
            x_flow = x_flow * results['scale_factor'][0]
            y_flow = y_flow * results['scale_factor'][1]
        imgs = np.stack([x_flow, y_flow], axis=-1)
        results['imgs'] = imgs
        args = dict(
            mean=self.mean,
            std=self.std,
            to_bgr=self.to_bgr,
            adjust_magnitude=self.adjust_magnitude)
        results['img_norm_cfg'] = args
        return results

In the 1.x version, this stacking step is lost because the Normalize transform is no longer used, which causes a dimension error in FormatShape:

  File "/home/louis/miniconda3/envs/mmengine/lib/python3.8/site-packages/mmaction/datasets/transforms/formatting.py", line 260, in transform
    imgs = np.transpose(imgs, (0, 1, 5, 2, 3, 4))
  File "<__array_function__ internals>", line 180, in transpose
  File "/home/louis/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 660, in transpose
    return _wrapfunc(a, 'transpose', axes)
  File "/home/louis/.local/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
ValueError: axes don't match array
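The failure can be reproduced outside mmaction2 with plain NumPy. The sketch below is only an illustration: `to_ncthw` and `stack_flow` are hypothetical helpers, and the reshape step before the transpose is an assumption about what FormatShape does for NCTHW (only the transpose call itself appears in the traceback). Unstacked flow frames are 2-D (H, W), so the reshaped array has 5 axes and the 6-axis transpose fails; stacking x and y on a trailing axis restores the sixth axis.

```python
import numpy as np

num_clips, clip_len, h, w = 1, 4, 8, 8

# Flow frames arrive interleaved as [x0, y0, x1, y1, ...], each of shape (H, W).
frames = [np.zeros((h, w), dtype=np.float32)
          for _ in range(2 * num_clips * clip_len)]


def to_ncthw(imgs, num_clips, clip_len):
    """Hypothetical stand-in for the NCTHW branch (reshape step is assumed)."""
    imgs = np.asarray(imgs)
    imgs = imgs.reshape((-1, num_clips, clip_len) + imgs.shape[1:])
    # Same call as in the traceback: expects a 6-D array ending in a channel axis.
    return np.transpose(imgs, (0, 1, 5, 2, 3, 4))


def stack_flow(frames):
    """Stack x and y flow on a trailing channel axis, like the old Normalize."""
    x_flow = np.stack(frames[0::2])  # (T, H, W)
    y_flow = np.stack(frames[1::2])  # (T, H, W)
    return np.stack([x_flow, y_flow], axis=-1)  # (T, H, W, 2)


# Without stacking: the array is only 5-D, so the transpose raises ValueError.
try:
    to_ncthw(frames, num_clips, clip_len)
    unstacked_error = None
except ValueError as exc:
    unstacked_error = str(exc)

# With stacking: the layout becomes (N_crops, N_clips, C=2, T, H, W).
stacked = to_ncthw(stack_flow(frames), num_clips, clip_len)
print(unstacked_error, stacked.shape)
```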

Reproduces the problem - code sample

No response

Reproduces the problem - command or script

No response

Reproduces the problem - error message

No response

Additional information

No response

Dai-Wenxun · 2023-08-11T09:16:21Z

To format optical flow, you need to set the input_format of FormatShape to NCHW_Flow, as shown here. That branch of the code stacks the optical flow along the last dimension, as in your #2631.

Dai-Wenxun · 2023-08-11T09:23:34Z

For the normalization of optical flow, I think we can implement it as follows:

clip_len = 5
format_shape = 'NCHW_Flow'

model = dict(
    type='Recognizer2D',
    backbone=dict(...),
    cls_head=dict(...),
    data_preprocessor=dict(
        type='ActionDataPreprocessor',
        mean=[128, 128] * clip_len,
        std=[128, 128] * clip_len,
        format_shape=format_shape))

train_pipeline = [
    dict(type='SampleFrames', clip_len=clip_len, frame_interval=1, num_clips=3),
    ...
    dict(type='FormatShape', input_format=format_shape),
    ...
]
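To see why the mean and std are repeated clip_len times, here is a rough NumPy sketch of per-channel normalization after time has been folded into the channel axis. It assumes the flow-specific layout ends up as 2 * clip_len channels ordered x0, y0, x1, y1, and so on; that ordering, the toy values, and the variable names are illustrative assumptions, not the actual preprocessor code.

```python
import numpy as np

clip_len, h, w = 5, 4, 4

# Hypothetical clip: clip_len flow frames with (x, y) stacked on the last axis.
flow = np.full((clip_len, h, w, 2), 128.0, dtype=np.float32)
flow[..., 1] += 64.0  # make the y component distinguishable from x

# Fold time into channels: (T, H, W, 2) -> (T, 2, H, W) -> (2*T, H, W).
chw = flow.transpose(0, 3, 1, 2).reshape(2 * clip_len, h, w)

# One (mean, std) entry per channel, hence the repetition by clip_len.
mean = np.array([128, 128] * clip_len, dtype=np.float32).reshape(-1, 1, 1)
std = np.array([128, 128] * clip_len, dtype=np.float32).reshape(-1, 1, 1)

normalized = (chw - mean) / std
print(normalized.shape)
```

Even channels hold the x components and odd channels the y components, so a two-value pattern repeated clip_len times lines up with the folded channel axis.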

Dai-Wenxun · 2023-08-11T09:26:57Z

Since NCHW_Flow is not yet defined in ActionDataPreprocessor, could you please help us implement it in mmaction2? Its behavior should be equivalent to that of NCHW.

Dai-Wenxun · 2023-08-11T09:29:31Z

Of course, if you have a better idea for processing optical flow, feel free to let me know. Thank you, bro!

makecent · 2023-08-11T12:49:18Z

@Dai-Wenxun I am a little confused about NCHW_Flow: why does it not contain a T dimension? In my understanding, optical flow should use the same format as RGB frames, i.e., NCTHW, with C=3 for RGB and C=2 for flow.

As for a better idea, I think my PR #2631 is simple and effective. To work with optical flow, we can simply set the FormatShape input_format to NCTHW, as for RGB, and for normalization we can set a 2-element mean=[x, x] and std=[x, x]. I have tested my PR with optical flow and it worked.
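A minimal sketch of that idea in plain NumPy, assuming a single flow clip already laid out as (C, T, H, W) with C=2. The 128 values and the array contents are placeholders chosen for illustration; this is not the code from the PR.

```python
import numpy as np

clip_len, h, w = 4, 3, 3

# Hypothetical flow clip in (C, T, H, W) layout: channel 0 is x, channel 1 is y.
clip = np.empty((2, clip_len, h, w), dtype=np.float32)
clip[0] = 100.0
clip[1] = 156.0

# A 2-element mean/std broadcasts over T, H and W once reshaped to (C, 1, 1, 1).
mean = np.array([128.0, 128.0], dtype=np.float32).reshape(-1, 1, 1, 1)
std = np.array([128.0, 128.0], dtype=np.float32).reshape(-1, 1, 1, 1)

normalized = (clip - mean) / std
print(normalized.shape)
```

With this layout the mean and std need only one entry per flow component, independent of clip_len.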

mm-assistant bot assigned Dai-Wenxun Aug 8, 2023

makecent mentioned this issue Aug 8, 2023

[Enhance] Support 2D&3D Optical Flow Training #2631

Merged

Dai-Wenxun closed this as completed Aug 15, 2023
