
KeyError: 'RotatedFCOS: "CPMHead: \'vis_train_duration\'"' #1

Open
douling843 opened this issue Oct 10, 2024 · 13 comments

@douling843

I followed the installation tutorial to run the project, but I hit the error below. What should I do about it?

2024-10-10 14:02:38,142 - mmrotate - INFO - Set random seed to 1029784185, deterministic: False
Traceback (most recent call last):
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
    return obj_cls(**args)
  File "/workspace/PointOBB-v2/mmrotate/models/dense_heads/cpm_head.py", line 89, in __init__
    if kwargs.get('train_cfg')['vis_train_duration'] is not None:
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/utils/config.py", line 40, in __missing__
    raise KeyError(name)
KeyError: 'vis_train_duration'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/utils/registry.py", line 69, in build_from_cfg
    return obj_cls(**args)
  File "/workspace/PointOBB-v2/mmrotate/models/detectors/rotated_fcos.py", line 21, in __init__
    super(RotatedFCOS, self).__init__(backbone, neck, bbox_head, train_cfg,
  File "/workspace/PointOBB-v2/mmrotate/models/detectors/single_stage.py", line 35, in __init__
    self.bbox_head = build_head(bbox_head)
  File "/workspace/PointOBB-v2/mmrotate/models/builder.py", line 37, in build_head
    return ROTATED_HEADS.build(cfg)
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/utils/registry.py", line 237, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/utils/registry.py", line 72, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: "CPMHead: 'vis_train_duration'"

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "tools/train.py", line 196, in <module>
    main()
  File "tools/train.py", line 165, in main
    model = build_detector(
  File "/workspace/PointOBB-v2/mmrotate/models/builder.py", line 55, in build_detector
    return ROTATED_DETECTORS.build(
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/utils/registry.py", line 237, in build
    return self.build_func(*args, **kwargs, registry=self)
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/cnn/builder.py", line 27, in build_model_from_cfg
    return build_from_cfg(cfg, registry, default_args)
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/utils/registry.py", line 72, in build_from_cfg
    raise type(e)(f'{obj_cls.__name__}: {e}')
KeyError: 'RotatedFCOS: "CPMHead: \'vis_train_duration\'"'
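For context: as the traceback shows, mmcv's ConfigDict raises KeyError for missing keys (via __missing__) instead of returning None, so kwargs.get('train_cfg')['vis_train_duration'] fails whenever the config omits vis_train_duration. A minimal sketch of the kind of guard that avoids this (a paraphrase of cpm_head.py, not the actual committed fix; the attribute name is hypothetical):

    # Sketch only, not the committed fix: tolerate a missing
    # 'vis_train_duration' key instead of indexing it directly.
    # dict.get() bypasses ConfigDict.__missing__, so no KeyError is raised.
    train_cfg = kwargs.get('train_cfg') or {}
    vis_train_duration = train_cfg.get('vis_train_duration')
    if vis_train_duration is not None:
        self.vis_train_duration = vis_train_duration  # hypothetical attribute name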

taugeren added a commit that referenced this issue Oct 11, 2024
Fixes issue #1: config key error
@taugeren (Collaborator)

Thank you, I have already fixed it. You can set visualize=True in train_cfg to visualize the CPM during training, and you need to set store_dir to a specific directory.
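In config terms that corresponds to a train_cfg block like the following sketch (key names are those that appear in the config later in this thread; the directory path is a placeholder):

    train_cfg=dict(
        visualize=True,                     # enable CPM visualization during training
        store_dir='/absolute/path/to/vis',  # placeholder: set to a real, writable directory
        cls_weight=1.0,
        thresh1=6,
        alpha=1.5)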

@douling843 (Author)

Thank you for your reply. After your fix this issue is resolved, but a new problem has arisen. Since I have only one target category (ship), I made the following changes:
in train_cpm_ssdd.py ==========>

_base_ = [
    '../_base_/datasets/ssdd.py', '../_base_/schedules/schedule_1x.py',
    '../_base_/default_runtime.py'
]

data_root = 'data/ssdd/'

classes = ('ship', )

in cpm_head.py ============>
def get_mask_image(self, max_probs, max_indices, thr, num_width):
    # One color per class; single-class (ship) setup.
    PALETTE = [
        (0, 255, 0),
    ]
    # Builds a square num_width x num_width mask, white by default.
    mask_image = np.ones((num_width, num_width, 3), dtype=np.uint8) * 255
    for i in range(num_width):
        for j in range(num_width):
            if max_probs[i, j] > thr:
                mask_image[i, j] = PALETTE[max_indices[i, j]]
    return mask_image
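A quick way to surface the assumption baked into this loop is to check the score-map shape before drawing; a purely illustrative check (not in the repo), using the names from the snippet above:

    # Illustrative: the loop assumes the score map is square with side
    # num_width, which the IndexError below shows is not always true.
    assert max_probs.shape[0] == max_probs.shape[1] == num_width, (
        f'score map is {tuple(max_probs.shape)}, expected ({num_width}, {num_width})')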

============================================
So the following error occurred:

2024-10-11 08:22:54,412 - mmrotate - INFO - workflow: [('train', 1)], max: 6 epochs
2024-10-11 08:22:54,412 - mmrotate - INFO - Checkpoints will be saved to /workspace/PointOBB-v2/work_dirs/cpm_ssdd by HardDiskBackend.
/opt/conda/envs/POBBv2/lib/python3.8/site-packages/torch/functional.py:445: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ../aten/src/ATen/native/TensorShape.cpp:2157.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
Traceback (most recent call last):
  File "tools/train.py", line 196, in <module>
    main()
  File "tools/train.py", line 185, in main
    train_detector(
  File "/workspace/PointOBB-v2/mmrotate/apis/train.py", line 144, in train_detector
    runner.run(data_loaders, cfg.workflow)
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 136, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 53, in train
    self.run_iter(data_batch, train_mode=True, **kwargs)
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/runner/epoch_based_runner.py", line 31, in run_iter
    outputs = self.model.train_step(data_batch, self.optimizer,
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/parallel/data_parallel.py", line 77, in train_step
    return self.module.train_step(*inputs[0], **kwargs[0])
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 248, in train_step
    losses = self(**data)
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1102, in _call_impl
    return forward_call(*input, **kwargs)
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 119, in new_func
    return old_func(*args, **kwargs)
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmdet/models/detectors/base.py", line 172, in forward
    return self.forward_train(img, img_metas, **kwargs)
  File "/workspace/PointOBB-v2/mmrotate/models/detectors/single_stage.py", line 81, in forward_train
    losses = self.bbox_head.forward_train(x, img_metas, gt_bboxes,
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmdet/models/dense_heads/base_dense_head.py", line 335, in forward_train
    losses = self.loss(*loss_inputs, gt_bboxes_ignore=gt_bboxes_ignore)
  File "/opt/conda/envs/POBBv2/lib/python3.8/site-packages/mmcv/runner/fp16_utils.py", line 208, in new_func
    return old_func(*args, **kwargs)
  File "/workspace/PointOBB-v2/mmrotate/models/dense_heads/cpm_head.py", line 196, in loss
    self.draw_image(img_metas[0]['filename'], img_metas[0]['flip_direction'], cls_scores[0][0].sigmoid())
  File "/workspace/PointOBB-v2/mmrotate/models/dense_heads/cpm_head.py", line 148, in draw_image
    conbine_image = self._draw_image(max_probs, max_indices, thr, flip, img_A, num_width)
  File "/workspace/PointOBB-v2/mmrotate/models/dense_heads/cpm_head.py", line 121, in _draw_image
    mask_image = self.get_mask_image(max_probs, max_indices, thr, num_width)
  File "/workspace/PointOBB-v2/mmrotate/models/dense_heads/cpm_head.py", line 116, in get_mask_image
    if max_probs[i, j] > thr:
IndexError: index 120 is out of bounds for dimension 0 with size 120

Please, how can I fix it?

@taugeren (Collaborator)

May I have a look at your config?

@douling843 (Author)

_base_ = [
    '../_base_/datasets/ssdd.py', '../_base_/schedules/schedule_1x.py',
    '../_base_/default_runtime.py'
]

data_root = 'data/ssdd/'  ################################

store_dir = 'PointOBB-v2/your_absolute_vis_dir_ssdd/visualize'  ########################################## PointOBB-v2/your_absolute_vis_dir_ssdd

angle_version = 'le90'

classes = ('ship', )  #####################################################

img_norm_cfg = dict(
    mean=[123.675, 116.28, 103.53], std=[58.395, 57.12, 57.375], to_rgb=True)
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='RResize', img_scale=(608, 608)),  ################ dict(type='RResize', img_scale=(1024, 1024)),
    dict(
        type='RRandomFlip',
        flip_ratio=[0.25, 0.25, 0.25],
        direction=['horizontal', 'vertical', 'diagonal'],
        version=angle_version),
    dict(type='Normalize', **img_norm_cfg),
    dict(type='Pad', size_divisor=32),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels'])
]

# data = dict(
#     train=dict(pipeline=train_pipeline, version=angle_version),
#     val=dict(version=angle_version),
#     test=dict(version=angle_version))

data = dict(
    train=dict(
        pipeline=train_pipeline,
        ann_file=data_root + 'train/labelTxt/',
        img_prefix=data_root + 'train/images/',
        version=angle_version,
        classes=classes),
    val=dict(
        ann_file=data_root + 'test/all/labelTxt/',
        img_prefix=data_root + 'test/all/images/',
        version=angle_version,
        classes=classes),
    test=dict(
        ann_file=data_root + 'test/all/labelTxt/',
        img_prefix=data_root + 'test/all/images/',
        version=angle_version,
        classes=classes,
        samples_per_gpu=4))

# model settings
model = dict(
    type='RotatedFCOS',
    backbone=dict(
        type='ResNet',
        depth=50,
        num_stages=4,
        out_indices=(0, 1, 2, 3),
        frozen_stages=1,
        zero_init_residual=False,
        norm_cfg=dict(type='BN', requires_grad=True),
        norm_eval=True,
        style='pytorch',
        init_cfg=dict(type='Pretrained', checkpoint='torchvision://resnet50')),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256,
        start_level=0,
        add_extra_convs='on_output',  # use P5
        num_outs=6,
        relu_before_extra_convs=True),
    # store_dir='rotated_fcos_r50_fpn_1x_dota_le90_2',
    bbox_head=dict(
        type='CPMHead',
        num_classes=len(classes),
        in_channels=256,
        stacked_convs=4,
        feat_channels=256,
        regress_ranges=((-1, 32), (32, 64), (64, 128), (128, 256), (256, 512),
                        (512, 1e8)),
        strides=[4, 8, 16, 32, 64, 128],
        center_sampling=True,
        center_sample_radius=1.5,
        norm_on_bbox=True,
        centerness_on_reg=True,
        separate_angle=False,
        scale_angle=True,
        bbox_coder=dict(
            type='DistanceAnglePointCoder', angle_version=angle_version),
        loss_cls=dict(
            type='FocalLoss',
            use_sigmoid=True,
            gamma=2.0,
            alpha=0.25,
            loss_weight=1.0),
        loss_bbox=dict(type='RotatedIoULoss', loss_weight=1.0),
        loss_centerness=dict(
            type='CrossEntropyLoss', use_sigmoid=True, loss_weight=1.0)),
    # training and testing settings
    train_cfg=dict(
        visualize=True,  ####################################################### visualize=False,
        store_dir=store_dir,
        cls_weight=1.0,
        thresh1=6,
        alpha=1.5),
    test_cfg=dict(
        store_dir=store_dir,
        nms_pre=2000,
        min_bbox_size=0,
        score_thr=0.05,
        nms=dict(iou_thr=0.1),
        max_per_img=2000))
find_unused_parameters = True
runner = dict(_delete_=True, type='EpochBasedRunner', max_epochs=6)
lr_config = dict(
    _delete_=True,
    policy='step',
    warmup='linear',
    warmup_iters=500,
    warmup_ratio=1.0 / 3,
    step=[4])
evaluation = dict(interval=6, metric='mAP')
optimizer = dict(lr=0.025)  ##########################################

@taugeren (Collaborator)

Maybe you can try setting dict(type='RResize', img_scale=(1024, 1024)) in train_pipeline.

@douling843 (Author)

> Maybe you can try setting dict(type='RResize', img_scale=(1024, 1024)) in train_pipeline.

I gradually reduced the image size, and with img_scale=(64, 64) the code runs, but it is not appropriate to make the images that small, and we get AP=0.000. Since my dataset has only one category (ship), unlike the DOTA dataset, I suspect some parameter needs to be changed, but I don't know exactly which one.

@taugeren (Collaborator)

Did img_scale=(1024, 1024) work? It is normal to get AP=0.000 at the CPM training stage; you also need to generate pseudo labels and train a detector afterwards.

@douling843 (Author)

douling843 commented Oct 11, 2024

> Did img_scale=(1024, 1024) work? It is normal to get AP=0.000 at the CPM training stage; you also need to generate pseudo labels and train a detector afterwards.

With img_scale=(1024, 1024): IndexError: index 184 is out of bounds for dimension 0 with size 184.

img_scale=(1024, 1024) doesn't work. Going down through (1024, 1024) -> (512, 512) -> (256, 256) -> (128, 128) -> (64, 64), only (64, 64) runs.

Isn't training the CPM the first step? That is the step where this error occurs.

@taugeren (Collaborator)

> With img_scale=(1024, 1024): IndexError: index 184 is out of bounds for dimension 0 with size 184.
>
> img_scale=(1024, 1024) doesn't work. Going down through (1024, 1024) -> (512, 512) -> (256, 256) -> (128, 128) -> (64, 64), only (64, 64) runs.

But (64, 64) is too small for an image. The error is in the visualization code; what shape does max_probs have in your setting? It should be [h / self.strides[0], w / self.strides[0]] (where self.strides[0] is 4 in this setting).

> Isn't training the CPM the first step? That is the step where this error occurs.

It is normal that AP=0 during the CPM training phase, because the network does not yet know object boundaries. Only when the pseudo labels are used to train a detector does the AP become non-zero.

@douling843 (Author)

> But (64, 64) is too small for an image. The error is in the visualization code; what shape does max_probs have in your setting? It should be [h / self.strides[0], w / self.strides[0]] (where self.strides[0] is 4 in this setting).

Thanks for your reply, but where should I look for the visualization code? max_probs appears in cpm_head.py, and I cannot find [h / self.strides[0], w / self.strides[0]] anywhere.

@taugeren (Collaborator)

Lines 100-146 in "mmrotate/models/dense_heads/cpm_head.py".

@douling843 (Author)

> Lines 100-146 in "mmrotate/models/dense_heads/cpm_head.py".

max_probs in cpm_head.py is just a variable (a matrix); we never set its dimensions explicitly, and I don't understand what you mean by [h / self.strides[0], w / self.strides[0]].

@taugeren (Collaborator)

taugeren commented Oct 13, 2024

File "/workspace/PointOBB-v2/mmrotate/models/dense_heads/cpm_head.py", line 116, in get_mask_image
if max_probs[i, j] > thr:
IndexError: index 120 is out of bounds for dimension 0 with size 120

The error report position, the max_probs, should be [h/self.stride[0], w/self.stride[0]], and it is [256, 256] in the DOTA setting.
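In other words, cls_scores[0][0] comes from the stride-4 FPN level, so on a 1024 x 1024 DOTA input max_probs is 256 x 256, while on SSDD (where RResize keeps the aspect ratio) the score map is plausibly not square, yet the modified get_mask_image above walks a square num_width x num_width grid. A shape-safe variant that derives the grid from max_probs itself, as a minimal illustrative sketch (names follow the earlier snippet; this is not the repo's actual code):

    import numpy as np

    def get_mask_image(self, max_probs, max_indices, thr):
        # Sketch: size the mask from the score map itself instead of
        # assuming a square num_width x num_width grid.
        PALETTE = [(0, 255, 0)]  # one color per class (single-class: ship)
        h, w = max_probs.shape   # [H / strides[0], W / strides[0]]
        mask_image = np.ones((h, w, 3), dtype=np.uint8) * 255
        for i in range(h):
            for j in range(w):
                if max_probs[i, j] > thr:
                    mask_image[i, j] = PALETTE[max_indices[i, j]]
        return mask_image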
