RetinaNet model.fit fails if it encounters a training input with zero boxes #2488

Open
thomas-coldwell opened this issue Sep 6, 2024 · 2 comments
@thomas-coldwell
Contributor

Current Behavior:

We are using the pretrained RetinaNet model and applying transfer learning on a custom dataset. For this dataset we have an adapted jittered resize layer that, after performing the crop, checks whether at least X% of each original bounding box's area is still inside the image; if not, the bounding box is excluded (we are using #2484 to achieve this). This works well in isolation when we test it with the jittered resize demo in KerasCV. During training, however, specifically at the end of the first epoch when validation runs, it fails with an out-of-bounds exception, e.g. indices[1,53275] = 0 is not in [0, 0) [[{{node retina_net_label_encoder_1/GatherV2_1}}]] (I've attached the full stack trace below).
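For context, the pruning rule described above can be sketched roughly as follows. This is not the code from #2484: the function name `prune_cropped_boxes`, the absolute `xyxy` box format, and the `min_area_ratio` parameter are assumptions made purely for illustration.

```python
import tensorflow as tf


def prune_cropped_boxes(boxes, image_height, image_width, min_area_ratio):
    """Clip boxes to the image and drop those that lost too much area.

    boxes: float tensor of shape [N, 4] in absolute xyxy format (assumed).
    Returns a tensor of shape [M, 4] with M <= N; M can be 0.
    """
    h = tf.cast(image_height, boxes.dtype)
    w = tf.cast(image_width, boxes.dtype)
    x1, y1, x2, y2 = tf.unstack(boxes, axis=-1)
    original_area = (x2 - x1) * (y2 - y1)

    # Clip each coordinate to the image bounds.
    cx1 = tf.clip_by_value(x1, 0.0, w)
    cy1 = tf.clip_by_value(y1, 0.0, h)
    cx2 = tf.clip_by_value(x2, 0.0, w)
    cy2 = tf.clip_by_value(y2, 0.0, h)
    clipped_area = (cx2 - cx1) * (cy2 - cy1)

    # Fraction of the original box that is still visible after the crop.
    ratio = tf.math.divide_no_nan(clipped_area, original_area)
    keep = ratio >= min_area_ratio
    clipped = tf.stack([cx1, cy1, cx2, cy2], axis=-1)
    return tf.boolean_mask(clipped, keep)


boxes = tf.constant(
    [
        [10.0, 10.0, 50.0, 50.0],      # fully inside a 64x64 image
        [-30.0, 0.0, 10.0, 40.0],      # only a quarter of its area remains
        [-40.0, -40.0, -10.0, -10.0],  # entirely outside
    ]
)
kept = prune_cropped_boxes(boxes, 64, 64, min_area_ratio=0.5)
# If every box is mostly outside the crop, nothing survives.
none_left = prune_cropped_boxes(boxes[1:], 64, 64, min_area_ratio=0.5)
```

With any non-zero threshold, an image can come out of such a layer with a box tensor of shape `(0, 4)`, which is the zero-box input this issue is about.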

stacktrace.txt

From what I can understand of the stack trace it specifically fails here:

matched_gt_cls_ids = target_gather._target_gather(

So I think this gather function might not work if the input has zero elements, as would be the case here. If I set minimum_box_area_ratio to 0% (so nothing is excluded), training runs normally as before. Setting it to any non-zero value prunes some boxes, and it seems that as soon as any training example ends up with zero boxes, this exception is raised.
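The suspected failure mode can be reproduced in isolation with a plain `tf.gather`. The snippet below is a minimal sketch, not the KerasCV `_target_gather` implementation; the helper `gather_classes`, the tensor shapes, and the `-1` padding workaround are assumptions for illustration. The op is pinned to the CPU because CPU gather kernels bounds-check their indices.

```python
import tensorflow as tf


def gather_classes(gt_classes, matched_idx):
    """Look up one class id per anchor (hypothetical stand-in helper)."""
    with tf.device("/CPU:0"):
        return tf.gather(gt_classes, matched_idx)


# An image whose boxes were all pruned: zero ground-truth classes.
empty_classes = tf.zeros([0], dtype=tf.float32)
# Each of 4 anchors points at ground-truth index 0, which does not exist.
matched_idx = tf.zeros([4], dtype=tf.int32)

try:
    gather_classes(empty_classes, matched_idx)
    error_message = None
except tf.errors.InvalidArgumentError as e:
    # Index 0 is outside the valid range [0, 0) of an empty tensor.
    error_message = str(e)

# Hypothetical workaround: pad to at least one slot with a -1 sentinel so
# that index 0 is valid. Whether the label encoder then ignores that slot
# is an assumption that would need to be verified.
padded_classes = tf.pad(empty_classes, [[0, 1]], constant_values=-1.0)
safe_result = gather_classes(padded_classes, matched_idx)
```

This suggests that keeping at least one padded entry per image may avoid the exception, although it does not confirm how the RetinaNet label encoder treats such padding.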

Expected Behavior:

The RetinaNet model should accept training examples with zero boxes and continue training regardless. Alternatively, there could be a mechanism to skip training examples that have no labelled boxes.

Steps To Reproduce:

  1. Apply the changes linked for the adapted jittered resize (it's a very minor change that adds to the bounding_box.clip_to_image function)
  2. Create a jittered resize layer and set the minimum_box_area_ratio to, say, 0.5
  3. Then attempt to train the RetinaNet model with this JitteredResize acting on the training dataset
  4. Observe the same exception

Version:

Latest off of master

Anything else:

@thomas-coldwell
Contributor Author

Just to add to this, I've tried the following alternative approach too:

  • We could drop / filter any inputs that have no bounding boxes left after the augmentations. However, you could then end up with a completely empty batch or an odd-sized batch for a training step
  • Additionally, there doesn't seem to be a nice way in TensorFlow to filter these out. Maybe some combination of tf.reduce_any and tf.boolean_mask could be used to remove the images and bounding boxes (boxes + labels) from the input dict
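The filtering idea from the bullets above could look roughly like the sketch below. It assumes a dense input dict of the form `{"images", "bounding_boxes": {"boxes", "classes"}}` in which missing boxes are padded with class `-1`; the helper names `has_boxes` and `drop_empty_from_batch` are hypothetical. Filtering individual samples before batching keeps every batch at the requested size except possibly the last one.

```python
import tensorflow as tf


def has_boxes(sample):
    """True if at least one class id is not the -1 padding sentinel."""
    return tf.reduce_any(sample["bounding_boxes"]["classes"] >= 0)


def drop_empty_from_batch(batch):
    """Remove samples without boxes from an already batched input dict."""
    keep = tf.reduce_any(batch["bounding_boxes"]["classes"] >= 0, axis=-1)
    return tf.nest.map_structure(lambda t: tf.boolean_mask(t, keep), batch)


# Toy data: 5 images with up to 2 boxes each; samples 1 and 3 have no boxes.
data = {
    "images": tf.zeros([5, 8, 8, 3]),
    "bounding_boxes": {
        "boxes": tf.zeros([5, 2, 4]),
        "classes": tf.constant(
            [[0.0, 1.0], [-1.0, -1.0], [2.0, -1.0], [-1.0, -1.0], [1.0, 1.0]]
        ),
    },
}

# Option 1: filter per sample, then batch.
dataset = tf.data.Dataset.from_tensor_slices(data)
filtered = dataset.filter(has_boxes).batch(2)
batch_sizes = [int(b["images"].shape[0]) for b in filtered]

# Option 2: mask an existing batch (the batch size then varies per step).
masked = drop_empty_from_batch(data)
```

Passing `drop_remainder=True` to `batch` would also remove the smaller final batch, at the cost of discarding a few samples per epoch.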

@sachinprasadhs
Collaborator

Thanks for reporting the issue.

RetinaNet, with a new implementation, is now part of KerasHub, a consolidated KerasNLP and KerasCV package.
The model weights are available on Kaggle; for details and usage, refer to https://www.kaggle.com/models/keras/retinanet

We will not be making any changes to models in KerasCV that are now available in KerasHub, or to the APIs/utils made available in Keras.
If you still face any issues, please file a new issue in keras-hub.
