Readability refactor + normalize classification loss #252
base: master
Conversation
Good job on the refactor. I don't think the normalization is necessary, though. The loss from all anchors is divided by the number of positive anchors, so it is the cls_loss per positive anchor, i.e. already a mean loss, right?
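(A minimal sketch of the normalization being discussed, for readers following along; the tensor names and sizes are illustrative assumptions, not the repo's actual code.)

```python
import torch

# Illustrative shapes only; these are not the repo's actual variable names.
# focal_loss_per_anchor: per-anchor focal loss (already summed over classes).
num_anchors = 49104                                  # e.g. D0 at 512x512 input
focal_loss_per_anchor = torch.rand(num_anchors)
positive_indices = torch.rand(num_anchors) > 0.999   # a handful of positives

num_positive_anchors = positive_indices.sum().float()

# The normalization in question: sum over ALL anchors, divide by the number
# of POSITIVE anchors, i.e. classification loss per positive anchor.
cls_loss = focal_loss_per_anchor.sum() / num_positive_anchors
```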
Thanks.
Note that in the custom dataset I'm working on, the images have many annotations each, which is probably what exacerbates the issue.
No, it's just a dataset issue or improper parameters. My cls loss is only 0.25.
Do you agree that with proper normalization the loss should be bounded regardless of the dataset?
I agree, but it's already done. Here.
@zylo117 see new commit. The classification loss was exploding due to a division by zero there (I saw others commenting on that as well). Also note that in the original implementation the targets below the unmatched threshold (0.5 there, unlike 0.4 here) are considered negative, while here they are ignored.
@ggaziv, are you sure your code is correct? In @zylo117's implementation the target can have 3 different values:

So with

you would ignore anchors with IoU < 0.4, which is an undesired behavior IMO. With

we get what we need, and the comment here (# Ignore targets with overlap lower than unmatched_threshold) is wrong. BTW, where did you find in the original implementation that the targets below the 0.5 threshold are considered as negatives?
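(For context, a rough sketch of the RetinaNet-style target legend being referred to; the threshold values and variable names are assumptions for illustration, not the repo's exact code.)

```python
import torch

# Rough sketch of the target legend (values are assumptions for illustration):
#   -1 -> ignored by the loss, 0 -> negative (background), one-hot -> positive.
num_anchors, num_classes = 8, 3
iou_max = torch.tensor([0.10, 0.30, 0.42, 0.45, 0.55, 0.70, 0.20, 0.90])
assigned_labels = torch.tensor([0, 0, 0, 0, 2, 1, 0, 0])  # class of best-matching GT

targets = torch.ones(num_anchors, num_classes) * -1       # start as "ignore"
targets[iou_max < 0.4, :] = 0                             # negatives
positive_indices = iou_max >= 0.5
targets[positive_indices, :] = 0
targets[positive_indices, assigned_labels[positive_indices]] = 1  # one-hot positives

# Anchors with 0.4 <= IoU < 0.5 keep -1 and should be masked out of the loss:
valid = targets != -1
```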
@dkoguciuk you are right re the implementation inversion - your legend of the target values helped correct this (see updated fix).

> BTW. Where did you find in the original implementation that the targets below the 0.5 threshold are considered as negatives?
@ggaziv, thanks for the link! This is the docstring from
And it's invocation:
So, as far as I understand, this is exactly @zylo117's implementation, or did I miss something? Nevertheless, the PR is still useful, and yes, the code LGTM 👍
I've just tested this branch on the shape dataset. Low loss, but low mAP too. The refactor part seems OK, though.
@zylo117 I suggest you try again with the latest version, which I just force-pushed (after @dkoguciuk's feedback). It should currently not alter the behavior at all other than avoiding the zero division (which caused the classification loss to explode). Net-net, this PR extends code utility and adds a minor fix w.r.t. the official code.
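(A small illustration of the zero-division guard under discussion; the numbers and names are made up for the example.)

```python
import torch

# Made-up numbers, just to illustrate the failure mode and two possible guards.
focal_loss_sum = torch.tensor(37.5)
num_positive_anchors = torch.tensor(0.0)   # image with no positive anchors

unguarded = focal_loss_sum / num_positive_anchors                        # -> inf
plus_one  = focal_loss_sum / (num_positive_anchors + 1.0)                # the "+1" guard
clamped   = focal_loss_sum / torch.clamp(num_positive_anchors, min=1.0)  # clamp guard

# Note: the +1 also slightly shrinks the loss for images that DO have positives,
# while clamp(min=1) only kicks in when there are none.
```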
I still can't get a normal result.
And the cls loss is at least 3 times higher than the master branch's.
I didn't try it, but from looking at the code it should be exactly the same as master other than the +1 to num_positive_anchors. I suggest just removing the +1 to verify that it is indeed functionally identical. Just note that if you are running directly on my branch, it lags behind the current master, so only consider the PR changes.
Hello, thanks for all the great work you've done @zylo117! I am having a similar problem. Has this been resolved in the latest master branch?

I am training on my custom dataset with three classes, which I have already trained with the signatrix implementation; that training was very stable, with the loss decreasing consistently from the beginning. However, it doesn't support using different backbone versions, so I switched to this repo. I am getting a huge classification loss oscillating between 3000-40000 that doesn't seem to stabilise or decrease.

Here is the command I am running:

I have tried with both d0 and d4 pretrained weights. Any help would be much appreciated!
Loss won't decrease with such a low lr; try 1e-4 for the first few epochs.
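(A minimal sketch of what that suggestion could look like in a generic PyTorch training loop; the model and optimizer here are placeholders, not the repo's train.py.)

```python
import torch

# Placeholder model/optimizer, just showing the idea of using a higher lr
# for the first few epochs and then lowering it.
model = torch.nn.Linear(10, 4)              # stand-in for the detector
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

for epoch in range(50):
    if epoch == 5:                           # after the first few epochs
        for group in optimizer.param_groups:
            group["lr"] = 1e-5               # drop to a smaller lr afterwards
    # ... run one training epoch here ...
```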
The current classification loss is not normalized, which allows it to dominate the total loss.