Scaling the ground truth flow? #34
Comments
I have the exact same question.
But I don't understand why, when computing the ground truth flow in the loss function, it is necessary to divide the ground truth flow by the
Any help is welcome! Thank you!
Yep, I have the same question. So I thought it was clearly wrong in this repo at this point.
Sorry, my bad. I think it is OK, NOT A WRONG IMPLEMENTATION! Apologies for the wrong comment above.
Hello Mr Ferriere,
Thank you a lot for sharing your TensorFlow implementation of PWC-Net. I am currently using it as a starting point for my thesis. However, I'm wondering about the scaling factors you used for the ground truth and predicted flow, and I think there might be a mistake in your implementation.
In the paper it reads:
For me this means the following two things:
First, if you divide the ground truth flow by 20, then the predicted flow (at each level) will be around 20 times too small. Therefore, to get the real flow values, you have to multiply the predicted flow by 20. In particular, if you perform any kind of warping operation, the predicted flow has to be rescaled beforehand.
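The first point can be sketched in a few lines of NumPy. This is only an illustration of the argument, not code from the repository: the constant `FLOW_SCALE` and both helper functions are hypothetical names, and the "perfect network" is simulated by copying the target.

```python
import numpy as np

FLOW_SCALE = 20.0  # assumed value, taken from the "divide by 20" discussed above


def scale_ground_truth(flow_gt, scale=FLOW_SCALE):
    """Shrink the ground truth flow before it is used as a training target."""
    return flow_gt / scale


def restore_prediction(flow_pred, scale=FLOW_SCALE):
    """Undo the target scaling so the prediction is in pixel units again."""
    return flow_pred * scale


flow_gt = np.array([[[10.0, -4.0]]])   # a single pixel with (u, v) in pixels
target = scale_ground_truth(flow_gt)   # what the network is trained to output
ideal_pred = target.copy()             # stand-in for a perfectly trained network
flow_px = restore_prediction(ideal_pred)
```

Under these assumptions `target` holds values 20 times smaller than `flow_gt`, and only `flow_px` is suitable for anything that expects displacements in pixels, such as warping.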
Second, in order to get the supervision signal for each level, you have to downsample the ground truth flow to the same height and width as the predicted flow. If you don't further scale the ground truth flow after downsampling (which is what the paper proposes), its magnitude will be too large, and so will the predicted flow at that level. That's why, before warping the feature maps, you have to divide the predicted flow by a factor of 2^lvl.
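To make the second point concrete, here is a minimal sketch of why a downsampled flow field is too large for its own grid unless its vectors are also divided by 2^lvl. The function `downsample_flow` is a hypothetical helper, and plain strided sampling is used as a stand-in for whatever resizing the real code performs.

```python
import numpy as np


def downsample_flow(flow, lvl, rescale_magnitude):
    """Naive strided downsampling of an (H, W, 2) flow field by a factor of 2**lvl.

    If rescale_magnitude is True, the vectors are also divided by 2**lvl so that
    they are expressed in pixels of the smaller grid.
    """
    factor = 2 ** lvl
    small = flow[::factor, ::factor, :]
    return small / factor if rescale_magnitude else small


flow_full = np.zeros((16, 16, 2))
flow_full[..., 0] = 8.0  # every pixel moves 8 px to the right at full resolution

lvl = 2
unscaled = downsample_flow(flow_full, lvl, rescale_magnitude=False)
scaled = downsample_flow(flow_full, lvl, rescale_magnitude=True)
```

Both results live on a 4x4 grid. In `unscaled` the horizontal displacement is still 8, which is wider than the grid itself, while in `scaled` it is 2, which matches the same motion measured in level-2 pixels.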
In your implementation you (correctly) account for that with the following lines:
But what about the supervision signal?
If I'm correct, you would have to divide the ground truth flow by a factor of 20. Otherwise the magnitude of the predicted (learned) flow will be around 20 times too large after multiplying it by the "scaler". In that case the warping won't do what it should. So I'm wondering: where do you downscale the ground truth flow by 20?
Additionally, in your pwcnet_loss function you downsample and downscale the supervision signal. So, in the second line you divide the magnitude of the ground truth flow by 2^lvl. As far as I can see, this is not correct if you also rescale the predicted flow by multiplying it by the "scaler" before the warping operation. To be more precise, because of your loss function, the network learns to predict a flow that at each level is 2^lvl times smaller than the original flow. It therefore already has the correct magnitude for the height and width of that level. When you multiply it by the scaler, you divide it by 2^lvl again. So the magnitude of the flow is too small and the warping will be wrong again.
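The double-scaling concern can be written out as simple arithmetic. The sketch below only models the argument as it is described here and leaves the factor of 20 aside; `warp_displacement` and its flags are hypothetical and do not correspond to functions in the repository, so it says nothing about how the actual code behaves.

```python
def warp_displacement(full_res_flow, lvl, target_divided_by_level, scaler_divides_by_level):
    """Displacement (in level-lvl pixels) that would reach the warping step.

    Hypothetical model of the argument above, not code from the repository.
    """
    factor = 2.0 ** lvl
    # What the network learns to output, depending on how the target was built.
    learned = full_res_flow / factor if target_divided_by_level else full_res_flow
    # What is left after the scaler is applied before warping.
    return learned / factor if scaler_divides_by_level else learned


full_res_flow = 16.0  # displacement in full-resolution pixels
lvl = 2
correct = full_res_flow / 2 ** lvl  # the same motion in level-2 pixels

only_scaler = warp_displacement(full_res_flow, lvl, False, True)
both = warp_displacement(full_res_flow, lvl, True, True)
```

With these numbers `correct` and `only_scaler` are both 4.0, while `both` is 1.0, that is, too small by a further factor of 2^lvl, which is the situation the paragraph above describes.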
I hope that my explanation is somewhat understandable. Thanks a lot for taking the time to think about it and perhaps share your thoughts on my points.
Best, Joshua