16-bit segmentation mask #8
Comments
Same issue here
Thank you for your nice comment and interest in our work! These 16-bit images represent the "semantic foreground" output probability per pixel, stored with 16-bit precision. To be more precise, the semantic segmentation network PSPNet, trained on the ADE20K dataset, outputs a vector of 150 real numbers for each pixel, where each number is associated with one object class from a set of 150 mutually exclusive classes. The semantic probability estimate is computed by applying a softmax function to this vector and summing the resulting values over the subset of classes that are relevant for motion detection. We use the subset: person, car, cushion, box, boot, boat, bus, truck, bottle, van, bag and bicycle, whose elements correspond to moving objects of the CDNet 2014 dataset. That said, it is not mandatory to work at this exact precision: 64-, 32-, or even 8-bit precision would work similarly, and you could choose a different set of classes of interest. Let me know if this is clearer now, and don't hesitate to ask if you have any further questions.
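For readers who want to reproduce this on their own videos, here is a minimal sketch of the computation described above (softmax over the 150 class scores, sum over the classes of interest, quantize to 16 bits). It is not the authors' code. It assumes you already have the per-pixel PSPNet scores as a NumPy array of shape (H, W, 150); the function names and the class indices in `RELEVANT_CLASS_IDS` are hypothetical placeholders, since the real indices depend on the ADE20K label ordering of the PSPNet implementation you use.

```python
import numpy as np

# Hypothetical placeholder indices for the classes of interest.
# Look up the real indices (person, car, ...) in the ADE20K label list
# that ships with your PSPNet implementation.
RELEVANT_CLASS_IDS = [12, 20, 80]


def semantic_foreground_probability(logits, class_ids):
    """Softmax over the class axis, then sum over the selected classes.

    logits: float array of shape (H, W, C) with raw per-class scores.
    Returns a float array of shape (H, W) with values in [0, 1].
    """
    # Subtract the per-pixel maximum for a numerically stable softmax.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    probs = exp / exp.sum(axis=-1, keepdims=True)
    return probs[..., class_ids].sum(axis=-1)


def to_uint16(prob):
    """Map probabilities in [0, 1] to the 16-bit integer range [0, 65535]."""
    return np.round(np.clip(prob, 0.0, 1.0) * 65535).astype(np.uint16)


if __name__ == "__main__":
    # Random scores stand in for a real PSPNet output.
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(4, 6, 150))
    prob = semantic_foreground_probability(logits, RELEVANT_CLASS_IDS)
    mask = to_uint16(prob)
    print(mask.shape, mask.dtype)
```

If only the person class matters, pass a single-element list of its index as `class_ids`. The resulting `uint16` array can then be written as a 16-bit PNG with an image library that supports that depth, for example OpenCV's `cv2.imwrite`.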
Hi @cioppaanthony, Thank you so much for your reply! I do have some extra questions:
Hi @Swayzzu,
Thank you! That's very helpful!
Amazing work!
The results from the datasets that you shared are excellent. Thanks for sharing!
One question: I am trying to run RT-SBS on my collected videos containing a person. How can I generate the 16-bit PSPNet segmentation mask for these videos? I'm particularly interested in the `person` class.