Make loading features from storage robust to order. #9
base: main
The current `load_features` implementation relies on features from each video (same video_id) being in a contiguous block. This matches how `store_features` organizes feature files. Update `load_features` to accept descriptors in any order by sorting by video_id (then by start timestamp) before constructing `VideoFeature` structures. Also change `store_features` to sort by video_id before storing features.
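The sort-then-group step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's implementation: `Descriptor`, `GroupedFeature`, and `group_descriptors` are hypothetical names, and `GroupedFeature` only stands in for the real `VideoFeature` structure, whose fields are not shown here.

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List


@dataclass
class Descriptor:
    # Hypothetical flat record as it might be read from storage.
    video_id: int
    start: float
    values: List[float]


@dataclass
class GroupedFeature:
    # Hypothetical stand-in for the repository's VideoFeature.
    video_id: int
    starts: List[float]
    values: List[List[float]]


def group_descriptors(descriptors: List[Descriptor]) -> List[GroupedFeature]:
    # Sort by video_id, then by start timestamp, so that each video's
    # descriptors form one contiguous, time-ordered block.
    ordered = sorted(descriptors, key=lambda d: (d.video_id, d.start))
    grouped = []
    # groupby only merges adjacent items, which is why the sort comes first.
    for video_id, rows in groupby(ordered, key=lambda d: d.video_id):
        block = list(rows)
        grouped.append(
            GroupedFeature(
                video_id=video_id,
                starts=[r.start for r in block],
                values=[r.values for r in block],
            )
        )
    return grouped
```

Because the input is sorted before grouping, interleaved descriptors from different videos still yield exactly one structure per video.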
restored = load_features(f.name)

features.sort(key=lambda x: x.video_id)
restored.sort(key=lambda x: x.video_id)
Should we be testing that `restored` is already properly sorted when loading with `load_features`? I'm not sure we should sort it here.
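One way a test could check the ordering directly instead of sorting the loaded result is sketched below. The helper name `is_sorted_by_video_id` is hypothetical, and the sketch assumes only that each feature object exposes a `video_id` attribute.

```python
from types import SimpleNamespace


def is_sorted_by_video_id(features):
    # True when video_id never decreases from one element to the next.
    ids = [f.video_id for f in features]
    return all(a <= b for a, b in zip(ids, ids[1:]))


# Stand-in objects; a real test would pass the list returned by load_features.
example = [SimpleNamespace(video_id=v) for v in (1, 1, 3)]
assert is_sorted_by_video_id(example)
```

With a check like this, the test would fail if loading ever returned features out of order, which re-sorting `restored` in the test would hide.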
For the sake of completeness, we also tracked down what we believe caused the memory error. Line 60 in 5d8af86
In vsc2022/vsc/descriptor_eval_lib.py Line 39 in 3afe07a
The resulting number of query candidates to generate for a given input query descriptor is then more than an order of magnitude larger than we intend. When we exhaustively search for and return this many candidates in our exponential iterator, we return increasingly large copies of matrices until we run out of memory.
Hi @edpizzi!
Thank you for your pull request. We require contributors to sign our Contributor License Agreement, and yours needs attention. You currently have a record in our system, but the CLA is no longer valid and will need to be resubmitted.
Process
In order for us to review and merge your suggested changes, please sign at https://code.facebook.com/cla. If you are contributing on behalf of someone else (e.g. your employer), the individual CLA may not be sufficient and your employer may need to sign the corporate CLA.
Once the CLA is signed, our tooling will perform checks and validations. Afterwards, the pull request will be tagged accordingly.
If you have received this in error or have any questions, please contact us at [email protected]. Thanks!