About ablation study on memory mechanism #52
We just select several frames and feed them into the LLM decoder without the merge algorithm.
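A minimal sketch of what that ablation might look like: uniformly pick a handful of frame indices and pass the corresponding frames straight to the decoder, with no merging. The function name and parameters below are hypothetical, not from the repo.

```python
import numpy as np

def sample_frame_indices(total_frames: int, num_frames: int = 8) -> list[int]:
    """Hypothetical ablation helper: uniformly spaced frame indices,
    fed directly to the LLM decoder with no merge step."""
    return np.linspace(0, total_frames - 1, num_frames).astype(int).tolist()
```

For a 100-frame clip and `num_frames=4`, this selects frames 0, 33, 66, and 99.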
Got it. What's the difference between video_path and fragment_video_path? In my understanding, video_path is the path to the video to be processed. But in the upload_video_without_audio function in chat_model.py, fragment_video_path is used as a parameter of the load_video function.
fragment_video_path stores the video clips read by the sliding window.
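To make the sliding-window idea concrete, here is a small illustrative helper that computes the frame ranges each window would cover; each range corresponds to one clip written to fragment_video_path. The function and its parameters are assumptions for illustration, not code from the repo.

```python
def sliding_windows(num_frames: int, window: int, stride: int) -> list[tuple[int, int]]:
    """Hypothetical sketch: (start, end) frame ranges for each clip
    the sliding window would read and write to fragment_video_path."""
    return [
        (start, min(start + window, num_frames))
        for start in range(0, num_frames, stride)
    ]
```

With `num_frames=10`, `window=4`, `stride=4`, this yields `[(0, 4), (4, 8), (8, 10)]`: three clips, the last one shorter because it hits the end of the video.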
So do I need to prepare the video clips in advance, or will they be generated automatically?
No need, they will be generated automatically.
Where is it generated? I can't find it. The first place fragment_video_path is used seems to be as a parameter of load_video in the upload_video_without_audio function.
you can run it and print the path to see:) |
because fragment_video_path needs to be an mp4 file, not a dictionary :)
So fragment_video_path and video_path are the same video? |
No, fragment_video_path is a temporary mp4 file.
But I only have one video to be processed, and you said that fragment_video_path will be generated, so I am confused... Could you give me an example?
It seems to be a bug in the PyPI code. In the GitHub code, the capture_video function writes the temporary video file and returns its path. But in the PyPI code, capture_video does not write the temporary video file yet still returns the path, so the error above occurred.
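The fix described above amounts to making sure the temporary file actually exists before its path is returned. A minimal sketch of that pattern, with the encoding details replaced by a plain byte write for brevity (the real capture_video would encode frames to mp4):

```python
import os
import tempfile

def capture_video_fixed(clip_bytes: bytes) -> str:
    """Hypothetical sketch of the corrected behavior: write the temporary
    mp4 before returning its path. The buggy PyPI version returned a path
    without ever writing the file, so downstream load_video failed."""
    fd, fragment_video_path = tempfile.mkstemp(suffix=".mp4")
    with os.fdopen(fd, "wb") as f:
        f.write(clip_bytes)  # this write is what the PyPI release skipped
    return fragment_video_path
```

The caller can then pass the returned path to load_video and the file is guaranteed to be present on disk.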
How is the model without the MM module implemented in the ablation experiment? Is it directly applying the merge algorithm to the entire video?