Error in the documentation of the image embedding sizes? #60
Comments
While enumerating the configurations and checking that every forward pass works, I am also seeing an issue I had not hit before. If I keep the same (model_image, model_audio) and iteratively forward 1000+ (image, audio) pairs, the compute time does not increase. But if I switch model configurations, it grows quickly, in as few as 12 steps: by the end it takes several seconds to compute the embeddings, versus ~100 ms at the start.
Good catch! We'll fix the documentation in the next release. Regarding the slowdown, my guess is that it is a memory issue: TensorFlow may not be garbage collecting the old models and their computational graphs. Perhaps you could try running tf.keras.backend.clear_session() at the end of each loop iteration. Let us know if that helps!
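For reference, here is a minimal sketch of that suggestion. The helper `sweep_configurations` and its `load_models` and `embed` callables are hypothetical names introduced for illustration, not part of openl3; the only real API call is `tf.keras.backend.clear_session()`. Whether this actually removes the slowdown is untested here.

```python
def sweep_configurations(configs, load_models, embed, clear_session=None):
    """Run `embed` once per configuration and reset Keras state in between.

    configs       -- iterable of hashable tuples, one per model configuration
    load_models   -- hypothetical callable: load_models(*config) -> models
    embed         -- hypothetical callable: embed(models) -> result
    clear_session -- callable that resets backend state; defaults to
                     tf.keras.backend.clear_session
    """
    if clear_session is None:
        # Imported lazily so the helper can be exercised without TensorFlow.
        import tensorflow as tf
        clear_session = tf.keras.backend.clear_session

    results = {}
    for config in configs:
        models = load_models(*config)
        try:
            results[config] = embed(models)
        finally:
            # Drop our reference to the old models, then ask Keras to
            # discard the global state it has built up for them.
            del models
            clear_session()
    return results
```

In practice `load_models` would presumably wrap `openl3.models.load_image_embedding_model` and `openl3.models.load_audio_embedding_model`, and `embed` would run the forward passes over the (image, audio) pairs for that configuration.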
The documentation should be correct now; this was addressed in #72. Feel free to let us know whether the slowdown was resolved!
Hi,
In the API documentation it is written that openl3.models.load_image_embedding_model accepts 6144 and 512 as embedding_size. It seems that, unlike openl3.models.load_audio_embedding_model, it in fact accepts 8192 and 512.
This does not seem to be specified in your paper. Shall I assume that the (image, audio) models were trained as pairs with embedding sizes (8192, 6144) and (512, 512) for the different configurations of input_repr and content_type?
Thanks!
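To make the mismatch concrete, here is a small sketch that records the sizes discussed in this issue and checks a requested size against them. The table `EMBEDDING_SIZES` and the helper `check_embedding_size` are hypothetical and not part of openl3; the values are the ones reported in this thread and should be verified against the installed version.

```python
# Sizes as reported in this issue; an assumption, not read from openl3 itself.
EMBEDDING_SIZES = {
    "audio": (6144, 512),
    "image": (8192, 512),
}


def check_embedding_size(modality, embedding_size):
    """Return `embedding_size` if it is listed for `modality`, else raise."""
    allowed = EMBEDDING_SIZES[modality]
    if embedding_size not in allowed:
        raise ValueError(
            f"{modality} embedding_size must be one of {allowed}, "
            f"got {embedding_size}"
        )
    return embedding_size
```

With this table, `check_embedding_size("image", 6144)` raises a `ValueError`, which is the case the documentation originally described as valid.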