Nested loops not scalable #1

Open
choltz95 opened this issue Nov 9, 2021 · 1 comment

choltz95 commented Nov 9, 2021

Thank you very much for releasing the code for your paper!

Not serious, but I ran into scalability issues in the pairwise distance computation for more than 10k points, caused by the nested loops in create_distance_matrix (graph_utils.py, L8).

Using scipy's pdist resolved the issue for me. NumPy vectorization (broadcasting or einsum) would also work and would avoid the scipy dependency.
Another "nice-to-have" would be preserving kernel sparsity (i.e. no dense n x n matrices in memory).
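
For reference, here is a minimal sketch of both suggestions in plain NumPy. It is not the repository's code: the function names `pairwise_sq_distances` and `knn_chunked`, their signatures, and the `chunk_size` default are hypothetical, and squared Euclidean distance is assumed as the metric. The first function replaces the nested loops with one einsum plus one matrix product; the second keeps only the k nearest neighbors per point and never materializes the full n x n matrix.

```python
import numpy as np


def pairwise_sq_distances(X, Y=None):
    """Squared Euclidean distances between rows of X and rows of Y (Y defaults to X)."""
    # Hypothetical helper for illustration, not the repo's create_distance_matrix.
    X = np.asarray(X, dtype=np.float64)
    Y = X if Y is None else np.asarray(Y, dtype=np.float64)
    x_sq = np.einsum("ij,ij->i", X, X)  # row-wise squared norms of X
    y_sq = np.einsum("ij,ij->i", Y, Y)  # row-wise squared norms of Y
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 x.y, evaluated for all pairs at once
    D = x_sq[:, None] + y_sq[None, :] - 2.0 * (X @ Y.T)
    np.maximum(D, 0.0, out=D)  # clip small negatives caused by round-off
    return D


def knn_chunked(X, k, chunk_size=1024):
    """k nearest neighbors per point (self excluded).

    Only a (chunk_size x n) block of distances is held in memory at a time.
    Requires k <= n - 1.
    """
    X = np.asarray(X, dtype=np.float64)
    n = X.shape[0]
    indices = np.empty((n, k), dtype=np.int64)
    sq_dists = np.empty((n, k), dtype=np.float64)
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        D = pairwise_sq_distances(X[start:stop], X)
        # Exclude each point from its own neighbor list.
        D[np.arange(stop - start), np.arange(start, stop)] = np.inf
        # Unordered k smallest per row, then sort only those k entries.
        part = np.argpartition(D, k - 1, axis=1)[:, :k]
        part_d = np.take_along_axis(D, part, axis=1)
        order = np.argsort(part_d, axis=1)
        indices[start:stop] = np.take_along_axis(part, order, axis=1)
        sq_dists[start:stop] = np.take_along_axis(part_d, order, axis=1)
    return indices, sq_dists
```

The dense version still needs O(n^2) memory, so for more than 10k points the chunked variant is probably the more useful of the two. Its `(indices, sq_dists)` output has shape n x k and could be packed into a sparse matrix if a graph representation is needed.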

shekkizh (Collaborator) commented Nov 9, 2021

Hi @choltz95,
Thanks for pointing out the scalability issue. The nnk_demo.py code was written as a proof of concept for visualizing the graphs obtained with NNK vs. KNN.

For large-scale experiments, I would suggest using the nnk function API in faiss_nnk_neighbors.py or, if optimality is not crucial, the neighborhood definition from the approximate_nnk folder.
Note, however, that these functions require installing the faiss package.
