sparse matrices for BIG distance matrices #16
Comments
Hi Leo, I'd love to hear more about your thinking here since this is the biggest computational bottleneck in our current use of MBL. I wouldn't consider spectral data sparse in the conventional sense of a sparse matrix but admittedly my understanding of sparse matrices is very limited. Cheers, Jon |
Hi Leo, |
Hi Jon, Hi Philipp, |
Hi Leo, Hi Jon, |
Maybe another option would be to use {disk.frame} when random-access memory is scarce. The distance data would then be written to disk in chunks instead of being held in memory all at once. |
Leonardo is right that we can use sparse matrices for this. For example, if you only want 200 neighbours for each observation in a dataset of, say, 30,000 points, why store the distances between all 30,000 points? I guess keeping only the neighbour distances would dramatically reduce the size of the distance matrix. Alexandre |
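A rough back-of-the-envelope check of that guess, using the 30,000 points and 200 neighbours from the example. The 8-byte distances and 4-byte neighbour indices are assumptions, since the thread does not specify a storage format, and real sparse formats add some overhead on top:

```python
# Rough memory estimate: dense distance matrix vs. k-nearest-neighbour storage.
# Assumptions (not from the thread): 8-byte floats, 4-byte integer indices.
n_points = 30_000      # observations, as in the example above
k_neighbours = 200     # neighbours kept per observation

BYTES_PER_DISTANCE = 8
BYTES_PER_INDEX = 4

# Dense: one distance for every pair of observations.
dense_bytes = n_points * n_points * BYTES_PER_DISTANCE
# Sparse: per observation, k distances plus the index of each neighbour.
sparse_bytes = n_points * k_neighbours * (BYTES_PER_DISTANCE + BYTES_PER_INDEX)

print(f"dense:  {dense_bytes / 1e9:.2f} GB")              # dense:  7.20 GB
print(f"sparse: {sparse_bytes / 1e9:.3f} GB")             # sparse: 0.072 GB
print(f"ratio:  {dense_bytes / sparse_bytes:.0f}x smaller")  # ratio:  100x smaller
```

Under these assumptions the neighbour-only storage is about two orders of magnitude smaller than the full matrix.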
When working with a huge number of observations, computing a conventional (dense) distance matrix can become problematic in terms of memory, because its size grows with the square of the number of observations. Sparse matrices seem a simple and efficient way to approach this problem.
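A minimal sketch of the idea, not the package's implementation: compute distances one row at a time and keep only the k smallest per observation, so the full matrix never exists in memory. The function name `knn_sparse_distances`, the dictionary layout, and the Euclidean metric are all assumptions made for illustration.

```python
import heapq
import math


def knn_sparse_distances(points, k):
    """Return {i: [(distance, j), ...]} with only the k nearest neighbours of each point.

    Hypothetical helper: Euclidean distance is assumed, the thread names no metric.
    """
    sparse = {}
    for i, p in enumerate(points):
        # Lazily generate the distances from point i to every other point;
        # only this one row is alive at any time.
        row = ((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        # Keep the k smallest (distance, index) pairs, sorted by distance.
        sparse[i] = heapq.nsmallest(k, row)
    return sparse


# Tiny made-up example: 4 points, 2 neighbours each.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (5.0, 5.0)]
neighbours = knn_sparse_distances(points, k=2)
print(neighbours[0])  # [(1.0, 1), (2.0, 2)]
```

This only reduces memory: every pairwise distance is still computed once per row, so the running time remains quadratic in the number of observations.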