Optimization for many calls #2
I think cythonizing Pythia might be optimizing at the wrong level. We use freud as the "engine" precisely to speed up the nearest-neighbor calculations and similar heavy lifting. Pythia -- as I interpret it -- just combines and reformulates all of these data as specific descriptors. I believe that cythonizing will make the package harder to contribute to and maintain, so we should be careful about where exactly we optimize. I'd rather adjust the freud API to make it easier to use freud efficiently, if needed.
@csadorf That's actually what I was suggesting: leave all the code optimization in Cython to freud, but use the calls more efficiently and reuse objects. I haven't looked at it enough, but a brief look showed a lot of object creation inside loops. I'm not sure exactly how inefficient this is, or whether Cython does any magical caching automatically that would make it a non-issue. I'll perhaps do some profiling on previous code I ran that I expected to run much faster and see how valid these suggestions are. But I think there should be a lot that's possible here.
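A minimal sketch of the pattern in question, hoisting the freud compute object out of the per-frame loop. This assumes the freud 2.x-style Steinhardt API; the choice of descriptor, the parameters, and the synthetic data are only illustrative:

```python
import numpy as np
import freud

# Synthetic trajectory purely for illustration: 10 frames of 500 random points
# in a cubic box of side 20.
rng = np.random.default_rng(0)
box = freud.box.Box.cube(20.0)
frames = [rng.uniform(-10.0, 10.0, size=(500, 3)) for _ in range(10)]

# Pattern a quick look at the code suggests: a new compute object per call.
per_frame_q6 = []
for points in frames:
    steinhardt = freud.order.Steinhardt(l=6)  # constructed inside the loop
    steinhardt.compute((box, points), neighbors={"num_neighbors": 12})
    per_frame_q6.append(steinhardt.particle_order.copy())

# Suggested pattern: construct once, reuse across frames; only compute() runs
# inside the loop.
steinhardt = freud.order.Steinhardt(l=6)
per_frame_q6 = []
for points in frames:
    steinhardt.compute((box, points), neighbors={"num_neighbors": 12})
    per_frame_q6.append(steinhardt.particle_order.copy())
```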
I agree that adding Cython would significantly increase the installation complexity and development overhead of the package. I would want to see compelling benchmarks for common use cases that currently run too slowly before being convinced that it's worth the trouble. In some cases (I recall that the Voronoi descriptors are particularly slow), perhaps it makes sense to move the core of some algorithms into freud proper?
@j-proc Is there a set of descriptors you are particularly interested in using? @klarh It doesn't surprise me that Voronoi-based descriptors are among the slowest - freud does not perform its own Voronoi calculations, but instead clones particles across the periodic boundaries to ensure proper periodicity (increasing the problem size substantially) and calls out to scipy in a way that is perhaps less than efficient and likely not parallelized. I have been working with a different set of Voronoi-based descriptors that is not yet implemented in pythia, and have also observed that this is expensive. |
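For reference, a rough sketch of the periodic-image approach described above, using scipy directly for an orthorhombic box. This is only an illustration of the idea and its 27x growth in problem size, not freud's actual implementation; the function name and example data are made up:

```python
import numpy as np
from scipy.spatial import Voronoi

def voronoi_with_periodic_images(points, box_lengths):
    """Tile an orthorhombic box with its 26 periodic images so that scipy's
    non-periodic Voronoi construction yields correct cells for the central
    copy. The problem handed to scipy is 27x larger than the original."""
    points = np.asarray(points, dtype=float)
    box_lengths = np.asarray(box_lengths, dtype=float)
    # Shift (0, 0, 0) comes first so the original particles are the first
    # len(points) input sites of the Voronoi diagram.
    shifts = np.array([[i, j, k]
                       for i in (0, -1, 1)
                       for j in (0, -1, 1)
                       for k in (0, -1, 1)], dtype=float)
    tiled = np.concatenate([points + shift * box_lengths for shift in shifts])
    return Voronoi(tiled)

# Example: cells of the first N input sites belong to the original particles.
rng = np.random.default_rng(0)
pts = rng.uniform(0.0, 10.0, size=(200, 3))
vor = voronoi_with_periodic_images(pts, box_lengths=(10.0, 10.0, 10.0))
central_regions = [vor.regions[vor.point_region[i]] for i in range(len(pts))]
```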
I'm not sure whether we are doing this yet, but a good start might be to create a test suite that does some benchmarking (and optionally profiling) so that we can reliably identify hot spots and potential regressions. |
I could make that the first half of my hackathon, if I don't get to it before then: just setting up some general benchmarking and profiling. I could solicit code from the group, general freud scripts that people expect to run faster than they do. I expect a lot of this will come down to inefficient coding, but it would be good to know where everything actually slows down. (Off topic for the package, but related to the discussion.)
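A bare-bones skeleton of what such a benchmarking/profiling harness could look like, using only the standard library; the workload function is a placeholder to be filled with the actual pythia call being measured:

```python
import cProfile
import pstats
import timeit

def run_descriptor():
    # Placeholder workload: swap in the pythia descriptor call (and input data)
    # being benchmarked.
    pass

# Coarse timing: the best of several repeats is less noisy than a single run.
times = timeit.repeat(run_descriptor, number=1, repeat=5)
print(f"best of 5: {min(times):.4f} s")

# Profiling: shows where the time actually goes (repeated freud object
# construction, np.append/np.concatenate inside loops, neighbor-list rebuilds).
profiler = cProfile.Profile()
profiler.runcall(run_descriptor)
pstats.Stats(profiler).sort_stats("cumulative").print_stats(20)
```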
@bdice I was particularly thinking of the descriptors where a range of nearest-neighbor counts is used. These seemed particularly slow, and I suspect they are generating a new neighbor list for each count.
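One way to avoid that, sketched here with scipy's k-d tree standing in for freud's neighbor machinery (no periodic boundaries, purely to illustrate the reuse pattern): query once for the largest count and slice for every smaller one.

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
points = rng.uniform(0.0, 20.0, size=(2000, 3))
neighbor_counts = (4, 6, 8, 12)
max_k = max(neighbor_counts)

# Query once for the largest neighbor count; k = max_k + 1 because each point
# is returned as its own nearest neighbor.
tree = cKDTree(points)
distances, indices = tree.query(points, k=max_k + 1)

# Results come back sorted by distance, so every smaller count is just a slice
# of the same arrays rather than a fresh neighbor search.
per_count_neighbors = {k: indices[:, 1:k + 1] for k in neighbor_counts}
```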
@j-proc I searched the codebase for [embedded code reference not preserved]
@j-proc The NearestNeighbors class in freud uses LinkCell internally and can be very slow in heterogeneous systems or when the LinkCell cell size is not well matched to the system. In general, having a good benchmark suite for both freud and pythia would be extremely helpful.
@bdice Apologies, I think I was just inserting what I expect common usage would be there. You're correct. I'm honestly not sure what all can be done, so Simon makes a very good point about doing the profiling first to see whether there's actually much to gain here. In the use cases I'd run in the past, it seemed like there was. But considering it's open source, one could always rewrite the functions to be faster for specific use cases.
Another specific optimization that could be tested, for lines 11 to 17 at f401f45: [embedded code snippet not preserved]
Pythia mostly contains calls to freud, so it does not seem that there would be substantial benefits to cythonizing it. However, there are places where freud objects are regenerated rather than reused, and it seems that many of the append and concatenate calls could be replaced by direct numpy allocation and assignment.
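As an illustration of that last point, a small self-contained comparison of the two allocation patterns; the per-frame descriptor function is only a stand-in:

```python
import numpy as np

n_frames, n_particles, n_features = 200, 1000, 12
rng = np.random.default_rng(0)

def descriptor_for_frame(frame_index):
    # Stand-in for a per-frame pythia/freud descriptor computation.
    return rng.standard_normal((n_particles, n_features))

# Pattern being criticized: growing the result with concatenate in a loop
# copies everything accumulated so far on every iteration.
result = np.empty((0, n_features))
for i in range(n_frames):
    result = np.concatenate([result, descriptor_for_frame(i)], axis=0)

# Alternative: allocate the full array once and assign into slices.
result = np.empty((n_frames * n_particles, n_features))
for i in range(n_frames):
    result[i * n_particles:(i + 1) * n_particles] = descriptor_for_frame(i)
```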
This would require replacing many functions with classes that perform the same feat but store the freud objects and perhaps intelligently decide when to regenerate (based on box changes?).
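A hypothetical sketch of what such a class could look like; the name, the choice of Steinhardt as the example descriptor, and the box-change bookkeeping are all assumptions, written against the freud 2.x-style API:

```python
import freud

class CachedSteinhardt:
    """Hypothetical sketch of the class-based API floated above: the freud
    compute object is created once and reused, and box-dependent bookkeeping
    is refreshed only when the box actually changes."""

    def __init__(self, l=6, num_neighbors=12):
        self._steinhardt = freud.order.Steinhardt(l=l)
        self._neighbors = {"num_neighbors": num_neighbors}
        self._last_box = None

    def __call__(self, box, positions):
        box_params = (box.Lx, box.Ly, box.Lz, box.xy, box.xz, box.yz)
        if box_params != self._last_box:
            # Placeholder: rebuild any cached, box-dependent state here
            # (Steinhardt itself keeps none, so this is only bookkeeping).
            self._last_box = box_params
        self._steinhardt.compute((box, positions), neighbors=self._neighbors)
        return self._steinhardt.particle_order.copy()
```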
Interested in other opinions on what else could speed it up or what API people would like to see used.