-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Performance Issues #72
Comments
I don't really know how to deal with it in this package though. I'm also leaning towards many other major changes (such as replacing Graphs with LightGraphs, modeling OSM elements more faithfully, and interoperability with DataFrames). Would you prefer for me to get the other package up to speed with most of the functionality here before dropping it in as a replacement, or to maintain a separate fork going forward? |
That all sounds great to me -- I'd guess with the major switches, it could On Thursday, September 24, 2015, Yeesian Ng [email protected]
|
I feel like I missed something big here. Are you planning to replace this package entirely? If there is a faster parser available, why not just incorporate it here instead? I can't see anyone arguing against a faster parser that just reads the nodes or ways. The same goes for Graphs vs. LightGraphs -- I think everyone's only preference would be towards something that works well. I guess my question is, why do you need to fork or start fresh? |
I'm on the road, so I haven't followed the link, but it sounded to me like On Thursday, September 24, 2015, Ted Steiner [email protected]
|
Sorry I was in a seminar. No, I don't have any intentions of replacing this package by registering a different package in METADATA. Any changes I'd like to see will still come in incremental PRs to this package. But I did need to start afresh, for the reasons sean mentioned: to experiment with a smaller core, and see how changes (that are beyond the scope of a PR) would work out in practice. It will be really draining to do multiple major-but-incremental PRs (I have tried) that change both the parsers and the object models simultaneously, while (i) making sure everything else works, (ii) demonstrating why the changes might be necessary, and (iii) maintaining a tidy commit history. |
There won't be any dataframes operations in the core openstreetmap package. The OSM package/objects should be designed only to reflect the nature of OSM objects, rather than providing sophisticated data operations. The allusion to dataframes is only to allow users to export/convert OSM objects to dataframes (for functionality that OSM will not provide) if they so desire. |
Oh, I misread. That makes sense. On Thursday, September 24, 2015, Yeesian Ng [email protected]
|
Cool, I'm all for it. Thanks for explaining, @yeesian! I was just a little thrown off by some of the text in the scope section your readme. If there's anything I can do to help make your life easier just let me know! |
I'm back in front of a computer, and after reading @yeesian's readme, I agree that the package structure he outlines is what we should have in the long term. Since I think @tedsteiner wants this package to remain a one-stop shop, which makes sense to have, I think that, once the design (apis and boundaries) settles (long term), making those individual packages offshoots / dependencies of this package, and registering those individual packages (for those that aren't served by always having graphing + plotting + ... dependencies in production) is the right goal. |
Sounds good to me. |
Let me know how I can help with LightGraphs. Specifically, if there's functionality you need that doesn't exist, or you have questions relating to existing functionality, please feel free to open up an issue there so we can discuss the best way to proceed. |
So I've got stuff working (with LightGraphs) and running on OpenStreetMapParser, and am using that for my research at the moment. I'm really busy at the moment, and won't have time to start doing PRs to this package for awhile though |
I've found the streaming parser to be too slow, mostly because it's maintaining a complicated state machine, and generating lots of objects during the parsing.
I think it's better to just slurp in the data, first, before dealing with the logic within julia. Here's an example:
With this package:
Under my proposal (implemented here):
The differences become significant when processing the following osm file in julia v0.4:
Before:
After:
The text was updated successfully, but these errors were encountered: