-
Notifications
You must be signed in to change notification settings - Fork 255
Speed up from_graph_tool and clean up a little #403
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
When converting from networkx multigraphs, would it make sense to preserve the edge keys in an extra edge attribute, to make full round-tripping less lossy? This is important for some use cases, namely when we need to know which networkx edge corresponds to which igraph edge. E.g., when computing weighted shortest paths, it will be precisely one (and not the other) edge that is included in the path. I guess that those who need this can already manually copy keys to an edge attribute on the networkx side. But perhaps it's nicer to do it automatically? If we choose to do it, it would be nice to test it out through some practical use cases before settling on a solution. |
Thank you both. I've now gotten rid of the I had thought about the multi-edge problem a little, but couldn't quite come up with a clean way, so I'd be happy to hear your opinions. I think that in vertex1 -> 'hello'
vertex 2 -> 'ciao'
('hello', 'ciao') in g.edges -> True or False, depending if there's an edge there For So it seems to me that to store that information, we could get away with storing only the "secret index" of each edge iff the networkx structure is a Would you like me to add this feature into a property called "_nx_multiedge_index", in analogy with the current "_nx_name" for vertices? |
Sounds like a good idea -- however, it's not urgent and I'd like to merge this PR now so it does not deviate from the original topic. If you feel like you can put some time aside to work on the multi-edge issue, I very much appreciate the contribution, and please send a separate PR where we deal with it. |
yep sounds good, let's also hear from Szabolcs and the others in case they have opinions |
I never meant it to be added in this PR, just brought up the topic. This is not an "index" (i.e. it is not a number) but a "key". "Key" is the term networkx uses. It is also an arbitrary hashable. By default, networkx uses integers, starting from zero for the respective pair of vertices. Useful doc links I found:
This is fundamental difference in how igraph and networkx handle multi-edges. (Mathematica does the same as networkx, but IMO in a rather flawed way.) igraph identifies edges by their numerical index, which is global to the graph. Thus, igraph has a guaranteed edge ordering. networkx identifies edges by their endpoints (as Fabio pointed out). This is insufficient to distinguish multi-edges (and be able to e.g. give them distinct attributes). To get around this, networkx adds an extra key, so Currently, in I agree with Fabio's proposal, except for the name. I suggest "_nx_edge_key", or if you like, "_nx_multiedge_key". The terminology that networkx itself uses is "edge key". This is also shorter to type, so I prefer it. On the other hand, "multiedge key" better describes what this key really is: it belongs not to the graph, but to the specific vertex pair (and must be unique for the vertex pair, not for the graph). Additionally, before fully implementing this, we should try to test on real use cases. We should at least try a few small computations and make sure things work smoothly. Things to consider:
(Side note: I notice that |
All this is very interesting and indeed the topic of the pilot code in my own PR on transparent API. How to return data types for the practical examples you mention etc. Indeed, what you are describing is a transparent call to the igraph API from -> to a networkx object. I'd be happy to implement the additional key but also think a full support for these tires of functions is to be the goal of that PR, with some generic design in place... |
I wrote a lot (= too much) in the above comment, here's the executive summary: Before arriving to a final decision, we must test our ideas on practical use cases. These must work well:
Specific proposals I made (and a few I didn't mention above):
|
Yes, that's correct, I'll change this in the main branch. |
I was just going to say the same. It would also be good to discuss it at one of the meetings, or to have a separate longer meeting where someone would help me fill the gaps in my Python knowledge, which must certainly show in my suggestions above :-)
If there's no way to move multiple message within GitHub, then that's better. |
I don't think there's a way to move comments in GitHub so let's do that then. FYI, 0c0ebd7 adds support for |
Let us continue on Discourse: |
Adding the graph-tool equivalent of #402, which originally addressed #401
I also edited @szhorvat 's code a little to avoid
defaultdict
. Although it's fine in principle, having an explicit list of keys makes the code easier to read I think.Same principle applied to the
from_graph_tool
function. Thanks Szabolcs again for #402.