Skip to content
This repository has been archived by the owner on Oct 21, 2021. It is now read-only.

Testing Branch (Naive edge/vertex removal) #87

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

tcfuji
Copy link

@tcfuji tcfuji commented May 8, 2014

@pozorvlak Here is the failed attempt at testing remove_edge!. Even though testing works here, it still does not work locally.

@tcfuji
Copy link
Author

tcfuji commented May 9, 2014

This may explain why the build is failing:

julia> g = simple_graph(5, is_directed=false)
Undirected Graph (5 vertices, 0 edges)

julia> add_edge!(g, 1, 2)
edge [1]: 1 -- 2

julia> add_edge!(g, 1, 3)
edge [2]: 1 -- 3

julia> add_edge!(g, 1, 4)
edge [3]: 1 -- 4

julia> add_edge!(g, 1, 5)
edge [4]: 1 -- 5

julia> add_edge!(g, 2, 5)
edge [5]: 2 -- 5

julia> add_edge!(g, 4, 5)
edge [6]: 4 -- 5

julia> g
Undirected Graph (5 vertices, 6 edges)

julia> edges(g)
6-element Array{Edge{Int64},1}:
 edge [1]: 1 -- 2
 edge [2]: 1 -- 3
 edge [3]: 1 -- 4
 edge [4]: 1 -- 5
 edge [5]: 2 -- 5
 edge [6]: 4 -- 5

julia> remove_edge!(g, 1, 3)
A edge [2]: 3 -- 1
B edge [2]: 3 -- 1
removed edge at index 1
C edge [1]: 2 -- 1
C edge [2]: 3 -- 1
removed edge at index 2

julia> remove_edge!(g, 1, 4)
A edge [3]: 4 -- 1
B edge [3]: 4 -- 1
removed edge at index 1
C edge [1]: 2 -- 1
C edge [3]: 4 -- 1
removed edge at index 2

julia> remove_edge!(g, 2, 5)
ERROR: BoundsError()

julia> edges(g)
4-element Array{Edge{Int64},1}:
 edge [1]: 1 -- 2
 edge [3]: 1 -- 4
 edge [5]: 2 -- 5
 edge [6]: 4 -- 5

julia> remove_edge!(g, 2, 5)
ERROR: BoundsError()



julia> remove_edge!(g, 4, 5)
ERROR: BoundsError()

julia> remove_edge!(g, 1, 2)
A edge [1]: 2 -- 1
B edge [1]: 2 -- 1
removed edge at index 1
C edge [1]: 2 -- 1
removed edge at index 1

julia> g
Undirected Graph (5 vertices, 3 edges)

julia> add_edge!(g, 3, 4)
edge [4]: 3 -- 4

julia> remove_edge!(g, 3, 4)
A edge [4]: 4 -- 3
B edge [4]: 4 -- 3
removed edge at index 1
C edge [4]: 4 -- 3
removed edge at index 1

julia> remove_edge!(g, 2, 5)
ERROR: BoundsError()

julia> edges(g)
3-element Array{Edge{Int64},1}:
 edge [3]: 1 -- 4
 edge [5]: 2 -- 5
 edge [6]: 4 -- 5

julia> edge = edges(g)[2]
edge [5]: 2 -- 5

julia> edge_index(edge, g)
5

julia> splice!(g.edges, edge_index(edge, g))
ERROR: BoundsError()

julia> length(g.edges)
3

In other words, the edge index does not change as the array of edges changes. Hence, it is possible that an edge will have an index that is greater than the length of the edge array. I think I can hack my way around this but I'm wondering if this is cause for enough concern that Graphs.jl should change the way it indexes edges. Any thoughts?

@pozorvlak
Copy link
Contributor

Ah, yes, that makes sense. Huh. I'll think about it, but I'll be busy moving house for the next few days so can't devote too much attention to it right now - sorry!

@tcfuji
Copy link
Author

tcfuji commented May 9, 2014

No worries. I need a break from this too. :)

For future reference:
If e in edges(g) == true, shouldn't it follow that revedge(e) in edges(g) == true for undirected graphs?

@tcfuji
Copy link
Author

tcfuji commented May 26, 2014

@pozorvlak Any headway regarding edge indexing?

@kmsquire
Copy link
Contributor

@TFGIT, @pozorvlak, I ran across a similar issue with deletes from OrderedDicts (in DataStructures.jl). Just a couple of thoughts:

  1. If it's okay for edge numbers to change, then you could just "unset" the entry, instead of deleting it, and periodically compact and renumber (that's how OrderedDicts handles it).
  2. As long as the list is maintained in sorted order, searchsortedfirst will give you O(n log n) lookups of the index, or a Dict will give you O(1) (with a large constant). Neither is as nice as direct indexing, of course, and especially for large graphs, you would really want something better. But it would work.

@lindahua
Copy link
Contributor

A graph needs additional mechanism to maintain the edge list (e.g. renumbering, internal hashing, etc) in order to support the functionality of removing edges. This would involve considerable costs, and is not something that every graph need to have.

I think a better approach is to introduce new graph types and only implement remove_edge over those graphs (and maybe we introduce a new interface name for them).

@tcfuji
Copy link
Author

tcfuji commented Aug 21, 2014

@lindahua From the user's perspective, the ability to remove or exclude edges/vertices is an essential part of graph construction. It would be strange for the default graph object to not have that kind of functionality. Perhaps the opposite would be more appropriate: let the default graph type have edge/vertex removal and introduce a "fast graph" type without the removal abilities.

I'm still willing to help out with this (I can't use the package without edge/vertex removal), but can you (or anyone) steer me to the right direction regarding making my PRs of acceptable quality? I'm not adept at optimizing Julia.

@sbromberger
Copy link
Contributor

I'd like to second @tcfuji on this. There should be (at least) a class of graph types that allows for removal of edges / vertices. Applications modeling dynamic graphs demand such functionality and it's present in other graph libraries.

If there are performance issues related to node/edge removal, then those should be explicitly documented, but there is a practical need for such behavior.

@timholy
Copy link
Contributor

timholy commented Dec 31, 2014

I suspect if somebody submits a pull request, others will chime in with recommendations about how to make it more efficient.

@sbromberger
Copy link
Contributor

Comments appreciated: https://gist.github.com/sbromberger/1dddcaed48bf0f487ec9

This works for SimpleGraph. It was much easier to regenerate a graph with the removed components than it was to try to reindex it on the fly.

Example:

g = simple_graph(4)
add_edge!(g,1,2); add_edge!(g,1,3); add_edge!(g,2,3); add_edge!(g,3,4)
dynamize!(g)
delete_vertex!(g,2)
reindex!(g)

(Edited to change the gist)

(Thanks to @mlubin for the idea of regenerating.)

@timholy
Copy link
Contributor

timholy commented Jan 2, 2015

I would call this "semi-dynamic"; in particular, if you have a work flow in which you (say) use a priority queue to make some decisions after each node/edge addition/deletion, then reindex! will be a bottleneck, because you'll have to regenerate the whole graph after making a single change. I'm not saying there isn't room for this kind of implementation, just pointing out the main weakness.

What about this kind of organization? (Note I haven't read through the Graphs.jl code in a long time, so this is surely not idiomatic.)

immutable G  # this is a "DynamicUndirectedGraph", but that was too long to type
    links::Array{Set{Int},1}
    deleted::Array{Bool, 1}
end

function add_edge!(g::G, i, j)
    push!(g.links[i], j)
    push!(g.links[j], i)
end

function delete_vertex!(g, i)
    for j in g.links[i]
        delete!(g.links[j], i)
    end
    empty!(g.links[i])
    g.deleted[i] = true
end

julia> g = G([Set{Int}() for i = 1:4], [false for i = 1:4])
G([Set{Int64}(),Set{Int64}(),Set{Int64}(),Set{Int64}()],Bool[false,false,false,false])

julia> add_edge!(g,1,2); add_edge!(g,1,3); add_edge!(g,2,3); add_edge!(g,3,4)
Set([3])

julia> g
G([Set([2,3]),Set([3,1]),Set([4,2,1]),Set([3])],Bool[false,false,false,false])

julia> delete_vertex!(g, 2)
true

julia> g
G([Set([3]),Set{Int64}(),Set([4,1]),Set([3])],Bool[false,true,false,false])

@timholy
Copy link
Contributor

timholy commented Jan 2, 2015

Note that this also allows an easy implementation of delete_edge!(g, i, j).

@sbromberger
Copy link
Contributor

I think one of the issues with this approach is that the graph still requires incidence lists in order to take advantage of things like dijkstra_shortest_paths - and there's a tradeoff between ease/speed of graph modification and ease/speed of generating the incidence list (among other attributes - there's no practical way of implementing edge_index in that example above, and that's a required interface).

Unless I'm completely wrong in my understanding of that code, in which case feel free to ignore.

ETA: I spent a fair amount of time on this yesterday, so wanted to get these thoughts down while they were still fresh: there seems to be no great way of creating dynamic graphs (those with removable vertices/edges) that doesn't have a tradeoff either in standard performance or in reindexing/removal.

In my particular case, I do one set of removals for every couple hundred thousand other operations, so it's better to have the removals (via reindex!) be expensive relative to the other operations. @timholy - you're right that "dynamic" may be a misnomer.

There are also hidden tradeoffs in the code above (and my code) - multigraphs are specifically disallowed. That is, you can create a multigraph, but a call to delete_edge! will remove all of them.

@timholy
Copy link
Contributor

timholy commented Jan 2, 2015

In my implementation the graph is undirected, so every outgoing link is also an incoming one. Thus, g.links[i] is the incidence list for vertex i.

To me it seems that vertex_index and edge_index are tightly coupled to a particular implementation (based on Vector containers); personally I'm skeptical that those should be considered first-class citizens.

@timholy
Copy link
Contributor

timholy commented Jan 2, 2015

All the rest of the functions in http://graphsjl-docs.readthedocs.org/en/latest/interface.html do look more generic, though.

@sbromberger
Copy link
Contributor

@timholy vertex_index is ok; but edge_index - which is used for dijkstra_shortest_paths, for example, is impossible with sets (unless there's some functionality I don't know about)*. @lindahua has already stated that we should be revisiting indexing, but after having tried several alternate implementations over the last couple of months, I don't know a better way of doing it (except for simplifying the graph structure entirely - see, e.g., #143.

*What made this particularly frustrating yesterday is that there was nothing in the documentation that described an additional constraint of edge_index - namely, that each index needed to be <= num_vertices. That undocumented limitation destroyed a clever bit of work I did using object_id to create unique edge indices.

@timholy
Copy link
Contributor

timholy commented Jan 2, 2015

It seems like edge_index is only used to retrieve edge_properties, and (while I don't understand the Graphs.jl code in detail) I don't see why one couldn't fetch edge properties based on Pair(i,j).

@timholy
Copy link
Contributor

timholy commented Jan 2, 2015

Note also that dijkstra_shortest_paths does not declare a dependency on the EdgeMap interface.

@sbromberger
Copy link
Contributor

Oy. You may be right there - it's begun to blur. ISTR there was some functionality that didn't work without edge_index, but perhaps I was mistaken that it was d_s_p. I'll try to remember correctly. d_s_p requires the same of vertex_index, however - namely, that each index be <= num_vertices. That's why object_id didn't work for vertices.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants