Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replacing Algebra.Graph by Maybe (Algebra.Graph.NonEmpty) #90

Open
nobrakal opened this issue Jul 3, 2018 · 13 comments
Open

Replacing Algebra.Graph by Maybe (Algebra.Graph.NonEmpty) #90

nobrakal opened this issue Jul 3, 2018 · 13 comments

Comments

@nobrakal
Copy link
Contributor

nobrakal commented Jul 3, 2018

Hi,

I have been working on a different implementation of Algebra.Graph, based on Algebra.Graph.NonEmpty.

The idea is to have:

newtype Graph a = G (Maybe (NonEmptyGraph a))

Using patterns, with:

pattern Empty :: Graph a
pattern Empty = G Nothing

pattern Vertex :: a -> Graph a
pattern Vertex a = G (Just (N.Vertex a))

pattern Overlay :: NonEmptyGraph a -> NonEmptyGraph a -> Graph a
pattern Overlay a b = G (Just (N.Overlay a b))

pattern Connect :: NonEmptyGraph a -> NonEmptyGraph a -> Graph a
pattern Connect a b = G (Just (N.Connect a b))

One can reproduce the traditional Algebra.Graph interface.

This big change can be summarized with some functions:

overlay :: Graph a -> Graph a -> Graph a
overlay l@(G a) r@(G b) = maybe r (\x -> maybe l (Overlay x) b) a

We now check the neutrality of empty directly into the overlay and connect functions.

edges :: [(a, a)] -> Graph a
edges = G . fmap N.edges1 . NL.nonEmpty

edges become more complicated and slower.

foldg :: b -> (a -> b) -> (b -> b -> b) -> (b -> b -> b) -> Graph a -> b
foldg e v o c (G g) = maybe e (N.foldg1 v o c) g

foldg is directly based on foldg1 (and this is the case for others functions, when possible).

removeEdge :: Eq a => a -> a -> Graph a -> Graph a
removeEdge s t (G g) = G $ N.removeEdge s t <$> g

fmap allow a pretty syntax when a function will act as the identity on the empty graph.

Some benefits

  • Type safety in the inner representation: one can be sure when pattern matching Overlay or Connect that the two graphs are both non-empty.
  • Faster graph inspection. Because of the dropped Empty constructor, foldg1 is faster than foldg. Thus, inspection functions are faster.

Some problems:

  • The creation takes longer from a list of edges, or anything, because we pattern match to detect the empty graph.
  • Graph modification takes longer, mainly this is affectinginduce because we have to wrap the intermediate results into the Maybe monad.
  • Old haddock (for ghc 7.8) have problems with patterns.

Benchs:

Using the pretty little script I made (on my own computer with graph with 1000 vertices):

isEmpty: 0.63 (good)
vertexList: 0.98 (OK)
vertexCount: 1.01 (OK)
hasVertex: 0.85 (good)
edgeCount: 0.91 (OK)
edgeList: 0.74 (good)
hasEdge: 0.45 (good)
addEdge: 1.22 (bad)
addVertex: 1.06 (OK)
removeVertex: 1.25 (bad)
equality: 1.01 (OK)
removeEdge: 1.00 (OK)
transpose: 0.84 (good)
creation: 1.29 (bad)

So for example addEdge was so fast, that there is drop here when we pattern match to see if it the previous gaph was empty or not.

(Note that I have sneaked the improvement on hasEdge of #88 (but it is even better here)).

The benchmarked code is here: https://github.com/nobrakal/alga/blob/MaybeNonEmpty/src/Algebra/Graph.hs

@snowleopard
Copy link
Owner

@nobrakal Many thanks for the experiment!

I have been wondering recently: do we even need graphs that can be empty? Presumably, non-empty graphs cover 99% of use-cases and are generally safer not only in terms of internal representation, but for the users too.

The creation takes longer from a list of edges, or anything, because we pattern match to detect the empty graph.

I don't understand this: it's just a single comparison, right? It shouldn't even be measurable.

@nobrakal
Copy link
Contributor Author

nobrakal commented Jul 4, 2018

I have been wondering recently: do we even need graphs that can be empty?

With the actual Empty constructor, I don't think so. But providing an encapsulated version using Maybe NonEmptyGraph:

  • allow to have "regular" functions (edges take a list, and not an NonEmpty one)
  • is coherent with functions from Algebra.Graph.NonEmpty (like induce1)
  • Will provide all the handy functions dealing with empty graphs, so a user who must deals with them have not to rewrite them, even if they are really dumb (like `isEmpty (G g) = isNothing g).
  • introduce the user to NonEmptyGraph, it may even worth a adding a mention saying that NonEmptyGraph is faster in all cases, and if they are dealing with nonEmptyOne

But I think we definitely need to focus more in NonEmpty :)

I don't understand this: it's just a single comparison, right? It shouldn't even be measurable.

I don't understand yet too... It is a huge drop for a single comparison !

@snowleopard
Copy link
Owner

I don't understand yet too... It is a huge drop for a single comparison !

Perhaps, it prevents some important optimisation? Could you investigate by looking at generated Core?

@nobrakal
Copy link
Contributor Author

nobrakal commented Jul 4, 2018

Perhaps, it prevents some important optimisation?

It was the case for edges. Habitually, there is a merge between a foldr and map (this is what we used to speedup overlays), and after some search, I found that it didn't happened here.

I fixed it by hand by writing:

edges :: [(a, a)] -> Graph a
edges [] = Empty
edges (x:xs) = G $ Just $ foldr (N.Overlay . uncurry N.edge) (uncurry N.edge x) xs

It dropped to ratio to 1.09, so it is good. I don't know if we can do better

@snowleopard
Copy link
Owner

@nobrakal Why not use edges1 in the second case? Then you don't need to reimplement it.

@nobrakal
Copy link
Contributor Author

nobrakal commented Jul 4, 2018

In fact I need to do the change in edges1 and use it from Algebra.Graph.edges.

edges1 (x :| xs) = foldr (Overlay . uncurry edge) (uncurry edge x) xs

The problem is that the combination of foldr1 and fmap (uncurry edge) does not behave very well ^^

@nobrakal
Copy link
Contributor Author

nobrakal commented Jul 6, 2018

The last benchs:

isEmpty: 0.63 (good) 
vertexList: 1.06 (OK)
VertexCount: 1.07 (OK) 
hasVertex: 5.13 (bad) 
edgeCount: 1.09 (OK) 
edgeList: 0.76 (good) 
hasEdge: 0.40 (good) 
hasSelfLoop: 1.17 (bad) 
addEdge: 1.08 (OK) 
addVertex: 1.07 (OK) 
removeVertex: 1.15 (bad) 
equality: 1.06 (OK)
removeEdge: 1.04 (OK) 
transpose: 1.01 (OK) 
creation: 0.97 (OK)

So everything is ok, note that the rough number for hasVertex is a false positive: it search for the vertex 0 which is at the end of the graph in the new representation, and at the begining for the actual one, so the times are not the same (since || looks first for the left side)

@nobrakal nobrakal closed this as completed Jul 6, 2018
@nobrakal nobrakal reopened this Jul 6, 2018
@snowleopard
Copy link
Owner

note that the rough number for hasVertex is a false positive

Hmm, this makes me think that the benchmarking suite is too fragile: it shouldn't measure performance by looking up only a single vertex. It should average time for several different vertices, otherwise some libraries/implementations will get an unfair advantage by chance.

@nobrakal
Copy link
Contributor Author

nobrakal commented Jul 6, 2018

Sorry, I wasn't clear. I actually test 4/5 vertices (for a graph with n vertices I test the search of vertex 0, n/3, 2*n/3, n+1).
But the difference is actually very huge for the first one (50 ns for the actual Alga but 3 micro seconds for the new one (3 micro seconds it the time taken by both for searching a vertex at the end of the graph )).

It might worth it to add some vertices, but I don't think it will really change the results ^^

@snowleopard
Copy link
Owner

@nobrakal Ah, I see, thanks for clarifying... Yes, that's what one gets for having O(n) complexity :) There is really a huge swing between best and worst cases.

@snowleopard
Copy link
Owner

Since we are now starting to look more carefully at non-empty graphs, I think it may be worth adding them into the regression suite... Perhaps, into a separate Travis instance, so we don't hit the time limit?

@nobrakal
Copy link
Contributor Author

nobrakal commented Jul 8, 2018

Yeah indeed ^^.

Perhaps, into a separate Travis instance

I totally agree, I will send a PR

@snowleopard
Copy link
Owner

snowleopard commented Dec 27, 2018

I just realised that NonEmpty.Graph (Maybe a) is isomorphic to Graph a:

  • NonEmpty.Vertex (Just a) ~ Vertex a
  • NonEmpty.Vertex Nothing ~ Empty

Not sure this observartion is useful, but I thought I better record it here.

I still think Maybe (NonEmpty.Graph a) is a better representation for possibly empty graphs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants