Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Graphs: fix isCyclic Digraph #3628

Open
wants to merge 1 commit into
base: development
Choose a base branch
from

Conversation

taboege
Copy link
Contributor

@taboege taboege commented Jan 17, 2025

This fixes the detection of cycles in directed graphs (see issue #3626) by not relying on the numbers from DFS but by attempting a topological sort. It succeeds if and only if the graph is acyclic. This reuses existing code as much as possible.

Also added tests that document previous bugs which are now fixed.

This fixes the detection of cycles in directed graphs (see issue Macaulay2#3626) by
not relying on the numbers from DFS but by attempting a topological sort.
It succeeds if and only if the graph is acyclic. This reuses existing code
as much as possible.

Also added tests that document previous bugs which are now fixed.
@d-torrance
Copy link
Member

This change broke a test in the StatGraphs package:

/Users/runner/work/M2/M2/M2/Macaulay2/packages/StatGraphs.m2:1517:5-1523:3 error:
 --                 Graph => Graph{1 => {2, 4, 5}}
 --                                2 => {1, 3}
 --                                3 => {2, 4}
 --                                4 => {1, 3}
 --                                5 => {1}
 -- 
 -- o4 : MixedGraph
 -- 
 -- i5 : 	   assert(not isCyclic G)    
 -- stdio:6:17:(3): error: assertion failed
 -- 
stdio:1:5:(3): error: test(s) #13 of package StatGraphs failed.

@taboege
Copy link
Contributor Author

taboege commented Jan 17, 2025

Yes, I have been looking at that but I'm confused on multiple levels.

(1) When I step through the code below with the failing test no. 13:

for i from 0 to n - 1 do
(
for j from 0 to n - 1 do
(if not submatrix(adjacencyMatrix D, toList(set(allComp_i)*set(vertices D)), toList(set(allComp_j)*set(vertices D)))==0 then adjMG_(i,j)=1 else adjMG_(i,j)=0
));

A 7x7 adjacencyMatrix D is indexed using node labels of the graph as indices, but the node labels are {1, 2, 3, 4, 5, 7, 8} which triggers an array out of bounds error that I don't see when just calling check(StatGraphs).

(2) When I fix the matrix indexing issue and continue stepping through the code, it runs through and produces a result which contradicts the assertion in the test. The isCyclic method in StatGraphs in this case constructs the following matrix:

1 1 1 1
1 1 1 1
0 0 0 0
0 0 0 0

As an adjacency matrix of a directed graph, it exhibits many cycles. Even before my commit, isCyclic Digraph would detect that this is cyclic. So I don't know what causes the result to differ.

(3) For all I know, the error could be in my commit, the StatGraphs code or the StatGraphs assertion. I would have to locate the definition of cycle for mixed graphs to narrow it down.

@taboege
Copy link
Contributor Author

taboege commented Jan 17, 2025

Most of my confusion cleared right up after commenting. I overlooked this crucial line which causes the indexing problems not to appear:

G:=indexLabelGraph g;

@taboege
Copy link
Contributor Author

taboege commented Jan 17, 2025

Now I understand how the test result differs. On test no. 13, the isCyclic in StatGraphs constructs the following adjacency matrix of a directed graph:

1 1 0 0
0 0 0 0
0 0 0 1
0 0 0 0

Up until my commit, this graph was reported as being acyclic. My change made it cyclic. I would argue that the self-loop at the first node is a cycle (meaning that either the StatGraphs code or the StatGraphs test assertion is wrong). But I have to do some reading to see what the definition of cycles in this class of graphs is.

@taboege
Copy link
Contributor Author

taboege commented Jan 27, 2025

I have heard from one of the developers of StatGraphs that their main source is the paper of Lauritzen and Sadeghi cited prominently in the documentation. With this definition, I believe that the assertion of the failing test (which is also the first example in the documentation of isCyclic) is wrong. The mixed graph has undirected edges 1 -- 2 -- 3 and a directed edge 3 -> 1 back. According to page 4 of Lauritzen–Sadeghi, this is a cycle.

On a tangent, I am also not sure if the algorithm that decides isCyclic MixedGraph is correct. At least it does not match its documentation:

A directed cycle is a cycle in the Digraph constructed from a mixed graph G by identifying all connected components on bidirected and undirected edges. Such a connected component contains either bidirected edges only or undirected edges only.

What the implementation does is use the connected components of the bidirected and the undirected parts. But these components might still have edges of other types between them. Or maybe I'm misunderstanding the sentence?

It would be most helpful if one of the developers of the package could comment on the definition of cyclic being used and the intent of the code @lukeamendola @luisgarciapuente @olgakuznetsova @harshitmotwani2015

@olgakuznetsova
Copy link
Contributor

olgakuznetsova commented Jan 28, 2025 via email

@taboege
Copy link
Contributor Author

taboege commented Jan 28, 2025

As for the other question, we ignore all edges that are not directed and only check for cycles created by the directed edges.

I don't understand. Is the purpose of isCyclic to check if the directed part of the mixed graph has a directed cycle, ignoring the bidirected and undirected parts completely? This is not the definition of cycle according to the mixed graphs paper. (But it would be compatible with the definition of cycles for less complex mixed graphs such as those modeling latent variables).

However, this is not what the code does, not even close to what it tries to do. The code identifies vertices that have (separately) undirected or bidirected edges between them and then considers directed edges between those classes. This part is in line with the documentation:

In the next example, there are no cycles inside the digraph of the mixed graph, but there is a directed cycle after you identify the vertices {1,2} and {3,4}.

Lastly, I want to note that the current implementation of isCyclic in StatGraphs uses an algorithm which does not always produce the correct result in the sense of Lauritzen and Sadeghi. I haven't found this algorithm anywhere in the literature and why it is correct (or which definition of "cycle" it relies on). Here is an example:

needsPackage "StatGraphs";
U = graph({2,3},{{2,3}});
D = digraph({1,2},{{1,2}});
B = bigraph({1,3},{{1,3}});
G = mixedGraph(U,D,B);
print isCyclic G --> false

The algorithm computes an auxiliary directed graph on the connected components of U and B, so the edge set of that auxiliary graph is {{1,3}, {2,3}} and those two nodes have a single directed edge between them since 1 -> 2 is a directed edge in the original graph. This digraph is not cyclic. However, the original mixed graph is cyclic according to Lauritzen and Sadeghi.

I'm still puzzled by what the intent behind isCyclic should be. If it should only check cyclicity of the digraph part, then both its implementation and its documentation are way off (and that is why test no. 13 is failing because of my change to the Graphs package). If it should check Lauritzen and Sadeghi's definition, then its implementation is wrong and the test assertion is wrong. Both can be fixed easily, I just don't know which behavior is intended.

@olgakuznetsova
Copy link
Contributor

olgakuznetsova commented Jan 28, 2025 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants