[ENH, MAINT] Refactor directed-undirected graph class #72
Draft: jaron-lee wants to merge 7 commits into py-why:main from jaron-lee:refactor_classes
Changes from 2 commits
Commits
7a57e01 add diungraph (jaron-lee)
328a656 fix failing test (jaron-lee)
1d3e990 template for is_valid functions (jaron-lee)
7800aee add framework for chain graph function and tests (jaron-lee)
d7f71b1 fix chain graph validity function (jaron-lee)
79f7457 update changelog (jaron-lee)
7d69f7f fix spelling errors (jaron-lee)
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,67 +8,8 @@ | |
from .base import AncestralMixin, ConservativeMixin | ||
|
||
|
||
class CPDAG(pywhy_nx.MixedEdgeGraph, AncestralMixin, ConservativeMixin): | ||
"""Completed partially directed acyclic graphs (CPDAG). | ||
|
||
CPDAGs generalize causal DAGs by allowing undirected edges. | ||
Undirected edges imply uncertainty in the orientation of the causal | ||
relationship. For example, ``A - B``, can be ``A -> B`` or ``A <- B``, | ||
allowing for a Markov equivalence class of DAGs for each CPDAG. | ||
|
||
Parameters | ||
---------- | ||
incoming_directed_edges : input directed edges (optional, default: None) | ||
Data to initialize directed edges. All arguments that are accepted | ||
by `networkx.DiGraph` are accepted. | ||
incoming_undirected_edges : input undirected edges (optional, default: None) | ||
Data to initialize undirected edges. All arguments that are accepted | ||
by `networkx.Graph` are accepted. | ||
directed_edge_name : str | ||
The name for the directed edges. By default 'directed'. | ||
undirected_edge_name : str | ||
The name for the undirected edges. By default 'undirected'. | ||
attr : keyword arguments, optional (default= no attributes) | ||
Attributes to add to graph as key=value pairs. | ||
|
||
See Also | ||
-------- | ||
networkx.DiGraph | ||
networkx.Graph | ||
pywhy_graphs.ADMG | ||
pywhy_graphs.networkx.MixedEdgeGraph | ||
|
||
Notes | ||
----- | ||
CPDAGs represent Markov equivalence classes of causal DAGs. The implicit assumption in | ||
these causal graphs is that the Structural Causal Model (or SCM) is Markovian, inducing | ||
causal sufficiency, where there is no unobserved latent confounder. This allows CPDAGs | ||
to be learned from score-based (such as the "GES" algorithm) and constraint-based | ||
(such as the PC algorithm) approaches for causal structure learning. | ||
|
||
One should not use CPDAGs if they suspect their data has unobserved latent confounders. | ||
|
||
**Edge Type Subgraphs** | ||
|
||
The data structure under the hood is stored in two networkx graphs: | ||
``networkx.Graph`` and ``networkx.DiGraph`` to represent the non-directed | ||
edges and directed edges. Non-directed edges in a CPDAG are present as | ||
undirected edges standing for uncertainty in which direction the directed | ||
edge points. | ||
|
||
- Directed edges (<-, ->, indicating causal relationship) = `networkx.DiGraph` | ||
The subgraph of directed edges may be accessed by the | ||
`CPDAG.sub_directed_graph`. Their edges in networkx format can be | ||
accessed by `CPDAG.directed_edges` and the corresponding name of the | ||
edge type by `CPDAG.directed_edge_name`. | ||
- Undirected edges (--, indicating uncertainty) = `networkx.Graph` | ||
The subgraph of undirected edges may be accessed by the | ||
`CPDAG.sub_undirected_graph`. Their edges in networkx format can be | ||
accessed by `CPDAG.undirected_edges` and the corresponding name of the | ||
edge type by `CPDAG.undirected_edge_name`. | ||
|
||
By definition, no cycles may exist due to the directed edges. | ||
""" | ||
class DiUnGraph(pywhy_nx.MixedEdgeGraph, AncestralMixin): | ||
""" """ | ||
|
||
def __init__( | ||
self, | ||
|
@@ -85,15 +26,6 @@ def __init__( | |
self._directed_name = directed_edge_name | ||
self._undirected_name = undirected_edge_name | ||
|
||
from pywhy_graphs import is_valid_mec_graph | ||
|
||
# check that construction of CPDAG was valid | ||
is_valid_mec_graph(self) | ||
|
||
# extended patterns store unfaithful triples | ||
# these can be used for conservative structure learning algorithm | ||
self._unfaithful_triples: Dict[FrozenSet[Node], None] = dict() | ||
|
||
@property | ||
def undirected_edge_name(self) -> str: | ||
"""Name of the undirected edge internal graph.""" | ||
|
@@ -184,6 +116,90 @@ def possible_parents(self, n: Node) -> Iterator[Node]: | |
""" | ||
return self.sub_undirected_graph().neighbors(n) | ||
|
||
|
||
class CPDAG(DiUnGraph, ConservativeMixin): | ||
"""Completed partially directed acyclic graphs (CPDAG). | ||
|
||
CPDAGs generalize causal DAGs by allowing undirected edges. | ||
Undirected edges imply uncertainty in the orientation of the causal | ||
relationship. For example, ``A - B``, can be ``A -> B`` or ``A <- B``, | ||
allowing for a Markov equivalence class of DAGs for each CPDAG. | ||
|
||
Parameters | ||
---------- | ||
incoming_directed_edges : input directed edges (optional, default: None) | ||
Data to initialize directed edges. All arguments that are accepted | ||
by `networkx.DiGraph` are accepted. | ||
incoming_undirected_edges : input undirected edges (optional, default: None) | ||
Data to initialize undirected edges. All arguments that are accepted | ||
by `networkx.Graph` are accepted. | ||
directed_edge_name : str | ||
The name for the directed edges. By default 'directed'. | ||
undirected_edge_name : str | ||
The name for the undirected edges. By default 'undirected'. | ||
attr : keyword arguments, optional (default= no attributes) | ||
Attributes to add to graph as key=value pairs. | ||
|
||
See Also | ||
-------- | ||
networkx.DiGraph | ||
networkx.Graph | ||
pywhy_graphs.ADMG | ||
pywhy_graphs.networkx.MixedEdgeGraph | ||
|
||
Notes | ||
----- | ||
CPDAGs represent Markov equivalence classes of causal DAGs. The implicit assumption in | ||
these causal graphs is that the Structural Causal Model (or SCM) is Markovian, inducing | ||
causal sufficiency, where there is no unobserved latent confounder. This allows CPDAGs | ||
to be learned from score-based (such as the "GES" algorithm) and constraint-based | ||
(such as the PC algorithm) approaches for causal structure learning. | ||
|
||
One should not use CPDAGs if they suspect their data has unobserved latent confounders. | ||
|
||
**Edge Type Subgraphs** | ||
|
||
The data structure under the hood is stored in two networkx graphs: | ||
``networkx.Graph`` and ``networkx.DiGraph`` to represent the non-directed | ||
edges and directed edges. | ||
|
||
- Directed edges (<-, ->, indicating causal relationship) = `networkx.DiGraph` | ||
The subgraph of directed edges may be accessed by the | ||
`CPDAG.sub_directed_graph`. Their edges in networkx format can be | ||
accessed by `CPDAG.directed_edges` and the corresponding name of the | ||
edge type by `CPDAG.directed_edge_name`. | ||
- Undirected edges (--, indicating uncertainty) = `networkx.Graph` | ||
The subgraph of undirected edges may be accessed by the | ||
`CPDAG.sub_undirected_graph`. Their edges in networkx format can be | ||
accessed by `CPDAG.undirected_edges` and the corresponding name of the | ||
edge type by `CPDAG.undirected_edge_name`. | ||
|
||
By definition, no cycles may exist due to the directed edges. | ||
""" | ||
|
||
def __init__( | ||
self, | ||
incoming_directed_edges=None, | ||
incoming_undirected_edges=None, | ||
directed_edge_name: str = "directed", | ||
undirected_edge_name: str = "undirected", | ||
**attr, | ||
): | ||
super().__init__( | ||
incoming_directed_edges=incoming_directed_edges, | ||
incoming_undirected_edges=incoming_undirected_edges, | ||
directed_edge_name=directed_edge_name, | ||
undirected_edge_name=undirected_edge_name, | ||
) | ||
from pywhy_graphs import is_valid_mec_graph | ||
|
||
# check that construction of CPDAG was valid | ||
is_valid_mec_graph(self) | ||
|
||
# extended patterns store unfaithful triples | ||
# these can be used for conservative structure learning algorithm | ||
self._unfaithful_triples: Dict[FrozenSet[Node], None] = dict() | ||
|
||
def add_edge(self, u_of_edge, v_of_edge, edge_type="all", **attr): | ||
from pywhy_graphs.algorithms.generic import _check_adding_cpdag_edge | ||
|
||
|
@@ -200,3 +216,87 @@ def add_edges_from(self, ebunch_to_add, edge_type, **attr): | |
self, u_of_edge=u_of_edge, v_of_edge=v_of_edge, edge_type=edge_type | ||
) | ||
return super().add_edges_from(ebunch_to_add, edge_type, **attr) | ||
|
||
|
||
class CG(DiUnGraph): | ||
"""Chain Graphs (CG). | ||
|
||
Chain graphs represent a generalization of DAGs and undirected graphs. | ||
Undirected edges ``A - B`` in a chain graph represent a symmetric association of | ||
two variables due to processes such as dynamic feedback (where ``A`` | ||
influences ``B`` and vice versa) or an artefact of selection bias (where the selection | ||
of the sample induces association between ``A`` and ``B``) [1]_. | ||
|
||
|
||
The implementation supports representation of both Lauritzen-Wermuth-Frydenberg (LWF) | ||
[Review comment] I like this. Maybe citations? |
||
and Andersen-Madigan-Perlman (AMP) chain graphs. | ||
|
||
|
||
Parameters | ||
---------- | ||
incoming_directed_edges : input directed edges (optional, default: None) | ||
Data to initialize directed edges. All arguments that are accepted | ||
by `networkx.DiGraph` are accepted. | ||
incoming_undirected_edges : input undirected edges (optional, default: None) | ||
Data to initialize undirected edges. All arguments that are accepted | ||
by `networkx.Graph` are accepted. | ||
directed_edge_name : str | ||
The name for the directed edges. By default 'directed'. | ||
undirected_edge_name : str | ||
The name for the undirected edges. By default 'undirected'. | ||
attr : keyword arguments, optional (default= no attributes) | ||
Attributes to add to graph as key=value pairs. | ||
|
||
References | ||
---------- | ||
.. [1] Lauritzen, Steffen L., and Thomas S. Richardson. "Chain | ||
graph models and their causal interpretations." Journal of the | ||
Royal Statistical Society: Series B (Statistical Methodology) | ||
64.3 (2002): 321-348. | ||
|
||
|
||
|
||
|
||
See Also | ||
-------- | ||
networkx.DiGraph | ||
networkx.Graph | ||
pywhy_graphs.ADMG | ||
pywhy_graphs.networkx.MixedEdgeGraph | ||
|
||
Notes | ||
----- | ||
**Edge Type Subgraphs** | ||
|
||
The data structure under the hood is stored in two networkx graphs: | ||
``networkx.Graph`` and ``networkx.DiGraph`` to represent the non-directed | ||
edges and directed edges. | ||
|
||
- Directed edges (<-, ->, indicating causal relationship) = `networkx.DiGraph` | ||
The subgraph of directed edges may be accessed by the | ||
`CG.sub_directed_graph`. Their edges in networkx format can be | ||
accessed by `CG.directed_edges` and the corresponding name of the | ||
edge type by `CG.directed_edge_name`. | ||
- Undirected edges (--, indicating symmetric association) = `networkx.Graph` | ||
The subgraph of undirected edges may be accessed by the | ||
`CG.sub_undirected_graph`. Their edges in networkx format can be | ||
accessed by `CG.undirected_edges` and the corresponding name of the | ||
edge type by `CG.undirected_edge_name`. | ||
|
||
By definition, no partially directed cycles may exist. | ||
""" | ||
|
||
def __init__( | ||
self, | ||
incoming_directed_edges=None, | ||
incoming_undirected_edges=None, | ||
directed_edge_name: str = "directed", | ||
[Review comment] whitespace around " = " |
||
undirected_edge_name: str = "undirected", | ||
**attr, | ||
): | ||
super().__init__( | ||
incoming_directed_edges=incoming_directed_edges, | ||
incoming_undirected_edges=incoming_undirected_edges, | ||
directed_edge_name=directed_edge_name, | ||
undirected_edge_name=undirected_edge_name, | ||
) |
Perhaps it is worth adding a note or warning to the classes that validity is not guaranteed, along the lines of "Please use the validity check if you need to rigorously check that your construction is valid"?
Same in chaingraph? Maybe also in the ADMG, and PAG too?
Yeah this is a good idea. I will address just CPDAG and Chain Graph in this PR but will look to redo the other classes in another PR.
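For reference, one way an explicit validity check for chain graphs could look is sketched below. This is not the PR's implementation. It is a plain-Python sketch under the usual definition that a chain graph has no partially directed cycle, that is, no cycle containing at least one directed edge in which every directed edge points the same way around. The function name `is_valid_chain_graph_sketch` and its edge-list arguments are hypothetical. The idea: merge nodes joined by undirected edges into chain components, reject any directed edge inside a component, then check that the directed edges between components form a DAG.

```python
from collections import deque
from typing import Dict, Hashable, Iterable, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def is_valid_chain_graph_sketch(
    directed: Iterable[Edge], undirected: Iterable[Edge]
) -> bool:
    """Return True if the edges contain no partially directed cycle."""
    directed = list(directed)
    undirected = list(undirected)
    parent: Dict[Hashable, Hashable] = {}

    def find(x: Hashable) -> Hashable:
        # Union-find lookup with path halving; unseen nodes start as their own root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # Chain components are the connected components of the undirected part.
    for u, v in undirected:
        parent[find(u)] = find(v)

    # Build the graph of components induced by the directed edges.
    succ: Dict[Hashable, Set[Hashable]] = {}
    indeg: Dict[Hashable, int] = {}
    for u, v in directed:
        cu, cv = find(u), find(v)
        if cu == cv:
            # A directed edge inside one component closes a partially directed cycle.
            return False
        for c in (cu, cv):
            succ.setdefault(c, set())
            indeg.setdefault(c, 0)
        if cv not in succ[cu]:
            succ[cu].add(cv)
            indeg[cv] += 1

    # Kahn's algorithm: every component is removed iff the component graph is acyclic.
    queue = deque(c for c, d in indeg.items() if d == 0)
    removed = 0
    while queue:
        c = queue.popleft()
        removed += 1
        for nxt in succ[c]:
            indeg[nxt] -= 1
            if indeg[nxt] == 0:
                queue.append(nxt)
    return removed == len(indeg)
```

A check of this kind could be what a class-level note points users to when construction itself does not guarantee validity.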