Add Graphs as States #210

Open
wants to merge 61 commits into
base: master

alip67
Collaborator

@alip67 alip67 commented Nov 6, 2024

Description:
The current States object requires appending dummy states in order to batch trajectories of varying lengths. Our approach instead aims to support Trajectories through a nested Batch object representation. The Data class in PyTorch Geometric represents the graph structure, while the Batch class, which encapsulates batching of Data objects and their efficient indexing, represents the GraphStates object.

The current implementation of Trajectory supports the indexing dimensions (num timesteps, num trajectories, state size). With a nested Batch-of-Batch object representing state Trajectories, the indexing would inherently take the form (num trajectories, num timesteps, state size). This approach requires implementing logic within `__getitem__()` and `__setitem__()` to handle this reordering internally.
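To make the index reordering concrete, here is a minimal sketch in plain Python. It does not use PyTorch Geometric or any class from this PR: `TrajectoryMajorStore` is a hypothetical name, and ragged Python lists stand in for the nested Batch. It only illustrates how `__getitem__` and `__setitem__` could accept the existing (timestep, trajectory) order while the storage is trajectory-major and unpadded.

```python
from typing import Generic, List, Tuple, TypeVar

S = TypeVar("S")


class TrajectoryMajorStore(Generic[S]):
    """Hypothetical container storing states as [trajectory][timestep]."""

    def __init__(self, trajectories: List[List[S]]) -> None:
        # Ragged lists: each trajectory keeps its own length, no dummy padding.
        self._trajs = [list(t) for t in trajectories]

    def __getitem__(self, index: Tuple[int, int]) -> S:
        # Callers keep the (timestep, trajectory) convention;
        # the swap to (trajectory, timestep) happens here.
        t, n = index
        return self._trajs[n][t]

    def __setitem__(self, index: Tuple[int, int], state: S) -> None:
        t, n = index
        self._trajs[n][t] = state

    def lengths(self) -> List[int]:
        # Number of stored timesteps per trajectory.
        return [len(t) for t in self._trajs]


store = TrajectoryMajorStore([["a0", "a1", "a2"], ["b0"]])
first = store[0, 1]      # timestep 0 of trajectory 1
store[2, 0] = "a2_new"   # overwrite timestep 2 of trajectory 0
```

Indexing a timestep that a shorter trajectory never reached raises `IndexError` in this sketch; a real implementation would have to decide whether to raise, skip, or return a sink state there.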

To Do:
Compatibility check with Trajectories, Transition class

@alip67 alip67 marked this pull request as draft November 6, 2024 12:43
@alip67 alip67 marked this pull request as ready for review November 6, 2024 12:50
@alip67 alip67 marked this pull request as draft November 6, 2024 12:51
@saleml
Collaborator

saleml commented Dec 6, 2024

Thank you @younik and @alip67 for this important PR. Is there a script we can play with to see the training of the environment you created?

@younik
Collaborator

younik commented Dec 6, 2024

> Thank you @younik and @alip67 for this important PR. Is there a script we can play with to see the training of the environment you created?

There are still some issues to fix that prevent it from running properly.
I am working on them and will post sample code showing how to use it.

@younik
Collaborator

younik commented Jan 14, 2025

It looks like GitHub is down. I will rerun CI tomorrow, but the tests are green locally.

Collaborator

@josephdviviano josephdviviano left a comment

First pass at a review. Thanks so much for your amazing work. I am going to mess around with the test cases and environment next to get a better understanding of how things function together. In the meantime, I have some questions.

@@ -255,21 +257,22 @@ def _step(
)

new_sink_states_idx = actions.is_exit
new_states.tensor[new_sink_states_idx] = self.sf
sf_tensor = self.States.make_sink_states_tensor((new_sink_states_idx.sum(),))
Collaborator

I think a comment would be worth adding here.

"""
self.s0 = s0.to(device_str)
self.features_dim = s0["node_feature"].shape[-1]
self.sf = sf
Collaborator

Perhaps we could have a special NoneTensorDict GraphState which acts like None but passes the relevant checks?

self.check_output_dim(out)
self._output_dim_is_checked = True

assert out.shape[-1] == 1
Collaborator

This also seems like a much harder constraint.

Collaborator

Why? The expected_output_dim is 1 in this class, so it seems the same to me (actually softer as I don't check the dtype).

) * edge_index_probs + epsilon * uniform_dist_probs
dists["edge_index"] = CategoricalIndexes(probs=edge_index_probs)

dists["features"] = Normal(module_output["features"], temperature)
Collaborator

I disagree that we should fix it here (i.e., in the Estimator). If we do, we should implement masking properly in the States class before the release of V2, because this is a clear design-pattern violation.

In principle there is no problem with adding multiple edges to and from the same node (in, say, a multi-attribute graph), but that makes things far more complex. I think we can safely avoid that complexity here, as single edges between the same pair of nodes will cover a lot of AI for Science applications in the near term.
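As a rough illustration of the masking under discussion, wherever it ends up living, the sketch below zeroes the probability of edges that already exist and renormalizes the rest. It uses plain Python dictionaries rather than tensors; `mask_existing_edges` is a hypothetical helper, not a function from this PR, and treating edges as directed `(source, target)` pairs is an assumption.

```python
from typing import Dict, Set, Tuple

Edge = Tuple[int, int]  # assumed directed (source, target) pair


def mask_existing_edges(
    probs: Dict[Edge, float], existing: Set[Edge]
) -> Dict[Edge, float]:
    """Zero the probability of edges already in the graph, then renormalize."""
    masked = {e: (0.0 if e in existing else p) for e, p in probs.items()}
    total = sum(masked.values())
    if total == 0.0:
        # Every candidate edge is already present; nothing valid to sample.
        raise ValueError("no valid edge left to add")
    return {e: p / total for e, p in masked.items()}


probs = {(0, 1): 0.5, (1, 2): 0.25, (0, 2): 0.25}
masked = mask_existing_edges(probs, existing={(0, 1)})
```

With this kind of mask applied before sampling, a duplicate edge can never be drawn, so the single-edge restriction is enforced by construction rather than checked after the fact.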

@saleml
Collaborator

saleml commented Jan 28, 2025

The code runs, thanks for the last change.

A few suggestions/questions:

  • If I understand correctly, the goal of the script is to sample ring graphs. I expect the proportion of sampled graphs that are rings to increase during training. Is it possible to validate that in the training loop? For example, every 10, 50, or 100 iterations (or whatever interval suits), you could generate N graphs and check which ones are rings.
  • Could you add short descriptions of the functions you are dealing with? For example, the state_evaluator function seems to be central to the code, so it is important to know what it does.
  • Could you merge master into this? Note that this might lead to pyright issues. You can solve most of them; some, such as gflownet.loss(), can be ignored using the same comment that other scripts use to ignore pyright issues.
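The ring check suggested in the first point could look roughly like the sketch below. It assumes an undirected simple cycle through all nodes as the definition of a ring, and that each sampled graph can be reduced to a node count plus an edge list; `is_ring` and `ring_fraction` are hypothetical helpers, not part of this PR.

```python
from typing import Dict, Iterable, List, Sequence, Set, Tuple

Edge = Tuple[int, int]


def is_ring(num_nodes: int, edges: Iterable[Edge]) -> bool:
    """True if the undirected graph is a single cycle through all nodes."""
    unique = {frozenset(e) for e in edges}  # (u, v) and (v, u) count once
    if num_nodes < 3 or len(unique) != num_nodes:
        return False
    neighbors: Dict[int, Set[int]] = {v: set() for v in range(num_nodes)}
    for e in unique:
        if len(e) != 2:  # self-loop
            return False
        u, v = tuple(e)
        if u not in neighbors or v not in neighbors:
            return False
        neighbors[u].add(v)
        neighbors[v].add(u)
    if any(len(nb) != 2 for nb in neighbors.values()):
        return False
    # All degrees are 2, so the graph is a ring iff it is connected.
    seen, stack = {0}, [0]
    while stack:
        for w in neighbors[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == num_nodes


def ring_fraction(graphs: Sequence[Tuple[int, List[Edge]]]) -> float:
    """Fraction of (num_nodes, edges) graphs that are rings."""
    if not graphs:
        return 0.0
    return sum(is_ring(n, e) for n, e in graphs) / len(graphs)


square = (4, [(0, 1), (1, 2), (2, 3), (3, 0)])
path = (4, [(0, 1), (1, 2), (2, 3)])
fraction = ring_fraction([square, path])
```

Inside the training loop, one could sample N graphs at a fixed interval, convert each to this `(num_nodes, edges)` form, and log `ring_fraction`; how that conversion is done depends on the final GraphStates interface.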


4 participants