Implement inner using BP #147

Merged: 41 commits merged into ITensor:main from the logcontract branch, Apr 12, 2024
Conversation

@JoeyT1994
Contributor

This PR refactors the `inner` interface in ITensorNetworks.jl. Most importantly, it adds algorithmic backends; "bp" and "exact" are currently supported.

For instance, `inner(x::AbstractITensorNetwork, A::AbstractITensorNetwork, y::AbstractITensorNetwork; alg=default_alg)` and `inner(x::AbstractITensorNetwork, y::AbstractITensorNetwork; alg=default_alg)` default to "bp" (with a sensible partitioning) if all the arguments are trees, and to "exact" otherwise.
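A minimal usage sketch of the new interface (not from the PR; it assumes the package's `named_grid`, `siteinds`, and `random_tensornetwork` helpers and that `inner` is importable from ITensorNetworks):

```julia
using ITensorNetworks: inner, random_tensornetwork, siteinds
using NamedGraphs: named_grid

g = named_grid((4, 1))                    # a path graph, so both networks are trees
s = siteinds("S=1/2", g)                  # physical site indices on each vertex
x = random_tensornetwork(s; link_space=2)
y = random_tensornetwork(s; link_space=2)

inner(x, y)                               # trees: defaults to alg = "bp"
inner(x, y; alg="exact")                  # force the exact contraction backend
```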

The assumption in `inner(...)` is that the external indices of the arguments are all aligned, so the networks stack properly under `disjoint_union` and don't need mapping; an optional link-index-map kwarg (defaulting to `sim`) is applied to `x` to avoid index clashes. This is consistent with the current interface of ITensors.jl.

Logarithmic versions of `inner`, i.e. `loginner(...)`, are also provided. In the "bp" case this is neatly defined and helps to avoid overflow/underflow. In the "exact" case it just calls `log(inner(...))`, so it is perhaps a bit redundant.
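Continuing the sketch above, and assuming `loginner` is importable alongside `inner`:

```julia
using ITensorNetworks: loginner

loginner(x, y; alg="bp")         # log of the overlap, avoiding overflow/underflow
log(inner(x, y; alg="exact"))    # what loginner(x, y; alg="exact") reduces to
```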

Testing is added via `test_inner.jl`, and the other standard tests that rely on `inner` now dispatch to this implementation and all pass. Additionally, tests in `test_treetensornetworks` dispatch to this version rather than the one in `src/treetensornetworks/abstracttreetensornetwork.jl`, unless the tensor network is specifically wrapped in the `TTN` type.

A few points of note / things to think about resolving:

  1. If all the elements of a tensor network `x` are 0, so that `norm_sqr(x::AbstractITensorNetwork) == 0`, then the BP method returns `NaN`, since it computes the ratio 0/0 (see the sketches after this list). This edge case is therefore not well handled by the BP code, because the current BP backend computes `inner()` via a global method. A local method can be defined for a tree to solve this (it is how Benedikt's code in `src/treetensornetworks/abstracttreetensornetwork.jl` works), and we could then define `alg = "bp_global"` and `alg = "bp_local_tree"` to let the user choose between the algorithms.
  2. Currently `inner(alg::Algorithm"exact", ...)` passes to the `inner_network(...)` constructor, which I cleaned up a bit: `inner_network` now forwards to `flatten_networks`, but takes the `dag` of the first argument, and its keyword arguments default to no flattening or combining (which are useful kwargs for "exact"). `inner(alg::Algorithm"bp", ...)`, instead, internally builds the relevant `ITensorNetwork` via `disjoint_union` (also sketched below). This is because the `flatten_networks` builder produces confusing vertex names for more than two networks, which makes it difficult to define a good `default_partitioning`.
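Point 1, sketched (names as in the sketch above; the way of zeroing the tensors is hypothetical and the exact API may differ):

```julia
using ITensorNetworks: vertices   # assumed re-export of the graph interface

z = random_tensornetwork(s; link_space=2)
for v in vertices(z)
  z[v] = 0 * z[v]                 # zero out every tensor, so norm_sqr(z) == 0
end
inner(z, z; alg="bp")             # BP forms the ratio 0/0 and returns NaN
```

Point 2, sketched (signatures and index handling assumed, not taken verbatim from the code):

```julia
tn_exact = inner_network(x, y)    # "exact": forwards to flatten_networks, dag applied to x
tn_bp = dag(x) ⊗ y                # "bp": stacks the networks via disjoint_union
```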

@mtfishman @b-kloss

@mtfishman
Member

Can't the BP code detect if the norm is zero at some point? Presumably the message tensors become zero?

@mtfishman
Member

Note that I was advocating for removing `hascommoninds(::typeof(siteinds), A::AbstractITensorNetwork, B::AbstractITensorNetwork)` and `check_hascommoninds(::typeof(siteinds), A::AbstractITensorNetwork, B::AbstractITensorNetwork)` here: #144 (comment), in case that is relevant.

@JoeyT1994
Contributor Author

> Note that I was advocating for removing `hascommoninds(::typeof(siteinds), A::AbstractITensorNetwork, B::AbstractITensorNetwork)` and `check_hascommoninds(::typeof(siteinds), A::AbstractITensorNetwork, B::AbstractITensorNetwork)` here: #144 (comment), in case that is relevant.

Hmm, yeah, I'm not sure what the bug is here because I don't see it locally on my machine. I thought something had changed in ITensors.jl so that it was importing something that no longer exists. Let me try to find the issue.

@JoeyT1994
Contributor Author

> Can't the BP code detect if the norm is zero at some point? Presumably the message tensors become zero?

Yeah, I wonder: if the norm of the message tensors is zero, does that guarantee (assuming you didn't initialize the message tensors to 0) that the norm of the wave function is zero? I assume so? In that case we could add that catch in the `inner(alg::Algorithm"bp", ...)` code.

@mtfishman
Member

> Can't the BP code detect if the norm is zero at some point? Presumably the message tensors become zero?
>
> Yeah, I wonder: if the norm of the message tensors is zero, does that guarantee (assuming you didn't initialize the message tensors to 0) that the norm of the wave function is zero? I assume so? In that case we could add that catch in the `inner(alg::Algorithm"bp", ...)` code.

I suppose a message could accidentally end up being zero without the norm of the network being zero, but that seems unlikely, and it would also be a problem for BP anyway: the message would remain zero and cause other messages to become zero during updates involving it. A fix for that situation could be to randomize the message and see whether it ends up back at zero. But my point is that this seems like something we should try to handle at the level of the BP code, rather than in higher-level code like `inner`, since it could be a potential issue for many functions that use BP, not just `inner`.

@mtfishman
Member

I think probably for running BP on a norm network you can show that the only way a message can end up zero is if the network is zero or if it is initialized to zero, since each update has the form of a CP map.
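For reference, a sketch of why the update is completely positive (notation assumed here, not taken from the thread): on the norm network of a state with site tensors $T_v$, the message update can be written in Kraus form,

$$
m_{v \to w} \;=\; \operatorname{Tr}_{u \neq w}\!\left[\, T_v \Big( \bigotimes_{u \neq w} m_{u \to v} \Big) T_v^{\dagger} \right] \;=\; \sum_k A_k \Big( \bigotimes_{u \neq w} m_{u \to v} \Big) A_k^{\dagger},
$$

where the Kraus operators $A_k$ come from slicing $T_v$ over the traced indices, so positive semidefinite messages stay positive semidefinite under updates.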

@mtfishman
Member

mtfishman commented Mar 27, 2024

> Note that I was advocating for removing `hascommoninds(::typeof(siteinds), A::AbstractITensorNetwork, B::AbstractITensorNetwork)` and `check_hascommoninds(::typeof(siteinds), A::AbstractITensorNetwork, B::AbstractITensorNetwork)` here: #144 (comment), in case that is relevant.
>
> Hmm, yeah, I'm not sure what the bug is here because I don't see it locally on my machine. I thought something had changed in ITensors.jl so that it was importing something that no longer exists. Let me try to find the issue.

It may be related to ITensor/ITensors.jl#1356: we moved all of the MPS/MPO code into an ITensorMPS submodule of ITensors.jl, which changed the namespaces of some internal functions, probably including `check_hascommoninds` (i.e. `check_hascommoninds` probably isn't available from ITensors anymore, and instead must be accessed from `ITensors.ITensorMPS`).
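An illustration of the import change this implies (hedged: it assumes `check_hascommoninds` survived the move as an internal function of the submodule):

```julia
using ITensors

# Before ITensor/ITensors.jl#1356 this worked:
#   using ITensors: check_hascommoninds
# After the MPS/MPO code moved into the ITensorMPS submodule:
using ITensors.ITensorMPS: check_hascommoninds
```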

But if we have to circumvent issues around `check_hascommoninds` anyway, I think we may as well just remove it, since I don't think it is serving a good purpose in ITensorNetworks.

@JoeyT1994
Contributor Author

Okay, I can remove it. Shall I remove the calls to `check_hascommoninds` in the treetensornetwork code then? Presumably it is just being used for some assertions.

@mtfishman
Member

> Okay, I can remove it. Shall I remove the calls to `check_hascommoninds` in the treetensornetwork code then? Presumably it is just being used for some assertions.

Yeah, thanks! Can you also remove `hascommoninds(::typeof(siteinds), A::AbstractITensorNetwork, B::AbstractITensorNetwork)` as well?

@JoeyT1994
Contributor Author

@mtfishman actually this seems to be affecting more than just `check_hascommoninds` due to the ITensors.jl changes; I can try to hunt the issues down.

@mtfishman
Member

Seems like we may as well just remove `src/treetensornetworks/solvers/deprecated` entirely instead of trying to update it.

@mtfishman
Member

Also `src/treetensornetworks/deprecated_opsum_to_ttn.jl`.

@mtfishman
Member

@JoeyT1994 reconsidering this, I'm concerned this PR is going to get messy with all of these other changes. Do you mind if I make a PR updating to the latest ITensors version, removing deprecated code, removing `check_hascommoninds`, etc., and then you can merge that into this PR?

@JoeyT1994
Contributor Author

> I think probably for running BP on a norm network you can show that the only way a message can end up zero is if the network is zero or if it is initialized to zero, since each update has the form of a CP map.

Yes, you are right that if it is a CP map, a message being zero directly implies one of those two things. Perhaps with a non-CP map there is a possibility the outgoing message ends up being zero by virtue of the incoming message being a zero eigenvector of the local tensor, but that seems a weird case and likely implies the tensor contracts to 0 too.

In principle, if a message is 0 then the BP code should rapidly converge to 0 in a few sweeps (as all the other messages end up zero), and that implies something about the tensor network. So it seems to me this isn't actually something the BP code needs to guard against; it is part of the algorithm. The exception is if we are trying to normalize the message tensors, in which case we should flag an error?
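A sketch of the kind of guard that last sentence suggests, at the message-normalization step (the function name and its placement are illustrative, not the package's actual API):

```julia
using ITensors: ITensor
using LinearAlgebra: norm

function normalize_message(m::ITensor)
  n = norm(m)
  # A zero message cannot be normalized; per the discussion above this likely
  # means the network itself contracts to zero, so flag it loudly.
  iszero(n) && error("message tensor has norm zero; the network may contract to zero")
  return m / n
end
```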

(Inline review threads on examples/contract_bp.jl and src/inner.jl, resolved.)
@JoeyT1994
Contributor Author

@mtfishman let me know if you have any further comments or if you are happy to merge.

@mtfishman
Member

I'll take another look through the PR.

@mtfishman
Member

mtfishman commented Apr 12, 2024

@JoeyT1994 I merged #155 into main and then tried merging it into this branch and fixing conflicts, but it seems like it broke some BP tests... I had to fix some issues in the tests in #155 because of changes to the normalization of certain tensor network constructors, but I'm not sure if that is related.

mtfishman changed the title from "Inner Refactor" to "Inner refactor" on Apr 12, 2024.
@JoeyT1994
Contributor Author

Okay, no worries, let me pull that merge to my local machine and try to resolve the test failures.

@mtfishman
Member

Thanks. That change was supposed to be pretty benign but perhaps the behavior of some of the constructors changed accidentally.

@JoeyT1994
Contributor Author

Yeah, the issue is that `scalar(alg::Algorithm"bp", tn::AbstractITensorNetwork, ...)` now seems to not be returning the correct answer. Can't quite figure out why yet.

@JoeyT1994
Contributor Author

Okay, actually I don't think it's a BP issue. The `inner_network` constructor (which passes to `BiLinearFormNetwork` and inserts an identity operator in the middle) is doing something different than it was before, i.e.

```julia
ψ = random_tensornetwork(s; link_space=χ)
contract(inner_network(ψ, ψ)) == contract(ψ ⊗ prime(dag(ψ); sites=[]))
```

now gives `false`.

@JoeyT1994
Contributor Author

JoeyT1994 commented Apr 12, 2024

I see where the issue is: it's the `delta_network(union_all_inds(siteinds(ket), dual_site_index_map(siteinds(ket))))` constructor. It appears to be inserting matrices like

```
1 0
0 0
```

on the vertices instead of identity matrices.
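A minimal illustration of the symptom with plain ITensors (the indices here are hypothetical):

```julia
using ITensors

i = Index(2, "i")
j = Index(2, "j")
delta(i, j)                        # the intended identity matrix, [1 0; 0 1]
ITensor([1.0 0.0; 0.0 0.0], i, j)  # the projector that was effectively being inserted
```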

@mtfishman
Member

I think that may have been an incorrect usage of `delta_network`; see my messages on Slack.

mtfishman merged commit f813653 into ITensor:main on Apr 12, 2024 (6 of 11 checks passed).
mtfishman changed the title from "Inner refactor" to "Implement inner using BP" on Apr 12, 2024.
JoeyT1994 deleted the logcontract branch on June 16, 2024.