Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize transfer operator computation #423

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

rusandris
Copy link
Contributor

Initial illustrative PR (draft):

  • replace visitors transition counting with a simple loop over the symbolic trajectory in transferoperator
  • remove visitors from TransferOperatorApproximationRectangular
  • remove overwriting L, use N to store number of symbols (alphabet size)
  • lacks boundary conditions, stochasticity checking etc.

@rusandris
Copy link
Contributor Author

Latest changes:

  • add boundary conditions again, default is :none
  • remove sort_idxs from TransferOperatorApproximationRectangular
  • separate helper functions to utils.jl
  • add logistic map invariant measure test
  • rename some variables to hint towards more general usage, add comments

@rusandris rusandris marked this pull request as ready for review August 27, 2024 15:16
@Datseris
Copy link
Member

I am on holiday now, can review seriously at end of September. In the meantime would be nice if you can update this with main branch so that tests can run.

Copy link

codecov bot commented Sep 7, 2024

Codecov Report

Attention: Patch coverage is 72.85714% with 19 lines in your changes missing coverage. Please review.

Project coverage is 89.65%. Comparing base (9fcf9d7) to head (13a856f).
Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
src/outcome_spaces/transfer_operator/utils.jl 69.64% 17 Missing ⚠️
...come_spaces/transfer_operator/transfer_operator.jl 85.71% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #423      +/-   ##
==========================================
- Coverage   91.97%   89.65%   -2.33%     
==========================================
  Files          86       87       +1     
  Lines        2468     2475       +7     
==========================================
- Hits         2270     2219      -51     
- Misses        198      256      +58     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@Datseris Datseris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please post here the speed up versus the previous version?

visited_bins = map(pᵢ -> encode(encoding, pᵢ), pts)
sort_idxs = sortperm(visited_bins)
outcomes = map(pᵢ -> encode(encoding, pᵢ), pts)
#sort_idxs = sortperm(visited_bins)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't have any commented-out code here. Comments should only be to explain the source code, not comment-out runnable julia code. If there is a todo on github then that's enough, no need for comments as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think those todo comments were left by @kahaaga . It seemed like unfinished business so I left them there for the time being. Maybe he can enlighten us about them

@@ -206,7 +153,7 @@ rectangular partition given by the `binning`.
"""
function transferoperator(pts::AbstractStateSpaceSet{D, T},
binning::Union{FixedRectangularBinning, RectangularBinning};
boundary_condition = :circular,
boundary_condition = :none,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this breaking? We change the boundary conditions. Will it give different results?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't expect it to be breaking. My idea here was to avoid introducing an extra transition that's not present in the symbolic time series (hence changed the default from :circular to :none). But this definitely need to be discussed, and tests need to be added.
Plus, if the symbolic time series is long enough, the differences in results are expected to be negligible

# one point of the orbit visiting a bin.
target_bin_j::Int = 0
n_visitsᵢ::Int = 0
visits_whichbin,unique_outcomes = inds_in_terms_of_unique(outcomes, false) # set to true when sorting is fixed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this comment valid here given your rework of this PR? What sorting is it talking about?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same deal, I think those comments were left by @kahaaga . My guess is it's related to how we remap the original symbols to unique indices (so a symbolic time series like [12,34,15] would become [1,2,3] internally) and use those to index into the matrix.
Maybe we can discuss those at some later stage

# Transfer operator is just the normalized transition probabilities between the boxes.
TO = sparse(I, J, P)
#normalize Q (not strictly necessary) and fill P by normalizing rows of Q
Q .= Q./sum(Q)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Q .= Q./sum(Q)
Q ./= sum(Q)

@@ -4,6 +4,7 @@ DimensionalData = "0703355e-b756-11e9-17c0-8b28908087d0"
Distances = "b4f34e82-e78d-54a5-968a-f98e89d6e8f7"
Distributions = "31c24e10-a181-5473-b8eb-7969acd0382f"
DynamicalSystemsBase = "6e36e845-645a-534a-86f2-f5d4aa5a06b4"
Integrals = "de52edbc-65ea-441a-8357-d3a637375a31"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where do we use this dependency in the source code?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was a comment in the tranfer_operator.jl test file:

# There's not easy way of constructing an analytical example for the resulting
# probabilities, because they are computed according to the approximation to the
# transfer operator, not as naive visitation freqiencies to the bins. We just test
# generically here.
# TODO: make a stupidly simple example where we can actually compute the measure of
# each bin exactly.

So I've added a test for the logistic map (r=4) where the invariant measure is known analytically ρ(x) = 1/(π√x(1-x)). The test compares the probability of each bin using transferoperator and invariantmeasure with the probabilities obtained by integrating the analytical invariant measure over the same bins (10 bins were used).

Intregrals is used for the numerical integration. It doesn't appear anywhere in the source code, only in the unit test (notice that the dependency is only added to test/Project.toml ).

@rusandris
Copy link
Contributor Author

Can you please post here the speed up versus the previous version?

Using the same benchmarks as before:

using DynamicalSystems
using BenchmarkTools
using ComplexityMeasures

#Henon system orbit
henon_rule(x, p, n) = SVector{2}(1.0 - p[1]*x[1]^2 + x[2], p[2]*x[1])
henon = DeterministicIteratedMap(henon_rule, zeros(2), [1.4, 0.3])
orbit, t = trajectory(henon, 10^7; Ttr = 10^4)
grid_size = 10 #10x10 grid binning

@btime begin
binning = RectangularBinning(grid_size)
to = ComplexityMeasures.transferoperator(orbit, binning);
P_cm = to.transfermatrix
end

Results:

#--------previous version---------------
1.749 s (1540 allocations: 828.16 MiB)
#--------new version---------------
777.273 ms (238 allocations: 152.62 MiB)

@Datseris
Copy link
Member

@kahaaga this goes in, right?

@Datseris
Copy link
Member

you mentioned you used some internal code somewhere, which i guess you can obtain by checking out an explicit commit?

@rusandris
Copy link
Contributor Author

@kahaaga this goes in, right?

Any ideas on this PR? @kahaaga


export
TransferOperator, # the probabilities estimator
InvariantMeasure, invariantmeasure,
transfermatrix
transfermatrix, transferoperator
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a new name transferoperator is exported as public but is not added in the docs.

Copy link
Member

@Datseris Datseris left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please bump minor version and add a changelog entry discussing the changes, then I'll merge. (also put the exported function to the docs)

@kahaaga
Copy link
Member

kahaaga commented Nov 20, 2024

I don't expect it to be breaking. My idea here was to avoid introducing an extra transition that's not present in the symbolic time series (hence changed the default from :circular to :none). But this definitely need to be discussed, and tests need to be added. Plus, if the symbolic time series is long enough, the differences in results are expected to be negligible

The circular boundary condition is a remnant from v1 of the package, where we had a simplex-based volume outcome space specifically dedicated to estimation from very short time series. In this case, not handling boundary conditions would be potentially detrimental to the estimation. For this outcome space, we'd approximate the transfer operator by volume overlaps between simplices, also using piecewise one-time-step linear projection of the simplex vertices. There would be an edge case where the projection of a simplex could lead to the projected simplex having a vertex outside the convex hull of the other simplices in the outcome space's volume. That would lead to a transition matrix where row sums didn't sum to 1, which would not be meaningful. The circular boundary condition remedied this, at the expense of some (assumed) minor bias.

However, the outcome space hasn't yet made it back in the package since we redesigned it. Therefore, a boundary condition of :none would not be breaking, I think. We can leave this as is, and deal with it properly in the next PR that deals with generalizing the transfer operator computation to all the outcome spaces.

So I've added a test for the logistic map (r=4) where the invariant measure is known analytically ρ(x) = 1/(π√x(1-x)). The test compares the probability of each bin using transferoperator and invariantmeasure with the probabilities obtained by integrating the analytical invariant measure over the same bins (10 bins were used). Intregrals is used for the numerical integration. It doesn't appear anywhere in the source code, only in the unit test (notice that the dependency is only added to test/Project.toml ).

Excellent! Thanks for adding this analytical example. The dependency is necessary for the tests, so keeping it in the test dependencies is fine.

Same deal, I think those comments were left by @kahaaga . My guess is it's related to how we remap the original symbols to unique indices (so a symbolic time series like [12,34,15] would become [1,2,3] internally) and use those to index into the matrix.
Maybe we can discuss those at some later stage

We can deal with removing all the comments from the source code in the next PR which will probably introduce a more or less complete redesign of the interface for estimating the transfer operator. This PR is about improving speed of computation, not cleaning up the source code.

Using the same benchmarks as before:

The benchmarks are looking great. Thanks!

Otherwise, looks good to me. Before this can be merged, I'm just echoing @Datseris

  • please export the transferoperator function in the docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants