
Implement gradient computation w.r.t. initial conditions #290

Open
wants to merge 10 commits into base: develop

Conversation

jwallwork23 (Member)

Closes #284.

This PR sets up the ability to compute gradients of the QoI w.r.t. the initial conditions. This will be extended to a wider range of controls in #289.
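Roughly, the intended usage looks like the following (a sketch based on the interface discussed in the review below; the exact entry points may differ in the final API):

mesh_seq = AdjointMeshSeq(...)  # with get_initial_conditions() providing the controls
solutions = mesh_seq.solve_adjoint(compute_gradient=True)
dJ_dic = mesh_seq.gradient  # derivative of the QoI w.r.t. the initial conditions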

Unit tests are added that solve a simple R-space problem in which the solution field is multiplied by a scalar at each timestep. Gradients of steady-state, end-time, and time-integrated QoIs are computed and compared against expected values.
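For reference, here is a minimal standalone sketch of this kind of R-space test, written directly with Firedrake/pyadjoint rather than through Goalie's AdjointMeshSeq; the names factor and num_steps are illustrative and this is not the PR's actual test code:

from firedrake import *
from firedrake.adjoint import (
    continue_annotation,
    pause_annotation,
    Control,
    ReducedFunctional,
)

mesh = UnitIntervalMesh(1)
R = FunctionSpace(mesh, "R", 0)

continue_annotation()
u0 = Function(R).assign(2.0)  # initial condition: the control
u = Function(R).assign(u0)
factor, num_steps = 1.1, 5
for _ in range(num_steps):
    u = Function(R).assign(factor * u)  # "timestep": multiply the solution by a scalar
J = assemble(u * dx)  # end-time QoI
pause_annotation()

# dJ/du0 should equal factor**num_steps
Jhat = ReducedFunctional(J, Control(u0))
dJdu0 = Jhat.derivative()
print(dJdu0.dat.data)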

(Note a5d87aa is not part of this PR and will be added in #287, while b300b24 and bc2589b will be added in #288.)

@jwallwork23 jwallwork23 added the enhancement New feature or request label Feb 24, 2025
@jwallwork23 jwallwork23 self-assigned this Feb 24, 2025
@jwallwork23 jwallwork23 linked an issue Feb 28, 2025 that may be closed by this pull request
Comment on lines +530 to +540
        controls = pyadjoint.enlisting.Enlist(self._controls)
        with pyadjoint.stop_annotating():
            with tape.marked_nodes(m):
                with tape.marked_nodes(controls):
                    tape.evaluate_adj(markings=True)

        # Compute the gradient on the first subinterval
        if i == 0 and compute_gradient:
            self._gradient = controls.delist(
                [control.get_derivative() for control in controls]
            )

stephankramer (Collaborator) left a comment:

This looks really good. I only have some broader questions/thoughts about how we'd expand this, in particular interface-wise, to be used in a gradient-based optimisation loop. But yes, happy for it to go in as is.

  1. How do we actually see this being used in optimisation? So far we've not really dealt with the model being parameterised and needing to be rerun for different parameters. Is the idea to do something like the following?
class MeshSeqObjective(AdjointMeshSeq):
    _initial_conditions = None

    def get_initial_conditions(self):
        return AttrDict({'u': self._initial_conditions})

    def value(self, x):  # evaluate objective function at x
        self._initial_conditions.assign(x)
        # now what? how do I rerun the model and return the functional value?
        self.solve()
        return self.J

    def derivative(self, x):  # evaluate derivative at x
        self._initial_conditions.assign(x)
        self.solve_adjoint(compute_gradient=True)  # does this rerun the forward?
        return self.gradient

The reason I ask is that I get a little nervous about magic properties (like self.gradient here) that rely on having been computed previously, dependent on some changing internal state, before you access them. It might be worth resetting ._gradient back to None whenever we rerun the model without compute_gradient=True (see the sketch after this list).

  2. I think in your implementation the gradient for a Function control will be the L2 Riesz representative, which is a good default I guess. I would advocate being consistent (as I think you are) from the start in our use of the words "derivative" and "gradient": a derivative being an element of the dual of the tangent space of the control space, and thus a Cofunction if the control is a Function, and a gradient being a vector in the tangent space of the control space that is the Riesz representative of the derivative w.r.t. some choice of inner product. Note that this is not (yet) consistently done in pyadjoint. At the moment you are using control.get_derivative() which, without any options, by default returns the L2 Riesz representative (a Function). This is so that compute_gradient by default indeed returns a gradient. If you want the derivative (a Cofunction) you have to pass options={'Riesz_representative': None}. The reason I bring this up is that sometimes you need to provide both a gradient implementation and a (directional) derivative implementation, e.g. https://docs.trilinos.org/dev/packages/rol/doc/html/classROL_1_1Objective.html#details (a small sketch of the distinction follows this list).
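Regarding the first point, here is a minimal sketch of the suggestion (illustrative only, not Goalie's actual implementation) that invalidates the cached gradient whenever the model is rerun and fails loudly if a stale value would otherwise be returned:

class AdjointMeshSeq:
    def __init__(self, *args, **kwargs):
        self._gradient = None

    @property
    def gradient(self):
        if self._gradient is None:
            raise AttributeError(
                "Gradient not available: rerun solve_adjoint with compute_gradient=True."
            )
        return self._gradient

    def solve_adjoint(self, compute_gradient=False, **kwargs):
        # Reset ._gradient so a value from a previous run cannot leak out after
        # the model is rerun without compute_gradient=True.
        self._gradient = None
        ...  # forward/adjoint solves; populate self._gradient if requested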
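On the second point, a small Firedrake-only sketch of the derivative/gradient distinction (no Goalie or pyadjoint involved; the functional is made up, and Cofunction.riesz_representation is assumed to be available):

from firedrake import *

mesh = UnitSquareMesh(4, 4)
V = FunctionSpace(mesh, "CG", 1)
u = Function(V)
u.interpolate(SpatialCoordinate(mesh)[0])

J = u**2 * dx  # a functional of u
dJ = assemble(derivative(J, u))  # the derivative: a Cofunction in the dual space V*
g = dJ.riesz_representation("L2")  # the gradient: a Function in V (L2 Riesz representative)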
