Performance of computing partial derivative #77
Labels: enhancement (New feature or request), good first issue (Good for newcomers), question (Further information is requested)
Hi,
I am pretty new to neurodiffeq. Thank you very much for the excellent library.
I am interested in how the partial derivatives w.r.t. the inputs are computed, and in the computational cost of doing so.

Take the forward ODE solver (1D, one unknown variable) as an example. The input is `x`, a batch of coordinates, and the output of the neural network is `y`, the approximated solution of the equation at these coordinates. If we view the neural network as a smooth function that simulates the solution and call it `f`, the forward pass in training evaluates `y = f(x)`; for each element of the input, `x_i`, the network gives `y_i = f(x_i)`, where `i` runs from 0 to N-1 and N is the batch size. When constructing the loss function, one evaluates the residual of the equation, which usually requires evaluating `\frac{\partial y_i}{\partial x_i}` and higher-order derivatives.
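To make the setup concrete, here is a minimal sketch of what I mean; the toy network and shapes are my own and not from neurodiffeq, only the `autograd.grad` call mirrors the one discussed below:

```python
import torch
from torch import autograd

N = 1000                                   # batch size
x = torch.rand(N, 1, requires_grad=True)   # batch of coordinates, shape (N, 1)

# Stand-in for the solution network f (the real solver uses its own architecture).
f = torch.nn.Sequential(
    torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1)
)

y = f(x)  # approximated solution at each coordinate, shape (N, 1)

# First derivative used in the residual, obtained from autograd with
# grad_outputs set to a tensor of ones, as in the snippet linked below.
dy_dx, = autograd.grad(
    y, x, grad_outputs=torch.ones_like(y), create_graph=True, allow_unused=True
)
print(dy_dx.shape)  # torch.Size([1000, 1]): one derivative value per sample
```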
My question relates to the way `\frac{\partial y_i}{\partial x_i}` is evaluated. For example, `x` is an (N, 1) tensor and `y` is also an (N, 1) tensor, where N is the batch size. If you do `autograd.grad(y, t, create_graph=True, grad_outputs=ones, allow_unused=True)` as in the lines below:

https://github.com/odegym/neurodiffeq/blob/718f226d40cfbcb9ed50d72119bd3668b0c68733/neurodiffeq/neurodiffeq.py#L21-L22
my understanding is that it will evaluate a Jacobian matrix of size (N, N) with elements `\frac{\partial y_i}{\partial x_j}` (`i`, `j` from 0 to N-1), regardless of the fact that `y_i` only depends on `x_i`, so the computation (and storage) of the off-diagonal elements is useless and unnecessary. In other words, the computation could be done by evaluating N gradients, but the current method does N * N of them.
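To check this, I sketched a small comparison (the toy network, sizes, and tolerance below are my own, not from the library) between the batched `grad_outputs=ones` call and an explicit loop that computes the N per-sample gradients one at a time:

```python
import torch
from torch import autograd

N = 500
x = torch.rand(N, 1, requires_grad=True)
f = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
y = f(x)

# Batched call with grad_outputs=ones, as in the linked lines.
batched, = autograd.grad(y, x, grad_outputs=torch.ones_like(y),
                         create_graph=True, allow_unused=True)

# Explicit alternative: N separate backward passes, one scalar y_i at a time.
per_sample = torch.empty(N, 1)
for i in range(N):
    g, = autograd.grad(y[i, 0], x, retain_graph=True)
    per_sample[i] = g[i]  # for this per-sample network, only row i of g is non-zero

print(torch.allclose(batched, per_sample, atol=1e-6))  # expect True if both give the diagonal
```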
My question is:
Thanks!