📚 Gradient Checking

Gradient checking is a way to estimate the gradient of the weights thanks to the following finite-difference approximation:

$$ \delta w = \nabla_{w}Loss \approx \frac{Loss(w+h) - Loss(w-h)}{2h} $$

This estimate allows us to challenge the theoretical gradients computed by GrAIdient during the backward pass.
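As an illustration, here is a minimal sketch of the central-difference estimate in Swift. The `loss` closure and the flat `weights` vector are hypothetical stand-ins for evaluating the model's loss at a given set of weights; they are not part of GrAIdient's API.

```swift
// Minimal sketch of the central-difference estimate above.
// `loss` is a hypothetical closure evaluating the model's loss
// for a given weight vector.
func estimateGradient(
    loss: ([Double]) -> Double,
    weights: [Double],
    h: Double = 1e-5
) -> [Double] {
    var grad = [Double](repeating: 0.0, count: weights.count)
    for i in 0 ..< weights.count {
        var wPlus = weights
        var wMinus = weights
        wPlus[i] += h
        wMinus[i] -= h
        // delta w ≈ (Loss(w + h) - Loss(w - h)) / (2h)
        grad[i] = (loss(wPlus) - loss(wMinus)) / (2 * h)
    }
    return grad
}
```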

GrAIdient implements gradient checking tests for the different available layers, to ensure that each layer's backward API computes the gradient formula that matches its forward API. We want the following relative difference to be very small:

$$ \frac{\left\Vert \delta w^{approx} - \delta w^{GrAIdient} \right\Vert_{2}} {\left\Vert \delta w^{approx} \right\Vert_{2} + \left\Vert \delta w^{GrAIdient} \right\Vert_{2}} = \epsilon $$
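Here is a minimal sketch of that relative difference in Swift, assuming `approx` comes from the finite-difference estimate above and `theory` comes from the backward pass:

```swift
import Foundation

// Relative difference between the approximated and the theoretical
// gradients, following the formula above (Euclidean norms).
func gradientCheckError(approx: [Double], theory: [Double]) -> Double {
    // L2 norm of a vector.
    func norm(_ v: [Double]) -> Double {
        sqrt(v.reduce(0.0) { $0 + $1 * $1 })
    }
    let diff = zip(approx, theory).map { $0.0 - $0.1 }
    return norm(diff) / (norm(approx) + norm(theory))
}
```

As a common rule of thumb (the acceptable threshold depends on the step h and on the numerical precision used), values of epsilon around 1e-7 or below suggest that the two gradients agree.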

Note that one way to implement gradient checking would be to duplicate the model, apply a single weight modification per model copy, and run the forward pass on each and every one of these copies. The problem with this solution is that the whole estimation logic lives at the Model level.

The current solution prefers to delegate the weight modification to the Layer component, because the Layer is already the low-level component through which the signal flows forward and backward. Thus, forwardGC appears as a new way to make the signal flow.
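A hypothetical sketch of this idea follows; `SketchLayer`, `SketchDense` and this particular `forwardGC` signature are illustrative names, not GrAIdient's actual API. The point is that the perturbation of a single weight happens inside the layer itself while the signal flows forward:

```swift
// Sketch of the forwardGC idea: the perturbation of one weight is
// handled by the Layer itself during a dedicated forward pass.
protocol SketchLayer {
    // Standard forward pass.
    func forward(_ input: [Double]) -> [Double]
    // Gradient-checking forward pass: identical flow, except that the
    // weight at `index` is temporarily offset by `delta` (+h or -h).
    func forwardGC(_ input: [Double], index: Int, delta: Double) -> [Double]
}

final class SketchDense: SketchLayer {
    var weights: [Double]

    init(weights: [Double]) {
        self.weights = weights
    }

    func forward(_ input: [Double]) -> [Double] {
        var out = 0.0
        for (w, x) in zip(weights, input) {
            out += w * x
        }
        return [out]
    }

    func forwardGC(_ input: [Double], index: Int, delta: Double) -> [Double] {
        weights[index] += delta           // perturb one weight
        defer { weights[index] -= delta } // restore it after the pass
        return forward(input)
    }
}
```

With such a design, the model only needs to iterate over layers and weights, triggering the gradient-checking forward pass to obtain each perturbed loss, while the perturbation mechanics stay local to the Layer.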

Next Chapter

Previous chapter: Optimizer.
Next chapter: Plugin.