Problem of calculating gradient from embedding to word #9

Open
pl8787 opened this issue Aug 13, 2018 · 4 comments

pl8787 commented Aug 13, 2018

The model can only compute gradients down to the embedding layer. If the input of the model is word ids fed through an Embedding layer, IntegratedGradients raises an error.
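For example, a minimal setup like this triggers it (layer sizes and the import path here are illustrative, not the exact code I ran):

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense
from IntegratedGradients.IntegratedGradients import integrated_gradients

# Integer word ids go in; the Embedding layer turns them into vectors.
model = Sequential([
    Embedding(input_dim=10000, output_dim=128, input_length=50),
    LSTM(64),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

ig = integrated_gradients(model)  # raises the ValueError below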

ValueError                                Traceback (most recent call last)
<ipython-input-8-223e71b35bf0> in <module>()
----> 1 ig = integrated_gradients(model)
      2 exp = ig.explain([X_val[0][1], X_val[1][1], X_val[2][1], X_val[3][1]] )

./IntegratedGradients/IntegratedGradients.py in __init__(self, model, outchannels, verbose)
     74             # Get tensor that calculates gradient
     75             if K.backend() == "tensorflow":
---> 76                 gradients = self.model.optimizer.get_gradients(self.model.output[:, c], self.model.input)
     77             if K.backend() == "theano":
     78                 gradients = self.model.optimizer.get_gradients(self.model.output[:, c].sum(), self.model.input)

~/.local/lib/python3.6/site-packages/keras/optimizers.py in get_gradients(self, loss, params)
     78         grads = K.gradients(loss, params)
     79         if None in grads:
---> 80             raise ValueError('An operation has `None` for gradient. '
     81                              'Please make sure that all of your ops have a '
     82                              'gradient defined (i.e. are differentiable). '

ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

However, since I can get the gradient with respect to the embedding, how do I determine the attribution value of individual words, as in Section 6.3 (Question Classification) of the paper?

hiranumn (Owner) commented

Embedding layers by definition do not pass gradients through to their inputs. The layer is a dictionary lookup from a word index to an embedding of whatever dimension you choose, so there is no gradient flowing back to the indices themselves.

However, there are gradients flowing back to the embedded vectors, so you can sum the attributions over each word's vector to get the actual per-word attribution values. At the end of the day, an embedding layer is equivalent to a fully connected layer applied to one-hot encoded inputs.
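For example, a rough sketch of the gradient*input version of this in Keras (assuming the Embedding layer is model.layers[0] and c is the output channel you care about; the actual library additionally averages gradients over interpolated inputs):

from keras import backend as K

# Gradients do reach the Embedding layer's *output*, so attribute there,
# then collapse the embedding axis to get one score per word position.
emb = model.layers[0].output                # (batch, seq_len, embed_dim)
grads = K.gradients(model.output[:, c], emb)[0]
word_scores = K.sum(grads * emb, axis=-1)   # sum attribution over embed_dim
score_fn = K.function([model.input], [word_scores])
scores = score_fn([X_batch])[0]             # one attribution value per word

Summing over the last axis is exactly the "sum the attributions" step: it aggregates the contribution of every component of a word's embedding vector back onto that word.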

pl8787 (Author) commented Aug 22, 2018

Thanks. I have modified the code to set the start nodes of the network in https://github.com/pl8787/IntegratedGradients.

hiranumn (Owner) commented

Sounds good!

inspirepassion commented

I also encountered this gradient-is-None issue. After checking the code, I solved it by switching the default option for unconnected gradients from NONE to ZERO in this file:
/Users/username/opt/anaconda3/envs/ml/lib/python3.7/site-packages/tensorflow/python/ops/gradients_impl.py

There is this function signature:
[screenshot: the gradients() signature, including its unconnected_gradients parameter]

It lets you choose the default output for unconnected nodes' gradients. After I chose ZERO, the issue was solved.
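For reference, the same behavior is exposed through the public API (TensorFlow >= 1.13), so patching the installed file isn't necessary; a sketch, with loss and params mirroring the get_gradients arguments from the traceback above:

import tensorflow as tf

# Unconnected tensors (such as integer word ids behind an embedding
# lookup) get zeros instead of None, avoiding the ValueError above.
grads = tf.gradients(loss, params,
                     unconnected_gradients=tf.UnconnectedGradients.ZERO)

Note that a zero gradient at the word-id input only silences the error; the attribution assigned to those indices is zero, consistent with the explanation above.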
