Problem of calculating gradient from embedding to word #9

Open
pl8787 opened this issue Aug 13, 2018 · 4 comments

pl8787 commented Aug 13, 2018

The model can only compute gradients down to the embedding layer. If the input of the model is word ids fed through an Embedding layer, IntegratedGradients raises an error.
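For example, a minimal setup like this triggers it (layer sizes and the import path here are illustrative, not the exact code I ran):

from keras.models import Sequential
from keras.layers import Embedding, LSTM, Dense
from IntegratedGradients.IntegratedGradients import integrated_gradients

# Integer word ids go in; the Embedding layer turns them into vectors.
model = Sequential([
    Embedding(input_dim=10000, output_dim=128, input_length=50),
    LSTM(64),
    Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

ig = integrated_gradients(model)  # raises the ValueError below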

ValueError                                Traceback (most recent call last)
<ipython-input-8-223e71b35bf0> in <module>()
----> 1 ig = integrated_gradients(model)
      2 exp = ig.explain([X_val[0][1], X_val[1][1], X_val[2][1], X_val[3][1]] )

./IntegratedGradients/IntegratedGradients.py in __init__(self, model, outchannels, verbose)
     74             # Get tensor that calculates gradient
     75             if K.backend() == "tensorflow":
---> 76                 gradients = self.model.optimizer.get_gradients(self.model.output[:, c], self.model.input)
     77             if K.backend() == "theano":
     78                 gradients = self.model.optimizer.get_gradients(self.model.output[:, c].sum(), self.model.input)

~/.local/lib/python3.6/site-packages/keras/optimizers.py in get_gradients(self, loss, params)
     78         grads = K.gradients(loss, params)
     79         if None in grads:
---> 80             raise ValueError('An operation has `None` for gradient. '
     81                              'Please make sure that all of your ops have a '
     82                              'gradient defined (i.e. are differentiable). '

ValueError: An operation has `None` for gradient. Please make sure that all of your ops have a gradient defined (i.e. are differentiable). Common ops without gradient: K.argmax, K.round, K.eval.

However, since I can get the gradient with respect to the embedding, how do I determine the attribution value of individual words, as in Section 6.3 (Question Classification) of the paper?

hiranumn (Owner) commented

Embedding layers by definition do not pass gradients through to their inputs. The layer is a dictionary lookup from a word index to an embedding of whatever dimension you choose, so there is no gradient flowing back to the indices themselves.

However, there are gradients flowing back to the embedded vectors, so you can sum the attributions over each word's vector to get the actual per-word attribution values. At the end of the day, an embedding layer is equivalent to a fully connected layer applied to one-hot encoded inputs.
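For example, a rough sketch of the gradient*input version of this in Keras (assuming the Embedding layer is model.layers[0] and c is the output channel you care about; the actual library additionally averages gradients over interpolated inputs):

from keras import backend as K

# Gradients do reach the Embedding layer's *output*, so attribute there,
# then collapse the embedding axis to get one score per word position.
emb = model.layers[0].output                # (batch, seq_len, embed_dim)
grads = K.gradients(model.output[:, c], emb)[0]
word_scores = K.sum(grads * emb, axis=-1)   # sum attribution over embed_dim
score_fn = K.function([model.input], [word_scores])
scores = score_fn([X_batch])[0]             # one attribution value per word

Summing over the last axis is exactly the "sum the attributions" step: it aggregates the contribution of every component of a word's embedding vector back onto that word.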

pl8787 (Author) commented Aug 22, 2018

Thanks. I have modified the code to set the start nodes of the network in https://github.com/pl8787/IntegratedGradients.

hiranumn (Owner) commented

Sounds good!

inspirepassion commented

I also encountered this gradient-is-None issue. After checking the code, I solved it by switching the default option for unconnected gradients from NONE to ZERO in this file:
/Users/username/opt/anaconda3/envs/ml/lib/python3.7/site-packages/tensorflow/python/ops/gradients_impl.py

There is this function signature:
[screenshot: the gradients() signature, including its unconnected_gradients parameter]

It lets you choose the default output for unconnected nodes' gradients. After I chose ZERO, the issue was solved.
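For reference, the same behavior is exposed through the public API (TensorFlow >= 1.13), so patching the installed file isn't necessary; a sketch, with loss and params mirroring the get_gradients arguments from the traceback above:

import tensorflow as tf

# Unconnected tensors (such as integer word ids behind an embedding
# lookup) get zeros instead of None, avoiding the ValueError above.
grads = tf.gradients(loss, params,
                     unconnected_gradients=tf.UnconnectedGradients.ZERO)

Note that a zero gradient at the word-id input only silences the error; the attribution assigned to those indices is zero, consistent with the explanation above.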
