Add Grad-CAM++
1202kbs committed Jun 20, 2018
1 parent ee3f3c9 commit 2a86b83
Showing 5 changed files with 442 additions and 1 deletion.
380 changes: 380 additions & 0 deletions 4.3 Grad-CAM++.ipynb


15 changes: 14 additions & 1 deletion README.md
@@ -40,6 +40,8 @@ It seems that Github is unable to render some of the equations in the notebooks.

[4.2 Grad-CAM](http://nbviewer.jupyter.org/github/1202kbs/Understanding-NN/blob/master/4.2%20Grad-CAM.ipynb)

[4.3 Grad-CAM++](http://nbviewer.jupyter.org/github/1202kbs/Understanding-NN/blob/master/4.3%20Grad-CAM++.ipynb)

[5.1 Explanation Continuity](http://nbviewer.jupyter.org/github/1202kbs/Understanding-NN/blob/master/5.1%20Explanation%20Continuity.ipynb)

[5.2 Explanation Selectivity](http://nbviewer.jupyter.org/github/1202kbs/Understanding-NN/blob/master/5.2%20Explanation%20Selectivity.ipynb)
@@ -144,7 +146,7 @@ Implementation of various types of gradient-based visualization methods such as

## 4 Class Activation Map

Implementation of Class Activation Map (CAM) and its generalized version, Grad-CAM on the [cluttered MNIST](https://github.com/deepmind/mnist-cluttered) dataset.
Implementation of Class Activation Map (CAM) and its generalized versions, Grad-CAM and Grad-CAM++, on the [cluttered MNIST](https://github.com/deepmind/mnist-cluttered) dataset.
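The difference between these methods is essentially the channel-weighting scheme applied to the last convolutional layer. As a minimal NumPy illustration of plain Grad-CAM (a sketch only — the array shapes and function name are assumptions, not the notebook's TensorFlow code):

```python
import numpy as np

def grad_cam(activations, gradients):
    """Grad-CAM sketch: `activations` is the last conv layer's output
    (H, W, K) and `gradients` is d(class score)/d(activations), same
    shape. Channel weights are the global-average-pooled gradients."""
    weights = gradients.mean(axis=(0, 1))                      # (K,)
    cam = np.tensordot(activations, weights, axes=([2], [0]))  # (H, W)
    cam = np.maximum(cam, 0.0)                                 # keep positive evidence only
    return cam / (cam.max() + 1e-8)                            # normalize to [0, 1]
```

The resulting low-resolution map is then upsampled to the input size and overlaid on the image.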


### 4.1 Class Activation Map
@@ -161,6 +163,13 @@ Implementation of Class Activation Map (CAM) and its generalized version, Grad-C
![alt tag](https://github.com/1202kbs/Understanding-NN/blob/master/assets/4_2_GCAM/DNN_2.png)


### 4.3 Grad-CAM++

![alt tag](https://github.com/1202kbs/Understanding-NN/blob/master/assets/4_3_GCAMPP/DNN_1.png)

![alt tag](https://github.com/1202kbs/Understanding-NN/blob/master/assets/4_3_GCAMPP/DNN_2.png)
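Grad-CAM++ replaces Grad-CAM's uniform gradient average with per-location weights α, so that objects occupying only a few pixels are not washed out. Using the paper's closed form for an exponential class score (where higher-order derivatives reduce to powers of the gradient), a hedged NumPy sketch (shapes and names are illustrative, not the notebook's code):

```python
import numpy as np

def grad_cam_pp(activations, gradients):
    """Grad-CAM++ sketch: `activations` is the last conv output (H, W, K),
    `gradients` is d(class score)/d(activations). With Y = exp(score),
    alpha_ij^k = g^2 / (2 g^2 + sum_ab A_ab^k * g^3), per the paper."""
    grads_2 = gradients ** 2
    grads_3 = gradients ** 3
    denom = 2.0 * grads_2 + np.sum(activations, axis=(0, 1), keepdims=True) * grads_3
    alphas = grads_2 / np.where(denom != 0.0, denom, 1e-8)   # guard zero denominators
    # channel weights: alpha-weighted sum of the positive gradients
    weights = np.sum(alphas * np.maximum(gradients, 0.0), axis=(0, 1))  # (K,)
    cam = np.maximum(np.tensordot(activations, weights, axes=([2], [0])), 0.0)
    return cam / (cam.max() + 1e-8)                          # normalize to [0, 1]
```

When the gradients are spatially uniform, the α terms reduce to a constant and this collapses back to Grad-CAM up to a scale factor.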


## 5 Quantifying Explanation Quality

While each explanation technique is based on its own intuition or mathematical principle, it is also important to define at a more abstract level what the characteristics of a good explanation are, and to be able to test for these characteristics quantitatively. We present in Sections 5.1 and 5.2 two important properties of an explanation, along with possible evaluation metrics.
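A common way to test selectivity quantitatively is perturbation analysis: delete input features in order of decreasing relevance and watch how fast the class score falls. The sketch below is an illustrative assumption about that procedure, not the notebooks' exact code (`scores_fn`, the step count, and zero-filling are all hypothetical choices):

```python
import numpy as np

def selectivity_curve(scores_fn, x, relevance, steps=10):
    """Pixel-flipping sketch: zero out input features in order of
    decreasing relevance and record the class score after each step.
    The faster the score drops, the more selective the explanation.
    `scores_fn` is a hypothetical callable: image -> target-class score."""
    order = np.argsort(relevance.ravel())[::-1]   # most relevant first
    x_flat = x.ravel().astype(float).copy()
    curve = [scores_fn(x_flat.reshape(x.shape))]
    chunk = max(1, order.size // steps)           # features removed per step
    for i in range(0, order.size, chunk):
        x_flat[order[i:i + chunk]] = 0.0          # "flip" a batch of pixels
        curve.append(scores_fn(x_flat.reshape(x.shape)))
    return np.array(curve)
```

Averaging the area under such curves over many images gives a single score for comparing explanation methods.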
@@ -240,3 +249,7 @@ This tutorial requires [Tensorflow](https://www.tensorflow.org/), [NumPy](http:/
#### Section 4.2

[13] R. R. Selvaraju, A. Das, R. Vedantam, M. Cogswell, D. Parikh, and D. Batra. Grad-CAM: Why did you say that? Visual explanations from deep networks via gradient-based localization. arXiv:1611.01646, 2016.

#### Section 4.3

[14] A. Chattopadhyay, A. Sarkar, P. Howlader, and V. N. Balasubramanian. Grad-CAM++: Generalized gradient-based visual explanations for deep convolutional networks. arXiv:1710.11063, 2017.
Binary file added assets/4_3_GCAMPP/DNN_1.png
Binary file added assets/4_3_GCAMPP/DNN_2.png
48 changes: 48 additions & 0 deletions models/models_4_3.py
@@ -0,0 +1,48 @@
from tensorflow.python.ops import nn_ops, gen_nn_ops
import tensorflow as tf


class MNIST_CNN:

    def __init__(self, name):
        self.name = name

    def __call__(self, X, reuse=False):

        with tf.variable_scope(self.name) as scope:

            if reuse:
                scope.reuse_variables()

            # Reshape the flat input into a 40 x 40 grayscale image
            with tf.variable_scope('layer0'):
                X_img = tf.reshape(X, [-1, 40, 40, 1])

            # Convolutional Layer #1 and Pooling Layer #1: 40 x 40 -> 20 x 20
            with tf.variable_scope('layer1'):
                conv1 = tf.layers.conv2d(inputs=X_img, filters=32, kernel_size=[3, 3], padding="SAME", activation=tf.nn.relu, use_bias=False)
                pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], padding="SAME", strides=2)

            # Convolutional Layer #2 and Pooling Layer #2: 20 x 20 -> 10 x 10
            with tf.variable_scope('layer2'):
                conv2 = tf.layers.conv2d(inputs=pool1, filters=64, kernel_size=[3, 3], padding="SAME", activation=tf.nn.relu, use_bias=False)
                pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], padding="SAME", strides=2)

            # Convolutional Layer #3 (no pooling; output stays 10 x 10 x 128)
            with tf.variable_scope('layer3'):
                conv3 = tf.layers.conv2d(inputs=pool2, filters=128, kernel_size=[3, 3], padding="SAME", activation=tf.nn.relu, use_bias=False)

            # Dense Layer with ReLU
            with tf.variable_scope('layer4'):
                flat = tf.reshape(conv3, [-1, 128 * 10 * 10])
                dense4 = tf.layers.dense(inputs=flat, units=625, activation=tf.nn.relu, use_bias=False)

            # Logits (no activation) Layer: final FC, 625 inputs -> 10 outputs
            with tf.variable_scope('layer5'):
                logits = tf.layers.dense(inputs=dense4, units=10, activation=None, use_bias=False)
                prediction = tf.nn.relu(logits)  # rectified logits, returned with the other activations

        return [X_img, conv1, pool1, conv2, pool2, conv3, flat, dense4, prediction], logits

    @property
    def vars(self):
        return tf.get_collection(tf.GraphKeys.TRAINABLE_VARIABLES, scope=self.name)
