
Design of a deep neural network model to predict the class of an image and the corresponding bounding box coordinates with Keras.


AlessandroGhiotto/CNN-image-classification


CNN image classification

We are given a dataset containing $186$ images which are categorized into three classes, depending on the main object they contain. The following inputs are available:

  1. $\textit{input image}$, a $227 \times 227 \times 3$ real-valued tensor. The last dimension denotes the number of input channels $(RGB)$. Each pixel is a value in $[0, 255]$;
  2. $\textit{image label}$, an integer in the set $\{1, 2, 3\}$ denoting one of three possible objects an image can contain. There are $63$, $62$, and $61$ images belonging to classes $1$, $2$, and $3$, respectively;
  3. $\textit{image bounding box}$, four integers $x_1$, $y_1$, $x_2$, $y_2$ in the range $[1, 227]$, where $(x_1, y_1)$ and $(x_2, y_2)$ are the bottom-left and the top-right corners of the box containing the object, respectively.

Design a deep neural network model (with Keras) to predict the class of an image along with the corresponding bounding box coordinates.
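The inputs listed above need some preparation before they can be fed to a network with softmax and sigmoid outputs. The following is a minimal sketch of one possible preprocessing step, not necessarily the one used in this repository. The function names `preprocess` and `bbox_to_pixels` are hypothetical, and the linear rescaling of the box coordinates from $[1, 227]$ to $[0, 1]$ is an assumption.

```python
import numpy as np

IMG_SIZE = 227   # image side length, from the problem statement
N_CLASSES = 3    # number of classes, from the problem statement


def preprocess(images, labels, boxes):
    """Scale raw inputs to network-friendly ranges (hypothetical helper).

    images: (N, 227, 227, 3) array with values in [0, 255]
    labels: (N,) integers in {1, 2, 3}
    boxes:  (N, 4) integers (x1, y1, x2, y2) in [1, 227]
    """
    # Pixels: [0, 255] -> [0, 1]
    x = np.asarray(images, dtype=np.float32) / 255.0
    # Labels: shift {1, 2, 3} to {0, 1, 2}, then one-hot encode
    y_label = np.eye(N_CLASSES, dtype=np.float32)[np.asarray(labels) - 1]
    # Boxes: [1, 227] -> [0, 1] (assumed linear rescaling)
    y_bbox = (np.asarray(boxes, dtype=np.float32) - 1.0) / (IMG_SIZE - 1)
    return x, y_label, y_bbox


def bbox_to_pixels(y_bbox):
    """Invert the box rescaling: [0, 1] -> pixel coordinates in [1, 227]."""
    return np.asarray(y_bbox, dtype=np.float32) * (IMG_SIZE - 1) + 1.0
```

With this scaling, a box covering the whole image, `(1, 1, 227, 227)`, maps to `(0, 0, 1, 1)`, and `bbox_to_pixels` maps a predicted box back to pixel coordinates for plotting.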

Description

Data

Sample images from each of the three classes in the data:

[Figure: sample images from the three classes]

Model

The model consists of a convolutional backbone built from inception blocks, followed by two output heads: one for classification and one for bounding-box regression. The two heads have the following output layers:

  • 'output_label' : Dense(n_classes, activation='softmax') $\rightarrow$ predicts a probability distribution over the $3$ classes
  • 'output_bbox' : Dense(4, activation='sigmoid') $\rightarrow$ predicts a value in $[0, 1]$ for each of the four bounding-box corner coordinates
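The architecture described above can be sketched with the Keras functional API. This is an illustrative sketch only: the two output layers match the ones listed, but the stem, the number of inception blocks, the filter counts, the branch layout (GoogLeNet-style), the optimizer, and the losses are all assumptions rather than the repository's actual configuration. `inception_block` and `build_model` are hypothetical names.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers


def inception_block(x, filters):
    """Four parallel branches concatenated along the channel axis (assumed layout)."""
    b1 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)
    b2 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)
    b2 = layers.Conv2D(filters, 3, padding="same", activation="relu")(b2)
    b3 = layers.Conv2D(filters, 1, padding="same", activation="relu")(x)
    b3 = layers.Conv2D(filters, 5, padding="same", activation="relu")(b3)
    b4 = layers.MaxPooling2D(3, strides=1, padding="same")(x)
    b4 = layers.Conv2D(filters, 1, padding="same", activation="relu")(b4)
    return layers.Concatenate()([b1, b2, b3, b4])


def build_model(n_classes=3, input_shape=(227, 227, 3)):
    inputs = keras.Input(shape=input_shape)
    # Stem: reduce spatial resolution early (sizes are assumptions)
    x = layers.Conv2D(16, 7, strides=2, padding="same", activation="relu")(inputs)
    x = layers.MaxPooling2D(3, strides=2, padding="same")(x)
    # Convolutional part made of inception blocks
    for filters in (16, 32):
        x = inception_block(x, filters)
        x = layers.MaxPooling2D(3, strides=2, padding="same")(x)
    x = layers.GlobalAveragePooling2D()(x)
    # Two heads sharing the same features
    output_label = layers.Dense(n_classes, activation="softmax", name="output_label")(x)
    output_bbox = layers.Dense(4, activation="sigmoid", name="output_bbox")(x)
    return keras.Model(inputs, [output_label, output_bbox])


model = build_model()
# One loss per head, keyed by output layer name (loss choices are assumptions)
model.compile(
    optimizer="adam",
    loss={"output_label": "categorical_crossentropy", "output_bbox": "mse"},
)

# Smoke test on a blank image: one class distribution and one box per input
probs, boxes = model.predict(np.zeros((1, 227, 227, 3), dtype="float32"), verbose=0)
print(probs.shape, boxes.shape)
```

Because the bounding-box head ends in a sigmoid, its targets must also lie in $[0, 1]$, so the pixel coordinates have to be rescaled before training and mapped back afterwards.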


Result

Here we can visualise some results. The dashed red box is the one predicted by the model; the green box is the true bounding box.

[Figure: sample images with the predicted (dashed red) and true (green) bounding boxes]
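A plot in this style can be produced with matplotlib. The sketch below is one possible way to do it and is not taken from the repository; `draw_boxes` is a hypothetical helper, and the boxes are assumed to be given in pixel coordinates as `(x1, y1, x2, y2)`.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Rectangle


def draw_boxes(ax, image, true_box, pred_box):
    """Show an image with the true box (solid green) and predicted box (dashed red)."""
    ax.imshow(image)
    for (x1, y1, x2, y2), color, style, label in [
        (true_box, "green", "-", "true"),
        (pred_box, "red", "--", "predicted"),
    ]:
        # Use min/abs so the rectangle is the same whichever corner is listed first
        ax.add_patch(
            Rectangle(
                (min(x1, x2), min(y1, y2)),
                abs(x2 - x1),
                abs(y2 - y1),
                fill=False,
                edgecolor=color,
                linestyle=style,
                linewidth=2,
                label=label,
            )
        )
    ax.legend(loc="upper right")
    ax.axis("off")
    return ax


# Usage with a blank placeholder image and made-up boxes
fig, ax = plt.subplots()
blank = np.zeros((227, 227, 3), dtype=np.uint8)
draw_boxes(ax, blank, true_box=(40, 50, 180, 200), pred_box=(45, 60, 170, 190))
fig.savefig("example_boxes.png")
```

If the model outputs box coordinates in $[0, 1]$, they need to be mapped back to pixel coordinates before being passed to a helper like this one.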
