CVND_Udacity

Projects from the Udacity Computer Vision Nanodegree

Project 1: Facial Keypoints Detection

This project builds a CNN model to detect facial keypoints in an image. Facial keypoints are points of interest on a human face, such as the corners of the eyes and mouth.

The detection of facial keypoints allows building facial image manipulation applications.

Fig 1: Predicted facial Keypoints

Fig 2: Sample image manipulation using facial keypoints
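As a rough illustration of how detected keypoints can drive an image manipulation, the sketch below computes where an overlay (for example, a pair of sunglasses) could be placed from two eye-corner keypoints. The function name `overlay_box`, its parameters, and the scale factors are hypothetical and chosen for illustration only; they are not taken from this project's code.

```python
import math


def overlay_box(left_eye, right_eye, width_scale=2.0, aspect=0.4):
    """Return (center_x, center_y, width, height, angle_deg) for an overlay.

    left_eye and right_eye are (x, y) pixel coordinates of the outer eye
    corners, as a keypoint detector might predict them (assumed inputs).
    width_scale and aspect are illustrative tuning values.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    eye_distance = math.hypot(dx, dy)

    width = width_scale * eye_distance   # overlay spans wider than the eyes
    height = aspect * width              # fixed aspect ratio for the overlay
    center_x = (left_eye[0] + right_eye[0]) / 2
    center_y = (left_eye[1] + right_eye[1]) / 2
    angle_deg = math.degrees(math.atan2(dy, dx))  # in-plane head tilt
    return center_x, center_y, width, height, angle_deg


# Made-up keypoints for a level face
box = overlay_box((100, 120), (160, 120))
print(box)
```

The returned box can then be used to resize, rotate, and paste an overlay image onto the face with any image library.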

Project 2: Image Captioning

The goal of this project is to develop a deep learning model that generates captions for images. It uses a CNN-RNN architecture following the paper Show and Tell.

Fig 3: CNN-RNN model
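To make the caption generation step concrete, here is a minimal sketch of greedy decoding, one common way to turn a decoder's word scores into a sentence. It assumes a function that returns a score for each candidate next word given the words so far; in a CNN-RNN model that function would wrap the RNN conditioned on the CNN image features. The names `greedy_caption`, `toy_decoder`, and `TOY_SCORES`, and the toy scores themselves, are hypothetical stand-ins, and the project may use a different decoding strategy.

```python
def greedy_caption(next_word_scores, start="<start>", end="<end>", max_len=20):
    """Build a caption by repeatedly picking the highest-scoring next word."""
    words = [start]
    for _ in range(max_len):
        scores = next_word_scores(words)      # dict: candidate word -> score
        best = max(scores, key=scores.get)
        if best == end:                       # stop at the end-of-caption token
            break
        words.append(best)
    return " ".join(words[1:])                # drop the start token


# Toy stand-in for a trained decoder: scores depend only on the last word.
TOY_SCORES = {
    "<start>": {"a": 0.9, "dog": 0.1},
    "a": {"dog": 0.7, "cat": 0.3},
    "dog": {"runs": 0.6, "<end>": 0.4},
    "runs": {"<end>": 0.8, "a": 0.2},
}


def toy_decoder(words):
    return TOY_SCORES[words[-1]]


caption = greedy_caption(toy_decoder)
print(caption)
```

Greedy decoding is simple but can miss better captions; beam search, which keeps several candidate sentences at each step, is a common alternative.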

Image captioning can be used to provide verbal descriptions to partially or completely visually impaired people through a headset. It can also be used to build a query-based image search engine without the need for manually annotated images.

Some sample captions generated by the trained model are shown below.

Fig 4: Generated Image Captions

Project 3: Landmark Detection and Tracking (SLAM)

The goal of this project is to detect and track landmarks using simultaneous localization and mapping (SLAM) in a 2D world. For this, I have implemented Graph SLAM.

Fig 5: Final location of the robot found using SLAM

Using the robot's motion and sensor measurements, SLAM estimates the positions of the robot and of the landmarks in the world. The robot is localized and a map of the environment is built at the same time.
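The core idea of Graph SLAM can be sketched in one dimension: every motion and every landmark measurement adds a constraint to an information matrix `omega` and vector `xi`, and solving `omega * mu = xi` gives the best estimate of all poses and landmarks. The sketch below is a simplified 1D illustration with made-up measurements and equal weights, not the project's 2D implementation; the helper names `add_constraint` and `solve` are hypothetical.

```python
def add_constraint(omega, xi, i, j, delta, weight=1.0):
    """Encode the relation x_j - x_i = delta into omega and xi."""
    omega[i][i] += weight
    omega[j][j] += weight
    omega[i][j] -= weight
    omega[j][i] -= weight
    xi[i] -= weight * delta
    xi[j] += weight * delta


def solve(a, b):
    """Solve a * x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[k][n] / m[k][k] for k in range(n)]


# State vector: [x0, x1, L] = two robot poses and one landmark.
n = 3
omega = [[0.0] * n for _ in range(n)]
xi = [0.0] * n

omega[0][0] += 1.0                     # anchor the initial pose at x0 = 0
add_constraint(omega, xi, 0, 1, 5.0)   # motion: robot moved +5
add_constraint(omega, xi, 0, 2, 9.0)   # measurement: landmark is +9 from x0
add_constraint(omega, xi, 1, 2, 4.0)   # measurement: landmark is +4 from x1

mu = solve(omega, xi)                  # estimates for x0, x1 and L
print(mu)
```

With these consistent measurements the estimates come out close to 0, 5, and 9. With noisy measurements the same solve returns the least-squares compromise between all constraints. A 2D world can be handled by keeping separate x and y entries for each pose and landmark.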

Extra Curricular Project: Code Optimization

The goal of this project is to optimize the C++ code of a 2D histogram filter. The optimizations reduce both the execution time and the memory footprint of the program, making it feasible to run the code on an embedded device or in real-time scenarios.

Execution time is measured in milliseconds by running every function for 10,000 iterations. The best total execution time, 16.877 ms, is achieved by the optimized code compiled with the GCC -O3 flag, down from 358.923 ms for the original code.
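The measurement approach can be sketched as a small timing harness. The project's code is C++; this Python version only illustrates the idea of timing repeated calls. The helper `time_ms`, the `normalize` stand-in, and the sample grid are hypothetical, and reporting the total over all iterations (rather than a per-call average) is an assumption.

```python
import time


def normalize(grid):
    """Scale a 2D grid of beliefs so that all cells sum to 1."""
    total = sum(sum(row) for row in grid)
    return [[cell / total for cell in row] for row in grid]


def time_ms(func, *args, iterations=10000):
    """Return the total wall-clock time, in milliseconds, of repeated calls."""
    start = time.perf_counter()
    for _ in range(iterations):
        func(*args)
    return (time.perf_counter() - start) * 1000.0


grid = [[1.0, 2.0], [3.0, 4.0]]        # made-up, unnormalized beliefs
elapsed = time_ms(normalize, grid)
print(f"normalize: {elapsed:.3f} ms for 10000 iterations")
```

Repeating each call many times averages out timer resolution and scheduling noise, which matters when a single call takes only microseconds.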

File Name	Original code (ms)	Optimized code (ms)	Optimized code with GCC -O3 flag (ms)
Initialize Beliefs	43.42	13.518	1.802
Sense	56.057	14.967	3.444
Blur	151.49	67.38	7.748
Normalize	56.39	13.157	1.573
Move	51.566	16.536	2.31
Total	358.923	125.558	16.877
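The speedups implied by the table can be worked out directly. The short script below copies the timings from the table, recomputes the totals, and prints the speedup of each optimized variant relative to the original code; the variable and function names are illustrative.

```python
# Execution times in milliseconds, copied from the table above:
# (original, optimized, optimized with the GCC -O3 flag)
times_ms = {
    "Initialize Beliefs": (43.42, 13.518, 1.802),
    "Sense": (56.057, 14.967, 3.444),
    "Blur": (151.49, 67.38, 7.748),
    "Normalize": (56.39, 13.157, 1.573),
    "Move": (51.566, 16.536, 2.31),
}


def speedups(original, optimized, optimized_o3):
    """Return how many times faster each optimized variant is."""
    return original / optimized, original / optimized_o3


# Column sums should match the Total row of the table.
totals = tuple(sum(row[k] for row in times_ms.values()) for k in range(3))

for name, row in list(times_ms.items()) + [("Total", totals)]:
    s_opt, s_o3 = speedups(*row)
    print(f"{name:20s} {s_opt:6.2f}x {s_o3:6.2f}x")
```

Overall, the source-level optimizations alone make the code a little under 3 times faster, and adding the -O3 flag brings the total speedup to roughly 21 times.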

Acknowledgement

Udacity Computer Vision Nanodegree

Author

Abhishek Tandon
