This work has been accepted at IEEE INDICON 2021! It was carried out under the supervision of Professor Amlan Chakrabarti, A.K. Choudhury School of Information Technology (AKCSIT), University of Calcutta.
The paper is available here and the slides can be viewed here.
A. Clone the repository first.
git clone https://github.com/sarosijbose/A-Fusion-architecture-for-Human-Activity-Recognition.git
B. It is then recommended to create a fresh virtual environment.
python -m venv env
source env/bin/activate
Then install the required dependencies.
pip install -r requirements.txt
C. Directory structure overview
The codebase is organised into the following folders.
- 3D CNN
This folder contains all the code necessary for running the Spatial 3D CNN Stream. First, convert the sample UCF-101 videos given in the sample videos folder into the required pre-processed format:
python pre_processing.py
This will convert the videos into the required .npy format. Next, feed them one-by-one into the evaluation code to obtain the results:
python eval3dcnn.py
Make sure that the i3d.py file is present in the same directory and change the checkpoint path accordingly.
The Kinetics-RGB-600 checkpoint is available here.
The top-5 predictions will be printed in decreasing order of their confidence scores.
Here is a sample output for the video v_BrushingTeeth_g17_c02.npy
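To give a concrete idea of what the pre-processing step involves, below is a minimal sketch of a video-to-.npy conversion using OpenCV and NumPy. The frame size, pixel scaling, and output layout are assumptions for illustration; the actual pre_processing.py in this repository may use different settings.

```python
import cv2
import numpy as np

def video_to_npy(video_path, out_path, size=(224, 224)):
    """Read a video, resize every frame, and save the frame stack as a .npy file.

    Illustrative only: the repository's pre_processing.py may use a different
    resolution, frame sampling rate, or normalisation scheme.
    """
    cap = cv2.VideoCapture(video_path)
    frames = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
        frame = cv2.resize(frame, size)
        # Scale pixels to [-1, 1], a common convention for I3D-style inputs.
        frames.append((frame / 255.0) * 2.0 - 1.0)
    cap.release()
    clip = np.stack(frames).astype(np.float32)  # shape: (num_frames, H, W, 3)
    np.save(out_path, clip)
    return clip.shape

if __name__ == "__main__":
    # Hypothetical file names shown for illustration.
    print(video_to_npy("v_BrushingTeeth_g17_c02.avi", "v_BrushingTeeth_g17_c02.npy"))
```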
- 2D CNN
This folder contains the code for running the Spatial 2D CNN Stream.
Feed the frames directly into the stream by running:
python eval2dcnn.py
Here is a sample output:
If you want to fine-tune on UCF-101, download the entire pre-processed RGB data made available by Feichtenhofer here.
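As a rough illustration of the spatial 2D stream, the sketch below pushes individual RGB frames through an ImageNet-pretrained ResNet-50 from torchvision and averages the per-frame softmax scores. This is only an approximation of eval2dcnn.py: the actual script may use a different backbone, fine-tuned weights, and its own frame-loading pipeline.

```python
import numpy as np
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Illustrative backbone only; on older torchvision versions use pretrained=True instead.
model = models.resnet50(weights="IMAGENET1K_V1").eval()

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def frame_scores(frame_paths):
    """Return the softmax score vector averaged over a list of frame images."""
    scores = []
    with torch.no_grad():
        for path in frame_paths:
            img = preprocess(Image.open(path).convert("RGB")).unsqueeze(0)
            scores.append(torch.softmax(model(img), dim=1).squeeze(0).numpy())
    return np.mean(scores, axis=0)
```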
- Average the softmax scores of both streams:
python average_fusion.py
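A minimal sketch of this late-fusion step is shown below: it averages the per-class softmax vectors from the two streams and reports the top-5 classes. The file names and equal stream weighting are assumptions; average_fusion.py may weight the streams differently or read its inputs in another format.

```python
import numpy as np

def average_fusion(scores_3d, scores_2d, top_k=5):
    """Average two softmax score vectors and return the top-k (class, score) pairs."""
    assert scores_3d.shape == scores_2d.shape, "both streams must score the same classes"
    fused = (scores_3d + scores_2d) / 2.0
    top = np.argsort(fused)[::-1][:top_k]
    return [(int(i), float(fused[i])) for i in top]

if __name__ == "__main__":
    # Hypothetical output files from the two streams.
    s3d = np.load("scores_3dcnn.npy")
    s2d = np.load("scores_2dcnn.npy")
    for cls, score in average_fusion(s3d, s2d):
        print(f"class {cls}: {score:.4f}")
```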
- Data
This folder contains all the utilities and samples required for evaluation.
Please consider citing this work if you find it useful:
@INPROCEEDINGS{9691648,
author={Bose, Sarosij and Chakrabarti, Amlan},
booktitle={2021 IEEE 18th India Council International Conference (INDICON)},
title={A Fusion Architecture Model for Human Activity Recognition},
year={2021},
volume={},
number={},
pages={1-6},
doi={10.1109/INDICON52576.2021.9691648}}
Parts of the codebase and the RGB-600 checkpoint have been adapted from the Kinetics repository. We are grateful to the authors for making their work available.