read-the-room: a research project on training models on video datasets to recognize the action taking place in each video.