Commit 818af9a ("first commit"), Antonio Loquercio, Nov 18, 2021.
Showing 23 changed files with 2,975 additions and 0 deletions.
README.md (117 additions)
# Event-based Vision meets Deep Learning on Steering Prediction for Self-driving Cars

This repository contains a deep learning approach that unlocks the potential of event cameras for predicting a vehicle's steering angle.

#### Citing

If you use this code in an academic context, please cite the following publication:

Paper: [Event-based vision meets deep learning on steering prediction for self-driving cars](http://rpg.ifi.uzh.ch/docs/CVPR18_Maqueda.pdf)

Video: [YouTube](https://www.youtube.com/watch?v=_r_bsjkJTHA&feature=youtu.be)

```
@inproceedings{maqueda_2018,
title={Event-based vision meets deep learning on steering prediction for self-driving cars},
  author={Maqueda, Ana I and Loquercio, Antonio and Gallego, Guillermo and Garc{\'\i}a, Narciso and Scaramuzza, Davide},
booktitle={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},
pages={5419--5427},
year={2018}
}
```

## Introduction

Steering angle prediction with standard cameras is not robust in scenes characterized by high dynamic range (HDR), motion blur, and low light. Event cameras, however, are bio-inspired sensors that can handle all three problems at once. They output a stream of asynchronous events generated by moving edges in the scene. Their natural response to motion, together with their advantages over traditional cameras (very high temporal resolution, very high dynamic range, and low latency), makes them a perfect fit for the steering prediction task, which we address with a deep learning solution cast as a regression problem.


### Model

Two ResNet architectures, ResNet18 and ResNet50, have been deployed to carry out the steering prediction task. They are used as feature extractors, considering only their convolutional layers. A global average pooling (GAP) layer then encodes the image features into a vectorized descriptor that feeds a fully-connected (FC) layer (256-dimensional for ResNet18, 1024-dimensional for ResNet50). This FC layer is followed by a ReLU non-linearity and a final 1-dimensional FC layer that outputs the predicted steering angle.

![architecture](images/architecture.png)
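
For illustration, a minimal Keras sketch of the ResNet50 variant of this architecture (an approximation for readability, not necessarily identical to the models defined in ```cnn_models.py```):

```
from keras.applications.resnet50 import ResNet50
from keras.layers import Input, GlobalAveragePooling2D, Dense, Activation
from keras.models import Model

def build_steering_model(img_height, img_width, img_channels, output_dim=1):
    # Convolutional layers of ResNet50 act as the feature extractor
    img_input = Input(shape=(img_height, img_width, img_channels))
    base = ResNet50(input_tensor=img_input, weights='imagenet', include_top=False)

    # GAP encodes the feature maps into a vectorized descriptor
    x = GlobalAveragePooling2D()(base.output)

    # 1024-dimensional FC layer (256 for ResNet18) followed by a ReLU
    x = Dense(1024)(x)
    x = Activation('relu')(x)

    # Final 1-dimensional FC layer outputs the predicted steering angle
    steering = Dense(output_dim)(x)
    return Model(inputs=img_input, outputs=steering)
```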


### Data

In order to learn steering angles from event images, the publicly available [DAVIS Driving Dataset 2017 (DDD17)](https://docs.google.com/document/d/1HM0CSmjO8nOpUeTvmPjopcBcVCk7KXvLUuiZFS6TWSg/pub) has been used. It provides approximately 12 hours of annotated driving recordings collected by a car under different road, weather, and illumination conditions. The dataset includes asynchronous events as well as synchronous grayscale frames.

![input data](images/input_data.png)


## Running the code

### Software requirements

This code has been tested on Ubuntu 14.04 with Python 3.4.

Dependencies:
- Tensorflow
- Keras 2.1.4
- NumPy
- OpenCV
- scikit-learn
- Python gflags
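
A possible way to install the dependencies (the package names and the unpinned TensorFlow version are assumptions; adapt them to your environment):

```
pip install tensorflow keras==2.1.4 numpy opencv-python scikit-learn python-gflags
```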


### Data preparation

Please follow the instructions on the [DDD17 site](https://docs.google.com/document/d/1HM0CSmjO8nOpUeTvmPjopcBcVCk7KXvLUuiZFS6TWSg/pub) to download the dataset and visualize the HDF5 file contents. Afterwards, you should have the following structure:

```
DDD17/
    run1_test/
    run2/
    run3/
    run4/
    run5/
```

The authors also provide some [code](https://code.ini.uzh.ch/jbinas/ddd17-utils) for viewing and exporting the data. Download that repository and copy its files into the ```data_preprocessing``` directory.

Asynchronous events are converted into synchronous event frames by pixel-wise accumulation over a constant time interval, using separate channels for positive and negative events. To prepare the data in the format required by our implementation, follow these steps:


#### 1. Accumulate events

Run ```data_preprocessing/reduce_to_frame_based.py``` to reduce the data to a frame-based representation. The output is another HDF5 file containing the frame-based data obtained by accumulating the events every ```binsize``` seconds. The created HDF5 file will contain two new fields:
- **dvs_frame**: event frames (a 4-tensor, with number_of_frames x width x height x 2 elements).
- **aps_frame**: grayscale frames (a 3-tensor, with number_of_frames x width x height elements).

```
python data_preprocessing/reduce_to_frame_based.py --binsize 0.050 --update_prog_every 10 --keep_events 1 --source_folder ../DDD17 --dest_folder ../DDD17/DAVIS_50ms
```

Note: the ```reduce_to_frame_based.py``` script is the original ```export.py``` provided by the authors, modified to process several HDF5 files from a source directory and to save positive and negative event frames separately.
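
The accumulation performed in this step can be pictured with a minimal NumPy sketch (the function and variable names are hypothetical; the real script additionally handles timestamps, HDF5 I/O, and the APS frames):

```
import numpy as np

def events_to_frame(xs, ys, polarities, width, height):
    """Accumulate one binsize-worth of events into a 2-channel frame:
    channel 0 counts positive events, channel 1 counts negative ones."""
    frame = np.zeros((height, width, 2), dtype=np.float32)
    pos = polarities > 0
    # np.add.at handles repeated pixel coordinates correctly
    np.add.at(frame[:, :, 0], (ys[pos], xs[pos]), 1.0)
    np.add.at(frame[:, :, 1], (ys[~pos], xs[~pos]), 1.0)
    return frame
```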



#### 2. Split recordings

Run ```data_preprocessing/split_recordings.py``` to split the recordings into consecutive, non-overlapping short sequences of a few seconds each, which are then assigned to the training and testing sets. In particular, we set training sequences to 40 s and testing sequences to 20 s.

```
python data_preprocessing/split_recordings.py --source_folder ./DDD17/DAVIS_50ms --rewrite 1 --train_per 40 --test_per 20
```

Note: the ```split_recordings.py``` script is the original ```prepare_cnn_data.py``` provided by the authors, modified to process several HDF5 files from a source directory and to avoid frame pre-processing.
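
Conceptually, the split assigns each frame to a subset based on its position within repeating windows of ```train_per + test_per``` seconds. A minimal sketch of this idea (not the script's actual code):

```
def assign_split(t, train_per=40.0, test_per=20.0):
    """Assign a frame at time t (seconds from the start of the recording):
    the first 40 s of every 60 s window go to training, the last 20 s to testing."""
    t_in_window = t % (train_per + test_per)
    return 'train' if t_in_window < train_per else 'test'
```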



#### 3. Compute percentiles

Run ```data_preprocessing/compute_percentiles.py``` to compute percentiles of the DVS/event frames, which are used to remove outliers and to normalize the frames.

```
python data_preprocessing/compute_percentiles.py --source_folder ./DDD17/DAVIS_50ms --inf_pos_percentile 0.0 --sup_pos_percentile 0.9998 --inf_neg_percentile 0.0 --sup_neg_percentile 0.9998
```
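
How such percentiles can be used for outlier removal and normalization is sketched below (an assumption about how the computed values are applied, not the script's exact code):

```
import numpy as np

def normalize_event_frames(frames, inf_percentile=0.0, sup_percentile=0.9998):
    """Clip a stack of event frames at the given percentiles to remove
    outliers (rare, very large event counts), then rescale to [0, 1]."""
    lo = np.percentile(frames, 100.0 * inf_percentile)
    hi = np.percentile(frames, 100.0 * sup_percentile)
    frames = np.clip(frames, lo, hi)
    return (frames - lo) / (hi - lo + 1e-8)
```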


#### 4. Export CNN data

Run ```data_preprocessing/export_cnn_data.py``` to export the DVS/event frames, the APS/grayscale frames, and the differences of grayscale frames (APS diff) in PNG format, along with text files containing the steering angles, from the HDF5 files, so they can be used by the network.

```
python data_preprocessing/export_cnn_data.py --source_folder ./DDD17/DAVIS_50ms
```
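
Once exported, a sample can be read back along these lines (the file layout shown is hypothetical; check the directories created by the script for the actual naming):

```
import cv2
import numpy as np

# Hypothetical paths to an exported DVS frame and its steering-angle file
frame = cv2.imread('./DDD17/DAVIS_50ms/run2/dvs/frame_00000.png', cv2.IMREAD_UNCHANGED)
steering = np.loadtxt('./DDD17/DAVIS_50ms/run2/steering.txt')
print(frame.shape, steering[0])
```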
cnn.py (204 additions)
import tensorflow as tf
import numpy as np
import os
import sys
import gflags

from keras.callbacks import ModelCheckpoint
from keras import backend as K
import keras

import logz
import cnn_models
import utils
import log_utils
from common_flags import FLAGS
from constants import TRAIN_PHASE



def getModel(img_width, img_height, img_channels, output_dim, weights_path):
    """
    Initialize model.
    # Arguments
       img_width: Target image width.
       img_height: Target image height.
       img_channels: Target image channels.
       output_dim: Dimension of model output.
       weights_path: Path to pre-trained model.
    # Returns
       model: A Model instance.
    """
    if FLAGS.imagenet_init:
        model = cnn_models.resnet50(img_width,
                                    img_height, img_channels, output_dim)
    else:
        model = cnn_models.resnet50_random_init(img_width,
                                                img_height, img_channels, output_dim)

    if weights_path:
        try:
            model.load_weights(weights_path)
            print("Loaded model from {}".format(weights_path))
        except (IOError, OSError):
            print("Impossible to find weight path. Returning untrained model")

    return model


def trainModel(train_data_generator, val_data_generator, model, initial_epoch):
    """
    Model training.
    # Arguments
       train_data_generator: Training data generated batch by batch.
       val_data_generator: Validation data generated batch by batch.
       model: A Model instance.
       initial_epoch: Epoch from which training is resumed.
    """

    # Initialize number of samples for hard-mining
    model.k_mse = tf.Variable(FLAGS.batch_size, trainable=False, name='k_mse', dtype=tf.int32)

    # Configure training process
    optimizer = keras.optimizers.Adam(lr=FLAGS.initial_lr, decay=1e-4)
    model.compile(loss=[utils.hard_mining_mse(model.k_mse)], optimizer=optimizer,
                  metrics=[utils.steering_loss, utils.pred_std])
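    # utils.hard_mining_mse is defined elsewhere in the repository. A
    # plausible sketch of such a loss (an assumption, not necessarily the
    # actual implementation) keeps only the k largest squared errors in
    # the batch:
    #
    #   def hard_mining_mse(k):
    #       def custom_mse(y_true, y_pred):
    #           sq_err = K.square(y_pred - y_true)
    #           k_worst, _ = tf.nn.top_k(K.reshape(sq_err, [-1]), k=k)
    #           return K.mean(k_worst)
    #       return custom_mse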

    # Save model with the lowest validation loss
    weights_path = os.path.join(FLAGS.experiment_rootdir, 'weights_{epoch:03d}.h5')
    writeBestModel = ModelCheckpoint(filepath=weights_path, monitor='val_steering_loss',
                                     save_best_only=True, save_weights_only=True)

    # Save model every 'log_rate' epochs.
    # Save training and validation losses.
    logz.configure_output_dir(FLAGS.experiment_rootdir)
    saveModelAndLoss = log_utils.MyCallback(filepath=FLAGS.experiment_rootdir,
                                            period=FLAGS.log_rate,
                                            batch_size=FLAGS.batch_size,
                                            factor=FLAGS.lr_scale_factor)

    # Train model
    steps_per_epoch = np.minimum(int(np.ceil(
        train_data_generator.samples / FLAGS.batch_size)), 2000)
    validation_steps = int(np.ceil(val_data_generator.samples / FLAGS.batch_size)) - 1

    model.fit_generator(train_data_generator,
                        epochs=FLAGS.epochs, steps_per_epoch=steps_per_epoch,
                        callbacks=[writeBestModel, saveModelAndLoss],
                        validation_data=val_data_generator,
                        validation_steps=validation_steps,
                        initial_epoch=initial_epoch)


def _main():

    # Set random seed
    if FLAGS.random_seed:
        seed = np.random.randint(0, 2**31 - 1)
    else:
        seed = 5
    np.random.seed(seed)
    tf.set_random_seed(seed)

    K.set_learning_phase(TRAIN_PHASE)

    # Create the experiment rootdir if not already there
    if not os.path.exists(FLAGS.experiment_rootdir):
        os.makedirs(FLAGS.experiment_rootdir)

    # Input image dimensions
    img_width, img_height = FLAGS.img_width, FLAGS.img_height

    # Cropped image dimensions
    crop_img_width, crop_img_height = FLAGS.crop_img_width, FLAGS.crop_img_height

    # Output dimension (one for steering)
    output_dim = 1

    # Input image channels
    # - DVS frames: 2 channels (first one for positive events, second one for negative events)
    # - APS frames: 1 channel (grayscale images)
    # - APS DIFF frames: 1 channel (log(I_1) - log(I_0))
    # All frame modes are fed to the network as 3-channel inputs.
    img_channels = 3

    # Generate training data with real-time augmentation
    if FLAGS.frame_mode == 'dvs':
        train_datagen = utils.DroneDataGenerator()
    elif FLAGS.frame_mode == 'aps':
        train_datagen = utils.DroneDataGenerator(rotation_range=0.2,
                                                 rescale=1./255,
                                                 width_shift_range=0.2,
                                                 height_shift_range=0.2)
    else:
        train_datagen = utils.DroneDataGenerator(rotation_range=0.2,
                                                 width_shift_range=0.2,
                                                 height_shift_range=0.2)

    train_generator = train_datagen.flow_from_directory(FLAGS.train_dir,
                                                        is_training=True,
                                                        shuffle=True,
                                                        frame_mode=FLAGS.frame_mode,
                                                        target_size=(img_height, img_width),
                                                        crop_size=(crop_img_height, crop_img_width),
                                                        batch_size=FLAGS.batch_size)

    # Generate validation data with real-time augmentation
    if FLAGS.frame_mode == 'dvs' or FLAGS.frame_mode == 'aps_diff':
        val_datagen = utils.DroneDataGenerator()
    else:
        val_datagen = utils.DroneDataGenerator(rescale=1./255)

    val_generator = val_datagen.flow_from_directory(FLAGS.val_dir,
                                                    shuffle=False,
                                                    frame_mode=FLAGS.frame_mode,
                                                    target_size=(img_height, img_width),
                                                    crop_size=(crop_img_height, crop_img_width),
                                                    batch_size=FLAGS.batch_size)

    # Output dimensions of both generators must agree
    assert train_generator.output_dim == val_generator.output_dim, \
        "Not matching output dimensions."
    output_dim = train_generator.output_dim

    # Weights to restore
    weights_path = os.path.join(FLAGS.experiment_rootdir, FLAGS.weights_fname)
    initial_epoch = 0
    if not FLAGS.restore_model:
        # In this case weights will start from random
        weights_path = None
    else:
        # In this case weights will start from the specified model
        initial_epoch = FLAGS.initial_epoch

    # Define model
    model = getModel(img_width, img_height, img_channels,
                     output_dim, weights_path)

    # Serialize model into json
    json_model_path = os.path.join(FLAGS.experiment_rootdir, FLAGS.json_model_fname)
    utils.modelToJson(model, json_model_path)

    # Train model
    trainModel(train_generator, val_generator, model, initial_epoch)


def main(argv):
    # Utility main to load flags
    try:
        argv = FLAGS(argv)  # parse flags
    except gflags.FlagsError:
        print('Usage: %s ARGS\n%s' % (sys.argv[0], FLAGS))
        sys.exit(1)
    _main()


if __name__ == "__main__":
    main(sys.argv)