Merge pull request #76 from sony/feature/20180131-deeplabv3plus
Add Deeplab v3+ training and inference
Showing 25 changed files with 1,778 additions and 0 deletions.

# Neural Network Libraries - Examples

The installation guide is given at: https://github.com/sony/nnabla-examples/

# Setup Requirements

In a Linux environment:
1. OpenCV
```
conda install opencv
```
2. ImageIO
```
pip install imageio
```
3. TensorFlow
```
pip install tensorflow
```

# Some segmentation results on VOC validation images
<p align="center">
<img src="results/test3.jpg" width=200 height=250> <img src="results/nn_test3.png" width=200 height=250> <br/>
<img src="results/test4.jpg" width=200 height=250> <img src="results/nn_test4.png" width=200 height=250> <br/>
<img src="results/test5.jpg" width=200 height=250> <img src="results/nn_test5.png" width=200 height=250> <br/>
<img src="results/test7.jpg" width=300 height=200> <img src="results/nn_test7.png" width=300 height=200> <br/>
</p>

# Quick Start Inference

To perform inference on a test image, follow these steps:

## Download pretrained model

To download a pretrained model from TensorFlow's repository, trained on the [COCO+VOC trainaug dataset](https://github.com/tensorflow/models/blob/master/research/deeplab/g3doc/model_zoo.md):
```bash
python download_pretrained_tf_deeplabv3plus_coco_voc_trainaug.py
```

This will download and uncompress the pretrained TensorFlow model trained on the COCO + Pascal VOC 2012 trainaug dataset, with Xception as the backbone.

## Weight Conversion

Convert the TensorFlow weights/checkpoint to an NNabla parameter file.
```bash
python convert_tf_nnabla.py --input-ckpt-file=<path to ckpt file> --output-nnabla-file=<output .h5 file>
```

##### NOTE: input-ckpt-file is the path to the ckpt file downloaded and uncompressed in the previous step. Please give the path to the model-*.ckpt file. A filled-in example is shown below.
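
For example (the checkpoint directory and output file names below are illustrative, not fixed by the scripts; use the actual path produced by the download step):
```bash
# Hypothetical paths: adjust to wherever the archive was uncompressed.
python convert_tf_nnabla.py \
    --input-ckpt-file=./deeplabv3_pascal_train_aug/model.ckpt \
    --output-nnabla-file=./deeplab_nnabla.h5
```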

## Inference

Perform inference on a test image using the converted TensorFlow model.
```bash
python model_inference.py \
    --model-load-path=<path to parameter file (.h5)> \
    --image-width=<target width for input image> \
    --test-image-file=<image file for inference> \
    --num-class=<number of categories> \
    --label-file-path=<txt file listing the categories> \
    --output-stride=16
```

##### NOTE: model-load-path is the path to the converted parameter file (.h5) obtained in the previous step, and num-class=21 when using the default TensorFlow pretrained model downloaded in the "Download pretrained model" step (as it is trained on 21 categories). A filled-in example is shown below.
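
For instance (the image and label file names are illustrative; 513 is a typical DeepLab input width, and 21 matches the default pretrained model):
```bash
# Illustrative file names; substitute your own test image and label file.
python model_inference.py \
    --model-load-path=./deeplab_nnabla.h5 \
    --image-width=513 \
    --test-image-file=./test.jpg \
    --num-class=21 \
    --label-file-path=./voc_labels.txt \
    --output-stride=16
```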

# Training

To train a DeepLab v3+ model in NNabla, perform the following steps:

## Download Dataset

Support for the following dataset is provided:

##### VOC 2012 Semantic Segmentation dataset

Download the Pascal VOC dataset and uncompress it.
```bash
wget host.robots.ox.ac.uk/pascal/VOC/voc2012/VOCtrainval_11-May-2012.tar
```
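
To uncompress the archive in place (assuming it was downloaded to the current directory):
```bash
tar -xvf VOCtrainval_11-May-2012.tar
```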

## Run the data preparation script

- VOC 2012 Semantic Segmentation Dataset

To prepare the VOC data for training:
```bash
python dataset_utils.py --train-file="" --val-file="" --data-dir=""
```
##### NOTE: train-file and val-file are the train and val split files provided by VOC under VOC2012/ImageSets/Segmentation/, and data-dir is the path that contains JPEGImages (e.g. --data-dir=../../VOCdevkit/VOC2012/). A filled-in example is shown below.
##### This will generate the files train_image.txt, train_label.txt, val_image.txt, and val_label.txt.
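
A filled-in example, assuming the VOC archive was extracted to ../../VOCdevkit/ (whether full paths or bare file names are expected depends on how dataset_utils.py resolves them; refer to that script):
```bash
# The standard VOC 2012 segmentation split files are train.txt and val.txt.
python dataset_utils.py \
    --train-file=../../VOCdevkit/VOC2012/ImageSets/Segmentation/train.txt \
    --val-file=../../VOCdevkit/VOC2012/ImageSets/Segmentation/val.txt \
    --data-dir=../../VOCdevkit/VOC2012/
```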

After data preparation, the data directory structure should look like:
```
+ VOCdevkit
    + VOC2012
        + JPEGImages
        + SegmentationClass
            + encoded
```
and the current working directory should contain the following 4 files generated by the above script:
```
+ train_image.txt
+ train_label.txt
+ val_image.txt
+ val_label.txt
```

## Download backbone pre-trained model from TensorFlow's Model Zoo

```bash
wget download.tensorflow.org/models/deeplabv3_xception_2018_01_04.tar.gz
```
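
Then uncompress it (assuming the archive is in the current directory):
```bash
tar -xzvf deeplabv3_xception_2018_01_04.tar.gz
```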

## Convert the backbone pre-trained model to NNabla

```bash
python convert_tf_nnabla.py --input-ckpt-file=<path to ckpt file> --output-nnabla-file=<output .h5 file>
```

## Run the training script

To run the training:

##### Single Process Training

```bash
python train.py \
    --train-dir=train_image.txt \
    --train-label-dir=train_label.txt \
    --val-dir=val_image.txt \
    --val-label-dir=val_label.txt \
    --accum-grad=1 \
    --warmup-epoch=5 \
    --max-iter=40000 \
    --model-save-interval=1000 \
    --model-save-path=<path to save model> \
    --val-interval=1000 \
    --batch-size=1 \
    --num-class=<number of categories in dataset> \
    --pretrained-model-path=<path to the pretrained model (.h5)> \
    --train-samples=<number of train samples in dataset> \
    --val-samples=<number of val samples in dataset>
```
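
A filled-in example for the VOC split prepared above (the standard VOC 2012 segmentation split has 1,464 train and 1,449 val images; the save path and pretrained-model path below are illustrative):
```bash
# Paths are illustrative; num-class=21 matches the VOC categories.
python train.py \
    --train-dir=train_image.txt \
    --train-label-dir=train_label.txt \
    --val-dir=val_image.txt \
    --val-label-dir=val_label.txt \
    --accum-grad=1 \
    --warmup-epoch=5 \
    --max-iter=40000 \
    --model-save-interval=1000 \
    --model-save-path=./tmp.monitor \
    --val-interval=1000 \
    --batch-size=1 \
    --num-class=21 \
    --pretrained-model-path=./deeplab_nnabla.h5 \
    --train-samples=1464 \
    --val-samples=1449
```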

##### Distributed Training
For installation of NNabla with multi-GPU (distributed) support, refer to: https://nnabla.readthedocs.io/en/latest/python/pip_installation_cuda.html#installation-with-multi-gpu-supported

```bash
mpirun -n <number of devices> python train.py \
    --train-dir=train_image.txt \
    --train-label-dir=train_label.txt \
    --val-dir=val_image.txt \
    --val-label-dir=val_label.txt \
    --accum-grad=1 \
    --warmup-epoch=5 \
    --max-iter=40000 \
    --model-save-interval=1000 \
    --model-save-path=<path to save model> \
    --val-interval=1000 \
    --batch-size=1 \
    --num-class=<number of categories in dataset> \
    --pretrained-model-path=<path to the pretrained model (.h5)> \
    --train-samples=<number of train samples in dataset> \
    --val-samples=<number of val samples in dataset> \
    --distributed
```

##### Fine Tuning
For fine-tuning on any other dataset, prepare the dataset the same way the VOC dataset is prepared (writing a data preparation script may be required; refer to dataset_utils.py) and add the --fine-tune argument.

##### NOTE:
1. The text files passed as arguments to the training scripts are the ones generated in the "Run the data preparation script" step.
2. To reproduce the paper's results, it is suggested to use a total batch size of at least 16 (for distributed training, set --batch-size to 16 / number of devices, as in the example below) and max-iter=250,000 when training from scratch.
3. To compute the accuracy (mean IOU) during training/validation, add the --compute-acc argument to the training command.
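
For instance, on 4 GPUs the per-device batch size would be 16 / 4 = 4 (paths and sample counts are illustrative, as in the single-process example above):
```bash
# Per-device batch size 4 on 4 devices gives a total batch size of 16.
mpirun -n 4 python train.py \
    --train-dir=train_image.txt \
    --train-label-dir=train_label.txt \
    --val-dir=val_image.txt \
    --val-label-dir=val_label.txt \
    --accum-grad=1 \
    --batch-size=4 \
    --max-iter=250000 \
    --num-class=21 \
    --pretrained-model-path=./deeplab_nnabla.h5 \
    --train-samples=1464 \
    --val-samples=1449 \
    --distributed
```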

##### Typical Training Loss curve:
<p align="center">
<img src="results/Train-loss.png" width=600 height=350><br/>
</p>

## Evaluate

To evaluate the trained model obtained from the previous step:

```bash
python eval.py \
    --model-load-path=/model_save_path/param_xxx.h5 \
    --val-samples=<number of val samples in dataset> \
    --val-dir=val_image.txt \
    --val-label-dir=val_label.txt \
    --batch-size=1 \
    -c=<'cudnn' or 'cpu'> \
    --num-class=<number of categories>
```
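
A filled-in example (the checkpoint name is illustrative; use any param_*.h5 written at a model-save-interval step):
```bash
# Evaluates a saved checkpoint on the 1,449 VOC 2012 val images.
python eval.py \
    --model-load-path=./tmp.monitor/param_040000.h5 \
    --val-samples=1449 \
    --val-dir=val_image.txt \
    --val-label-dir=val_label.txt \
    --batch-size=1 \
    -c cudnn \
    --num-class=21
```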

## Inference

Perform inference on a test image using the trained model.

```bash
python model_inference.py \
    --model-load-path=<path to parameter file (.h5)> \
    --image-width=<target width for input image> \
    --test-image-file=<image file for inference> \
    --num-class=<number of categories> \
    --label-file-path=<txt file listing the categories> \
    --output-stride=16
```

##### NOTE: model-load-path is the path to the parameter file (.h5) obtained from training. A filled-in example is given in the "Quick Start Inference" section above.
# Copyright (c) 2017 Sony Corporation. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

def get_args(monitor_path='tmp.monitor', max_iter=10000, model_save_path=None, learning_rate=1e-3, batch_size=128, weight_decay=1e-4, description=None):
    """
    Get command line arguments.
    Arguments set the default values of command line arguments.
    """
    import argparse
    import os
    if model_save_path is None:
        model_save_path = monitor_path
    if description is None:
        description = "Example argument parser. The following help is shared among the examples in this folder; some arguments are valid or invalid in particular examples."
    parser = argparse.ArgumentParser(description)
    parser.add_argument('--fine-tune', action='store_true',
                        default=False, help="Whether to fine-tune the model or not; False by default.")
    parser.add_argument('--distributed', action='store_true',
                        default=False, help="Whether to use distributed or single-GPU training; False by default.")
    parser.add_argument('--compute-acc', action='store_true',
                        default=False, help="Whether to compute the accuracy (mean IOU) during training/validation; False by default.")
    parser.add_argument("--input-ckpt-file", type=str)
    parser.add_argument("--output-nnabla-file", type=str, default='deeplab_nnabla.h5')
    parser.add_argument("--batch-size", "-b", type=int, default=batch_size)
    parser.add_argument("--label-path", type=str)
    parser.add_argument("--data-dir", type=str, help='Path to VOC dataset.')
    parser.add_argument("--train-file", type=str,
                        help='VOC train split text file.')
    parser.add_argument("--val-file",
                        type=str, help='VOC val split text file.')
    parser.add_argument("--train-dir", "-t",
                        type=str, default=model_save_path,
                        help='Path to training data.')
    parser.add_argument("--val-dir", "-v",
                        type=str, default=model_save_path,
                        help='Path to validation data.')
    parser.add_argument("--train-label-dir",
                        type=str, default=model_save_path,
                        help='Path to training data labels.')
    parser.add_argument("--val-label-dir",
                        type=str, default=model_save_path,
                        help='Path to validation data labels.')
    parser.add_argument("--learning-rate", "-l",
                        type=float, default=learning_rate)
    parser.add_argument("--output-stride",
                        type=int, default=16)
    parser.add_argument("--monitor-path", "-m",
                        type=str, default=monitor_path,
                        help='Path where monitoring logs are saved.')
    parser.add_argument("--max-iter", "-i", type=int, default=max_iter,
                        help='Max iterations of training.')
    parser.add_argument("--val-interval", type=int, default=100,
                        help='Validation interval.')
    parser.add_argument("--val-iter", "-j", type=int, default=10,
                        help='Each validation runs `val_iter` mini-batch iterations.')
    parser.add_argument("--accum-grad",
                        type=int, default=32,
                        help='Number of mini-batches over which gradients are accumulated.')
    parser.add_argument("--weight-decay", "-w",
                        type=float, default=weight_decay,
                        help='Weight decay factor of SGD update.')
    parser.add_argument("--warmup-epoch", type=int, default=5)
    parser.add_argument("--device-id", "-d", type=str, default='0',
                        help='Device ID the training runs on. This is only valid if you specify `-c cudnn`.')
    parser.add_argument("--type-config", type=str, default='float',
                        help='Type of computation, e.g. "float", "half".')
    parser.add_argument("--model-save-interval", "-s", type=int, default=1000,
                        help='The interval of saving model parameters.')
    parser.add_argument("--model-save-path", "-o",
                        type=str, default=model_save_path,
                        help='Path where model parameters are saved.')
    parser.add_argument("--pretrained-model-path",
                        type=str, default=model_save_path,
                        help='Path where the pretrained model parameters are saved.')
    parser.add_argument("--net", "-n", type=str,
                        default='lenet',
                        help="Neural network architecture type (used only in classification*.py).\n classification.py: ('lenet'|'resnet'), classification_bnn.py: ('bincon'|'binnet'|'bwn'|'bincon_resnet'|'binnet_resnet'|'bwn_resnet')")
    parser.add_argument('--context', '-c', type=str,
                        default='cpu', help="Extension modules, e.g. 'cpu', 'cudnn'.")
    parser.add_argument('--augment-train', action='store_true',
                        default=False, help="Enable data augmentation of training data.")
    parser.add_argument('--augment-test', action='store_true',
                        default=False, help="Enable data augmentation of testing data.")
    parser.add_argument('--channel', default=1, type=int)
    parser.add_argument('--image-width', default=28, type=int)
    parser.add_argument('--image-height', default=28, type=int)
    parser.add_argument('--dataset-path', type=str)
    parser.add_argument("--model-load-path", "-T",
                        type=str, default=model_save_path,
                        help='Path from which model parameters are loaded.')
    parser.add_argument('--label-file-path', type=str)
    parser.add_argument('--test-image-file', type=str)
    parser.add_argument('--num-class', default=10, type=int)
    parser.add_argument('--train-samples', default=10, type=int)
    parser.add_argument('--val-samples', default=10, type=int)
    parser.add_argument("--sync-weight-every-itr",
                        type=int, default=100,
                        help="Sync weights every specified iteration. NCCL uses \
                        the ring all-reduce, so gradients in each device are not exactly the same. When \
                        accumulated in the weights, the weight values in each device diverge.")

    args = parser.parse_args()
    if not os.path.isdir(args.model_save_path):
        os.makedirs(args.model_save_path)
    return args