Merge pull request #10 from sony/feature/20171215-imagenet-example
Feature/20171215 imagenet example
TakuyaNarihira authored Apr 18, 2018
2 parents c0baae3 + f63d844 commit 59893dd
Showing 22 changed files with 51,408 additions and 62 deletions.
2 changes: 1 addition & 1 deletion capsule_net/requirements.txt
@@ -1 +1 @@
nnabla>=0.9.7
nnabla>=0.9.9
2 changes: 1 addition & 1 deletion cifar10-100-collection/requirements.txt
@@ -1 +1 @@
nnabla>=0.9.5
nnabla>=0.9.9
2 changes: 1 addition & 1 deletion distributed/cifar10-100/requirements.txt
@@ -1 +1 @@
nnabla>=0.9.5
nnabla>=0.9.9
60 changes: 53 additions & 7 deletions imagenet-classification/README.md
@@ -4,25 +4,71 @@

## Overview

The examples are written in Python. These examples demonstrate learning on the Tiny ImageNet dataset.
The examples are written in Python. These examples demonstrate learning on the ImageNet and Tiny ImageNet datasets.
For ImageNet, you need to obtain the dataset and create the cache files yourself.
The Tiny ImageNet dataset is cached automatically when you run the example script.

---

## classification.py

In this example, a "Residual Neural Network" (also called "ResNet") is trained on the [Tiny ImageNet](https://tiny-imagenet.herokuapp.com/) dataset.
In this example, a "Residual Neural Network" (also called "ResNet") is trained on the [ImageNet](https://imagenet.herokuapp.com/) and [Tiny ImageNet](https://tiny-imagenet.herokuapp.com/) datasets.

The following command runs the Tiny ImageNet training with the settings we recommend trying first. It requires about 6 GB of available memory on the CUDA device. See the help (`-h` option) for more options.
### ImageNet

ImageNet consists of 1000 categories, and each category has 1280 images in the training set.
The ImageNet dataset (training and validation) requires 150 GB of disk space.
Creating the cache files requires approximately 400 GB of disk space.
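
Before downloading, it can help to confirm that the target disk has room. The following is a minimal sketch using only the Python standard library; the helper name `has_free_space` is hypothetical, and it assumes the dataset and the cache files live on the filesystem that holds the current directory.

```python
import shutil

GB = 1000 ** 3  # decimal gigabytes; use 1024 ** 3 if you prefer GiB


def has_free_space(path, required_gb):
    """Return True if the filesystem holding `path` has at least `required_gb` free."""
    free_bytes = shutil.disk_usage(path).free
    return free_bytes >= required_gb * GB


if __name__ == "__main__":
    # 150 GB for the dataset plus roughly 400 GB for the cache files (figures from above).
    for label, need in (("dataset", 150), ("cache files", 400)):
        ok = has_free_space(".", need)
        print(f"{label}: need {need} GB -> {'ok' if ok else 'not enough free space'}")
```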

1. Prepare the ImageNet data. You can get the ImageNet dataset from this [link](https://imagenet.herokuapp.com/). The setup procedure below requires the following two files:
    - Training dataset: ILSVRC2012_img_train.tar
    - Validation dataset: ILSVRC2012_img_val.tar

2. Create a directory for the dataset.
    - For the training data.
        - mkdir "directory name"
            - [ex):mkdir train_data]
        - python create_train_dir.py -t "tar file(training) of ImageNet" -o "directory name"
            - [ex):python create_train_dir.py -t ILSVRC2012_img_train.tar -o train_data]
    - For the validation data.
        - mkdir "directory name"
            - [ex):mkdir val_data]
        - python create_val_dir.py -t "tar file(validation) of ImageNet" -o "directory name"
            - [ex):python create_val_dir.py -t ILSVRC2012_img_val.tar -o val_data]

3. Create cache files for the datasets to reduce disk I/O overhead.
    - For the training data.
        - mkdir "directory name"
            - [ex):mkdir train_cache]
        - python create_cache_file.py -i "directory of the training data" -o "directory of the training cache file" -w "width of output image" -g "height of output image" -m "shaping mode (trimming or padding)" -s "shuffle mode (true or false)"
            - [ex):python create_cache_file.py -i train_data -o train_cache -w 320 -g 320 -m trimming -s true]
    - For the validation data.
        - mkdir "directory name"
            - [ex):mkdir val_cache]
        - python create_cache_file.py -i "directory of the validation data" -o "directory of the validation cache file" -w "width of output image" -g "height of output image" -m "shaping mode (trimming or padding)" -s "shuffle mode (true or false)"
            - [ex):python create_cache_file.py -i val_data -o val_cache -w 320 -g 320 -m trimming -s false]
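
If you script the cache creation, the two `create_cache_file.py` invocations differ only in their directories and the shuffle flag. The sketch below builds both command lines from the flags listed in step 3; `cache_command` is a hypothetical helper, not part of this repository, and the 320x320 trimming defaults simply mirror the examples above.

```python
import shlex


def cache_command(src_dir, cache_dir, width=320, height=320,
                  mode="trimming", shuffle=True):
    """Build the argument list for create_cache_file.py (flags as listed in step 3)."""
    if mode not in ("trimming", "padding"):
        raise ValueError("mode must be 'trimming' or 'padding'")
    return [
        "python", "create_cache_file.py",
        "-i", src_dir, "-o", cache_dir,
        "-w", str(width), "-g", str(height),
        "-m", mode,
        "-s", "true" if shuffle else "false",
    ]


# Shuffle the training data, keep the validation data in order.
train_cmd = cache_command("train_data", "train_cache", shuffle=True)
val_cmd = cache_command("val_data", "val_cache", shuffle=False)
print(shlex.join(train_cmd))
print(shlex.join(val_cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` once the output directory exists.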

The following command runs the ImageNet training. See the help (`-h` option) for more options.

```
python classification.py -c cudnn -a4 -b64 -L34
4. Execute the ImageNet example.
- python classification.py -c "context (e.g. cudnn)" -b"batch size" -a"accumulate gradient" -L"number of layers" -T "directory of the training cache file" -V "directory of the validation cache file"
[ex):python classification.py -c cudnn -b64 -a4 -L34 -T train_cache -V val_cache]
```

Tiny ImageNet consists of 200 categories, and each category has 500 images of size 64x64 in the training set.
After the training completes successfully, the results will be saved in "tmp.monitor.imagenet".

In this folder you will find model files "\*.h5" and result files "\*.txt".

### Tiny Imagenet

The training script for ImageNet also works on the Tiny ImageNet dataset, which
consists of 200 categories, each with 500 images of size 64x64 in the training set.
The ResNet trained here is almost equivalent to the one used for ImageNet.
The only difference is that the strides in both the first convolution and the max pooling are removed.
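
To see why removing those strides suits 64x64 inputs, compare the feature-map size that reaches the residual blocks in each case. This is a rough sketch that assumes the standard ResNet stem (a 7x7 convolution with padding 3, then 3x3 max pooling with padding 1) and a 224x224 ImageNet crop; neither value is stated here, so check them against the model definition.

```python
def out_size(size, kernel, stride, pad):
    """Spatial output size of a convolution or pooling layer (floor mode)."""
    return (size + 2 * pad - kernel) // stride + 1


def stem_output(size, conv_stride, pool_stride):
    # Assumed stem: 7x7 conv (pad 3) followed by 3x3 max pooling (pad 1).
    after_conv = out_size(size, kernel=7, stride=conv_stride, pad=3)
    return out_size(after_conv, kernel=3, stride=pool_stride, pad=1)


imagenet = stem_output(224, conv_stride=2, pool_stride=2)  # strided stem
tiny = stem_output(64, conv_stride=1, pool_stride=1)       # strides removed
print(imagenet, tiny)  # 56 64
```

Under these assumptions, the residual blocks see feature maps of similar size (56 versus 64) for both datasets.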

After the training completes successfully, the results will be saved in "tmp.monitor.imagenet".
The following command runs the Tiny ImageNet training with the settings we recommend trying first. It requires about 6 GB of available memory on the CUDA device. See the help (`-h` option) for more options.

In this folder you will find model files "\*.h5" and result files "\*.txt".
```
python classification.py -c cudnn -a4 -b64 -L34 -M true
```
8 changes: 7 additions & 1 deletion imagenet-classification/args.py
@@ -13,7 +13,7 @@
# limitations under the License.


def get_args(monitor_path='tmp.monitor.imagenet', max_iter=500000, model_save_path=None, learning_rate=1e-1, batch_size=8, weight_decay=1e-4, accum_grad=32):
def get_args(monitor_path='tmp.monitor.imagenet', max_iter=500000, model_save_path=None, learning_rate=1e-1, batch_size=8, weight_decay=1e-4, accum_grad=32, tiny_mode=False, train_cachefile_dir=None, val_cachefile_dir=None):
    """
    Get command line arguments.
@@ -63,6 +63,12 @@ def get_args(monitor_path='tmp.monitor.imagenet', max_iter=500000, model_save_pa
    parser.add_argument("--shortcut-type", "-S", type=str,
                        choices=['b', 'c', ''], default='b',
                        help='Skip connection type. See `resnet_imagenet()` in model_resenet.py for description.')
    parser.add_argument("--tiny-mode", "-M", type=bool, default=tiny_mode,
                        help='The dataset is tiny imagenet.')
    parser.add_argument("--train-cachefile-dir", "-T", type=str, default=train_cachefile_dir,
                        help='Training cache file dir. Create to use create_cache_file.py')
    parser.add_argument("--val-cachefile-dir", "-V", type=str, default=val_cachefile_dir,
                        help='Validation cache file dir. Create to use create_cache_file.py')
    args = parser.parse_args()
    if not os.path.isdir(args.model_save_path):
        os.makedirs(args.model_save_path)
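
A note on the `--tiny-mode` option above: `argparse` applies `type` to the raw string, and `bool()` of any non-empty string is `True`, so `-M false` would still enable tiny mode. The sketch below reproduces this with a standalone parser and shows one possible alternative; `str2bool` is a hypothetical helper and is not part of this commit.

```python
import argparse


def str2bool(value):
    """Parse common true/false spellings; hypothetical helper, not part of args.py."""
    text = value.lower()
    if text in ("true", "t", "yes", "1"):
        return True
    if text in ("false", "f", "no", "0"):
        return False
    raise argparse.ArgumentTypeError("expected true or false, got %r" % value)


# Same option as in the diff, declared with type=bool.
naive = argparse.ArgumentParser()
naive.add_argument("--tiny-mode", "-M", type=bool, default=False)

# Alternative that interprets the string explicitly.
strict = argparse.ArgumentParser()
strict.add_argument("--tiny-mode", "-M", type=str2bool, default=False)

print(naive.parse_args(["-M", "false"]).tiny_mode)   # True, because bool("false") is truthy
print(strict.parse_args(["-M", "false"]).tiny_mode)  # False
```

Another option would be `action="store_true"`, which would change the command line from `-M true` to a bare `-M`.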