GitHub - aiff22/PyNET-PyTorch: Generating RGB photos from RAW image files with PyNET (PyTorch)

Replacing Mobile Camera ISP with a Single Deep Learning Model

1. Overview [Paper] [TensorFlow Implementation] [Project Webpage]

This is an alternative PyTorch implementation of the paper. The original code and pre-trained models can be found here.

This repository provides a PyTorch implementation of the RAW-to-RGB mapping approach and the PyNET CNN presented in this paper. The model is trained to convert RAW Bayer data obtained directly from a mobile camera sensor into photos captured with a professional Canon 5D DSLR camera, thus replacing the entire hand-crafted camera ISP pipeline. The provided pre-trained PyNET model can be used to generate full-resolution 12MP photos from RAW (DNG) image files captured using the Sony Exmor IMX380 camera sensor. More visual results of this approach for the Huawei P20 and BlackBerry KeyOne smartphones can be found here.
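To make the notion of RAW Bayer input more concrete, the sketch below splits a Bayer mosaic into four half-resolution color planes, a common way to present RAW data to a CNN. The RGGB layout, the plane names and the function name `pack_bayer` are assumptions made for illustration; the channel order used by this repository may differ.

```python
def pack_bayer(mosaic):
    """Split a 2D Bayer mosaic (list of rows) into four half-resolution planes.

    Assumes an RGGB layout (an assumption, not taken from this repository):
    R at (even row, even col), G at (even, odd) and (odd, even), B at (odd, odd).
    """
    height, width = len(mosaic), len(mosaic[0])
    if height % 2 or width % 2:
        raise ValueError("mosaic dimensions must be even")
    planes = {"R": [], "G1": [], "G2": [], "B": []}
    for y in range(0, height, 2):
        top, bottom = mosaic[y], mosaic[y + 1]
        planes["R"].append(top[0::2])      # even row, even columns
        planes["G1"].append(top[1::2])     # even row, odd columns
        planes["G2"].append(bottom[0::2])  # odd row, even columns
        planes["B"].append(bottom[1::2])   # odd row, odd columns
    return planes


# Toy 4x4 mosaic where each value encodes its position in the 2x2 tile.
mosaic = [
    [1, 2, 1, 2],
    [3, 4, 3, 4],
    [1, 2, 1, 2],
    [3, 4, 3, 4],
]
planes = pack_bayer(mosaic)
```

Each plane has half the width and height of the mosaic, so a four-channel input is half the resolution of the sensor image in each dimension.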

2. Prerequisites

Python: scipy, numpy, imageio and pillow packages
PyTorch + TorchVision libraries
Nvidia GPU

3. First steps

Download the pre-trained PyNET model (PSNR: 21.17, MS-SSIM: 0.8623) and put it into the models/original/ folder.
Download the Zurich RAW to RGB mapping dataset and extract it into the raw_images/ folder.
This folder should contain three subfolders: train/, test/ and full_resolution/
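A quick way to confirm the dataset was extracted correctly is to check for the three subfolders before starting a run. This is a minimal sketch; `missing_subfolders` is a hypothetical helper and not part of the repository.

```python
from pathlib import Path

# Subfolders that raw_images/ is expected to contain, per the step above.
EXPECTED_SUBFOLDERS = ("train", "test", "full_resolution")


def missing_subfolders(dataset_dir="raw_images/"):
    """Return the expected subfolders that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in EXPECTED_SUBFOLDERS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_subfolders()
    if missing:
        print("Missing subfolders in raw_images/:", ", ".join(missing))
    else:
        print("Dataset layout looks complete.")
```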

Please note that Google Drive has a quota limiting the number of downloads per day. To avoid it, you can log in to your Google account and press the "Add to My Drive" button instead of downloading directly. Please check this issue for more information.

4. PyNET CNN

The PyNET architecture has an inverted pyramidal shape and processes images at five different scales (levels). The model is trained sequentially, starting from the lowest (5th) level, which makes it possible to achieve good reconstruction results at smaller image resolutions. After the bottom level is pre-trained, the same procedure is applied to the next level until training is done at the original resolution. Since each higher level receives upscaled high-quality features from the lower part of the model, it mainly learns to reconstruct the missing low-level details and refine the results. In this work, we additionally use one upsampling convolutional layer (Level 0) on top of the model that upscales the image to its target size.
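As a rough illustration of the pyramid, the sketch below lists the resolution each level would work at if every level halves the resolution of the one above it and Level 0 doubles the Level 1 resolution. Both the factor of 2 and the 224-pixel base size are assumptions made for this example, not values stated in this README, and `level_resolution` is a hypothetical helper.

```python
def level_resolution(level, base_size=224):
    """Side length processed at a given PyNET level.

    Assumes level 1 works at base_size, each deeper level halves it,
    and level 0 upsamples the level-1 output by a factor of 2.
    """
    if not 0 <= level <= 5:
        raise ValueError("level must be between 0 and 5")
    if level == 0:
        return base_size * 2
    return base_size // (2 ** (level - 1))


# Training order: from the lowest level (5) up to level 0.
for level in range(5, -1, -1):
    size = level_resolution(level)
    print(f"level {level}: {size} x {size}")
```

Under these assumptions the deepest level sees a very small image, which is consistent with the idea that it learns global corrections first while the upper levels add detail.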

Compared to the original TensorFlow model, this implementation contains three major modifications:

Instance normalization is used in PyNET's level 1.
Transposed convolutional layers are replaced with upsampling convolution.
The weight coefficients of the loss functions are modified.
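For readers unfamiliar with the first modification, here is a minimal pure-Python sketch of instance normalization applied to one channel of one image: the channel is shifted to zero mean and scaled to unit variance using only its own statistics. The function name and the `eps` value are illustrative assumptions; in PyTorch the corresponding layer is `torch.nn.InstanceNorm2d`.

```python
import math


def instance_norm_channel(channel, eps=1e-5):
    """Normalize one 2D channel (list of rows) with its own mean and variance."""
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    # Biased (population) variance, computed over this channel only.
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in row] for row in channel]


channel = [[1.0, 2.0], [3.0, 4.0]]
normalized = instance_norm_channel(channel)
```

Unlike batch normalization, the statistics here do not depend on other images in the batch, so the result is the same regardless of batch size.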

5. Training the model

The model is trained level by level, starting from the lowest (5th) one:

python train_model.py level=<level>

Obligatory parameters:

level: 5, 4, 3, 2, 1, 0

Optional parameters and their default values:

batch_size: 50 - batch size [small values can lead to unstable training]
learning_rate: 5e-5 - learning rate
restore_epoch: None - epoch to restore (when not specified, the last saved model for PyNET's level+1 is loaded)
num_train_epochs: 8, 8, 17, 17, 25, 50 (for levels 5 - 0) - the number of training epochs
dataset_dir: raw_images/ - path to the folder with Zurich RAW to RGB dataset
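The scripts take arguments in `key=value` form rather than `--key value`. The sketch below shows one way such arguments could be parsed with the defaults listed above; `parse_args` and `EPOCHS_PER_LEVEL` are hypothetical names, and this is not the repository's actual parsing code.

```python
# Default number of training epochs per level, as listed in this README.
EPOCHS_PER_LEVEL = {5: 8, 4: 8, 3: 17, 2: 17, 1: 25, 0: 50}

# How each recognized key is converted from its string value.
CONVERTERS = {
    "level": int,
    "batch_size": int,
    "learning_rate": float,
    "restore_epoch": int,
    "num_train_epochs": int,
    "dataset_dir": str,
}


def parse_args(argv):
    """Parse key=value arguments into a dict of training options."""
    options = {
        "level": None,
        "batch_size": 50,
        "learning_rate": 5e-5,
        "restore_epoch": None,
        "num_train_epochs": None,
        "dataset_dir": "raw_images/",
    }
    for arg in argv:
        key, sep, value = arg.partition("=")
        if not sep or key not in CONVERTERS:
            raise ValueError(f"unrecognized argument: {arg!r}")
        options[key] = CONVERTERS[key](value)
    if options["level"] not in EPOCHS_PER_LEVEL:
        raise ValueError("level is obligatory and must be one of 5, 4, 3, 2, 1, 0")
    if options["num_train_epochs"] is None:
        # Fall back to the per-level default when no epoch count is given.
        options["num_train_epochs"] = EPOCHS_PER_LEVEL[options["level"]]
    return options


options = parse_args(["level=3", "batch_size=32"])
```

In a script, the same function would be called as `parse_args(sys.argv[1:])`.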

Below we provide the commands used for training the model on four Nvidia Tesla V100 GPUs, each one with 16GB of RAM. When using GPUs with a smaller total amount of memory, the batch size should be adjusted accordingly:

python train_model.py level=5 batch_size=50 num_train_epochs=8
python train_model.py level=4 batch_size=50 num_train_epochs=8
python train_model.py level=3 batch_size=50 num_train_epochs=17
python train_model.py level=2 batch_size=50 num_train_epochs=17
python train_model.py level=1 batch_size=16 num_train_epochs=25
python train_model.py level=0 batch_size=12 num_train_epochs=50
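When less GPU memory is available, the six commands can be regenerated with smaller batch sizes. The sketch below assumes that batch size can be scaled roughly linearly with total GPU memory, which is a rule of thumb and not something this README guarantees. The 64 GB reference corresponds to the four 16 GB V100 GPUs mentioned above, and `scaled_commands` is a hypothetical helper.

```python
# (level, batch_size, num_train_epochs) as used in the commands above.
SCHEDULE = [(5, 50, 8), (4, 50, 8), (3, 50, 17), (2, 50, 17), (1, 16, 25), (0, 12, 50)]
REFERENCE_MEMORY_GB = 4 * 16  # four V100 GPUs with 16 GB each


def scaled_commands(total_memory_gb):
    """Build training commands with batch sizes scaled to the available memory."""
    # Never scale above the reference batch sizes.
    ratio = min(1.0, total_memory_gb / REFERENCE_MEMORY_GB)
    commands = []
    for level, batch_size, epochs in SCHEDULE:
        scaled = max(1, int(batch_size * ratio))
        commands.append(
            f"python train_model.py level={level} "
            f"batch_size={scaled} num_train_epochs={epochs}"
        )
    return commands


for command in scaled_commands(total_memory_gb=16):
    print(command)
```

Keep in mind the note above that small batch sizes can lead to unstable training, so heavily scaled-down values may need further tuning.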

6. Test the provided pre-trained models on full-resolution RAW image files

python test_model.py level=0 orig=true

Optional parameters:

use_gpu: true,false - run the model on GPU or CPU
dataset_dir: raw_images/ - path to the folder with Zurich RAW to RGB dataset

7. Test the obtained model on full-resolution RAW image files

python test_model.py level=<level>

Obligatory parameters:

level: 5, 4, 3, 2, 1, 0

Optional parameters:

restore_epoch: None - epoch to restore (when not specified, the last saved model for level=<level> is loaded)
use_gpu: true,false - run the model on GPU or CPU
dataset_dir: raw_images/ - path to the folder with Zurich RAW to RGB dataset

8. Folder structure

models/ - logs and models that are saved during the training process
models/original/ - the folder with the provided pre-trained PyNET model
raw_images/ - the folder with Zurich RAW to RGB dataset
results/ - visual image results saved while training
results/full-resolution/ - visual results for full-resolution RAW image data saved during the testing

load_data.py - python script that loads training data
model.py - PyNET implementation (PyTorch)
train_model.py - implementation of the training procedure
test_model.py - applying the pre-trained model to full-resolution test images
utils.py - auxiliary functions
vgg.py - loading the pre-trained VGG-19 network

9. Bonus files

These files can be useful for further experiments with the model / dataset:

dng_to_png.py - convert raw DNG camera files to PyNET's input format
evaluate_accuracy.py - compute PSNR and MS-SSIM scores on Zurich RAW-to-RGB dataset for your own model
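For reference, PSNR is derived from the mean squared error between the generated and the target image. The pure-Python sketch below works on flat lists of pixel values in [0, 1]; it only illustrates the formula and is not the code in evaluate_accuracy.py.

```python
import math


def psnr(predicted, target, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(predicted)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Every pixel is off by 0.1, so the MSE is 0.01.
score = psnr([0.5, 0.5, 0.5, 0.5], [0.6, 0.4, 0.6, 0.4])
```

Higher values indicate a closer match to the target image.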

10. License

Licensed under the CC BY-NC-SA 4.0 (Attribution-NonCommercial-ShareAlike 4.0 International).

The code is released for academic research use only.

11. Citation

@article{ignatov2020replacing,
  title={Replacing Mobile Camera ISP with a Single Deep Learning Model},
  author={Ignatov, Andrey and Van Gool, Luc and Timofte, Radu},
  journal={arXiv preprint arXiv:2002.05509},
  year={2020}
}

12. Any further questions?

Please contact Andrey Ignatov ([email protected]) for more information
