RGDiffSR

The official PyTorch implementation of the paper "Recognition-Guided Diffusion Model for Scene Text Image Super-Resolution".

Installation

Environment preparation (Python 3.8, PyTorch 1.7.0, Torchvision 0.8.1, pytorch_lightning 1.5.10, CUDA 11.0):

conda create -n RGDiffSR python=3.8
conda activate RGDiffSR
git clone git@github.com:shercoo/RGDiffSR.git
cd RGDiffSR
pip install -r requirements.txt

See the taming-transformers repository for instructions on installing the taming-transformers library (needed if VQGAN is used).
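After installing, it can help to confirm that the environment matches the versions listed in this section. The following is a minimal sketch, not part of this repository: the helper names `installed_version` and `check_environment` are hypothetical, and it assumes the packages are published under the distribution names `torch`, `torchvision`, and `pytorch-lightning`. It does not check the CUDA version.

```python
import sys
from importlib import metadata

# Versions taken from the environment line in this README.
EXPECTED = {
    "torch": "1.7.0",
    "torchvision": "0.8.1",
    "pytorch-lightning": "1.5.10",
}


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def check_environment(expected=EXPECTED, lookup=installed_version,
                      python_version=None):
    """Return a list of mismatch messages (empty if everything matches)."""
    if python_version is None:
        python_version = tuple(sys.version_info[:2])
    problems = []
    if tuple(python_version) != (3, 8):
        found = ".".join(str(part) for part in python_version)
        problems.append(f"Python {found} found, expected 3.8")
    for name, want in expected.items():
        got = lookup(name)
        if got is None:
            problems.append(f"{name} is not installed (expected {want})")
        elif got.split("+")[0] != want:
            # Drop a local build suffix such as "+cu110" before comparing.
            problems.append(f"{name} {got} found, expected {want}")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["Environment matches the README."]:
        print(line)
```

A mismatch does not necessarily mean the code will fail; the listed versions are only the ones this README names.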

Dataset preparation

Download the TextZoom dataset from the TextZoom repository.

Model checkpoints

Download the pre-trained recognizers: Aster, Moran, and CRNN.

Download the checkpoints of pre-trained VQGAN and RGDiffSR at Baidu Netdisk. Password: yws3

Training

First train the latent encoder (VQGAN) model.

CUDA_VISIBLE_DEVICES=<GPU_IDs> python main.py -b configs/autoencoder/vqgan_2x.yaml -t --gpus <GPU_IDs>

Put the pre-trained VQGAN model in checkpoints/, then train the RGDiffSR model.

CUDA_VISIBLE_DEVICES=<GPU_IDs> python main.py -b configs/latent-diffusion/sr_best.yaml -t --gpus <GPU_IDs>
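If you prefer to launch the two training stages from a script, the commands above can be assembled programmatically. This is only a sketch: `build_command` and `build_env` are hypothetical helpers, not part of this repository, and the GPU arguments are passed through unchanged, so their exact format is whatever `main.py` expects.

```python
import os
import subprocess


def build_command(config, gpus, script="main.py", train=True):
    """Build the argument list for one run, mirroring the README commands."""
    cmd = ["python", script, "-b", config]
    if train:
        cmd.append("-t")
    cmd += ["--gpus", gpus]
    return cmd


def build_env(visible_devices, base_env=None):
    """Return a copy of the environment with CUDA_VISIBLE_DEVICES set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = visible_devices
    return env


def run_stage(config, visible_devices, gpus):
    """Run one training stage and raise if it exits with an error."""
    subprocess.run(build_command(config, gpus),
                   env=build_env(visible_devices), check=True)


# Example (not executed here). Stage 2 should only be started once the
# trained VQGAN model has been placed in checkpoints/:
#   run_stage("configs/autoencoder/vqgan_2x.yaml", "<GPU_IDs>", "<GPU_IDs>")
#   run_stage("configs/latent-diffusion/sr_best.yaml", "<GPU_IDs>", "<GPU_IDs>")
```

Keeping the stages as separate calls leaves room for the manual step of copying the VQGAN checkpoint between them.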

Testing

Put the pre-trained RGDiffSR model in checkpoints/.

CUDA_VISIBLE_DEVICES=<GPU_IDs> python test.py -b configs/latent-diffusion/sr_test.yaml --gpus <GPU_IDs>

To test on a different difficulty subset of TextZoom (easy, medium, or hard), edit the test dataset directory in sr_test.yaml.
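Instead of editing sr_test.yaml by hand for each subset, you could generate one config copy per subset. The sketch below does this with plain text replacement, so it makes no assumption about the key names inside the file. The helper names and the directory strings in the usage comment (`test/easy`, `test/hard`) are hypothetical; check them against the actual contents of your sr_test.yaml.

```python
from pathlib import Path


def replace_dataset_dir(config_text, old_dir, new_dir):
    """Return config_text with every occurrence of old_dir replaced by new_dir."""
    if old_dir not in config_text:
        raise ValueError(f"{old_dir!r} does not appear in the config")
    return config_text.replace(old_dir, new_dir)


def write_variant(src, dst, old_dir, new_dir):
    """Write a copy of the config at src to dst with the dataset directory swapped."""
    text = Path(src).read_text(encoding="utf-8")
    Path(dst).write_text(replace_dataset_dir(text, old_dir, new_dir),
                         encoding="utf-8")
    return Path(dst)


# Example (paths are assumptions, not taken from the repository):
#   write_variant("configs/latent-diffusion/sr_test.yaml",
#                 "configs/latent-diffusion/sr_test_hard.yaml",
#                 "test/easy", "test/hard")
# The new file can then be passed to test.py with -b.
```

Because the replacement is purely textual, it changes every occurrence of the old string, so choose a directory fragment specific enough to match only the dataset path.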

License

The model is licensed under the MIT license.

Acknowledgement

Our code is built on the latent-diffusion and TATT repositories. Thanks to their authors for their work!
