Structure-aware Layout Generation

Team 9: Bekzat Tilekbay, Shyngys Aitkazinov

Installation

  1. Clone this repository

    git clone https://github.com/fesiib/cs492i-layout-generation.git
    cd cs492i-layout-generation
  2. Create a new conda environment (Python 3.8)

    conda env create -f environment.yml
    conda activate layout-generation
  3. Change the directories in the train and test files of each src_* directory to match your setup. We assume that all the pretrained models are in the results folder. Avoid using the prefix trial_, as anything with that prefix might get deleted during training.

Development environment

  • Ubuntu 18.04, CUDA 11.3

Test

Open one of the src_* directories and run test.ipynb.

Train

Let SRC be one of src_lstm or src_transformer, and TRAIN be one of the train_*.py scripts:

python SRC/train TRAIN

Checkpoints and their metavariables are saved in the ./results folder.

Models

| Models | Epochs | Link | Comments |
| --- | --- | --- | --- |
| LSTM-GAN | 329 | Drive | |
| Transformer-GAN | 249 | Drive | Requires LayoutGAN++ |
| Transformer-MSE | 249 | Drive | |
| LayoutGAN++ | 499 | Drive | |

Transformer-GAN is adapted from LayoutGAN++ [3] and uses the pretrained, frozen LayoutGAN++ model provided above.

Dataset

The dataset is located in ./data/bbs/ in .csv format.

It was generated from the DOC2PPT [1] dataset using the FitVid layout detection model (a fine-tuned CenterNet [2]).

The structure is as follows:

Slide Deck Id,Slide Id,Image Height,Image Width,Type,X,Y,BB Width,BB Height
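
As a rough illustration of working with this format, the sketch below groups bounding boxes by slide and scales them by the image size. It is not the loader used in this repository: the function name `load_layouts`, the `has_header` flag, and the sample rows (including the Type values) are made up for the example, and we assume X, Y, BB Width and BB Height are in pixels of the given image.

```python
import csv
import io
from collections import defaultdict

# Column order taken from the structure line above.
COLUMNS = ["Slide Deck Id", "Slide Id", "Image Height", "Image Width",
           "Type", "X", "Y", "BB Width", "BB Height"]


def load_layouts(csv_file, has_header=True):
    """Group boxes by (deck id, slide id), scaled by image width/height.

    Hypothetical helper: `has_header` assumes the file may start with
    the column names; set it to False if it does not.
    """
    reader = csv.reader(csv_file)
    if has_header:
        next(reader, None)  # skip the column-name row
    slides = defaultdict(list)
    for row in reader:
        if not row:
            continue  # ignore blank lines
        rec = dict(zip(COLUMNS, row))
        height = float(rec["Image Height"])
        width = float(rec["Image Width"])
        slides[(rec["Slide Deck Id"], rec["Slide Id"])].append({
            "type": rec["Type"],
            "x": float(rec["X"]) / width,
            "y": float(rec["Y"]) / height,
            "w": float(rec["BB Width"]) / width,
            "h": float(rec["BB Height"]) / height,
        })
    return dict(slides)


# Invented sample rows, only to show the expected shape of the data.
sample = io.StringIO(
    "Slide Deck Id,Slide Id,Image Height,Image Width,Type,X,Y,BB Width,BB Height\n"
    "0,0,100,200,title,20,10,100,20\n"
    "0,0,100,200,text,20,40,160,50\n"
    "0,1,100,200,figure,50,25,100,50\n"
)
layouts = load_layouts(sample)
```

For a real file, pass an open file handle instead of the `io.StringIO` sample, for example `open(path, newline="")` with a path under ./data/bbs/.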

Results

Quantitative Results

| Models | mIOU | Accuracy (MSE) | Overlap |
| --- | --- | --- | --- |
| LSTM-GAN | 0.0304 | 0.0352 | 0.3579 |
| Transformer-GAN | 0.0098 | 0.2422 | 1.4003 |
| Transformer-MSE | 0.0798 | 0.0151 | 1.0448 |

Overlap in the actual dataset: 0.1700.
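
This README does not spell out how the metrics are computed, so the following is only a sketch of the underlying box arithmetic, not the evaluation code from this repository. It assumes boxes are `(x, y, w, h)` tuples with `(x, y)` as the top-left corner. `iou` is the standard intersection-over-union of two boxes; `overlap` is one plausible definition (for each element, the intersection area with every other element divided by its own area, averaged over elements), and the actual implementation may differ.

```python
def intersection_area(a, b):
    """Area shared by two (x, y, w, h) boxes; 0.0 if they do not intersect."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    return inter_w * inter_h


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    inter = intersection_area(a, b)
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def overlap(boxes):
    """Assumed overlap score: mean over elements of the summed
    intersection with all other elements, relative to the element's area."""
    if not boxes:
        return 0.0
    total = 0.0
    for i, box in enumerate(boxes):
        area = box[2] * box[3]
        if area <= 0:
            continue  # degenerate boxes contribute nothing
        shared = sum(intersection_area(box, other)
                     for j, other in enumerate(boxes) if j != i)
        total += shared / area
    return total / len(boxes)


# Two 2x2 boxes sharing a 1x1 corner.
box_a = (0.0, 0.0, 2.0, 2.0)
box_b = (1.0, 1.0, 2.0, 2.0)
pair_iou = iou(box_a, box_b)
pair_overlap = overlap([box_a, box_b])
```

Under this assumed definition the score is a sum of ratios, so it is not bounded by 1 when an element intersects several others.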

Qualitative Results

[Images: transformer-mse, transformer-gan, lstm]

References

[1] DOC2PPT: Automatic Presentation Slides Generation from Scientific Documents, Tsu-Jui Fu, William Yang Wang, Daniel McDuff, Yale Song, 2021

[2] CenterNet: Keypoint Triplets for Object Detection, Kaiwen Duan, Song Bai, Lingxi Xie, Honggang Qi, Qingming Huang, Qi Tian, 2019

[3] Constrained Graphic Layout Generation via Latent Optimization, Kotaro Kikuchi, Edgar Simo-Serra, Mayu Otani, Kota Yamaguchi, 2021