- Image-to-Image Translation: Methods and Applications (2021)
[paper]
- IS (2016)
Proposed as an evaluation metric for GANs. An ideal generative model should produce samples that are both distinct (each sample is classified with confidence) and diverse (samples cover many classes). IS is calculated as the exponential of the expected KL divergence between the conditional label distribution p(y|x) and the marginal label distribution p(y). IS does not penalize a lack of intra-class diversity. Using at least 50k samples is advised.
[paper]
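A minimal sketch of the IS computation, assuming the classifier's softmax outputs are already available as an `(N, C)` array. The function name `inception_score` and the toy inputs are illustrative only; a real evaluation would feed Inception predictions for generated images.

```python
import numpy as np

def inception_score(probs, eps=1e-12):
    """probs: (N, C) array of softmax outputs p(y|x), one row per generated image."""
    probs = np.asarray(probs, dtype=np.float64)
    marginal = probs.mean(axis=0, keepdims=True)  # marginal label distribution p(y)
    # KL(p(y|x) || p(y)) for every sample, then exponentiate the mean
    kl = np.sum(probs * (np.log(probs + eps) - np.log(marginal + eps)), axis=1)
    return float(np.exp(kl.mean()))

# Toy case 1: every sample is confidently classified and all 4 classes are covered.
score_diverse = inception_score(np.eye(4))
# Toy case 2: every sample is confidently classified as the same class.
score_collapsed = inception_score(np.tile([1.0, 0.0, 0.0, 0.0], (4, 1)))
print(score_diverse, score_collapsed)
```

The first toy case scores close to the number of classes and the second close to 1. Because only label distributions enter the formula, a model that emits a single image per class would still score well, which is the intra-class blind spot noted above.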
- FID (2017)
Proposed as a replacement for IS for evaluating generative models. Compares the statistics (mean and covariance) of Inception_v3 activations for real and generated images. Using more than 10k samples is advised for stability.
[paper] [github]
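A minimal sketch of the Fréchet distance between two feature sets, assuming the Inception_v3 activations have already been extracted. The random arrays below only stand in for those activations, and `frechet_distance` is an illustrative name, not the reference implementation.

```python
import numpy as np

def frechet_distance(feats_real, feats_fake):
    """Frechet distance between Gaussians fitted to two (N, D) feature arrays, D >= 2."""
    feats_real = np.asarray(feats_real, dtype=np.float64)
    feats_fake = np.asarray(feats_fake, dtype=np.float64)
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    # Tr((cov_r cov_f)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_f, which are real and non-negative in theory.
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    tr_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    mean_term = np.sum((mu_r - mu_f) ** 2)
    return float(mean_term + np.trace(cov_r) + np.trace(cov_f) - 2.0 * tr_sqrt)

rng = np.random.default_rng(0)
real = rng.normal(size=(2000, 8))   # placeholder for real-image activations
shifted = real + 3.0                # same covariance, mean moved by 3 in each of 8 dims

fid_same = frechet_distance(real, real)
fid_shifted = frechet_distance(real, shifted)
print(fid_same, fid_shifted)
```

Identical feature sets give a distance near 0, while the shifted copy differs only in the mean term (8 dimensions times 3 squared).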
Trade-off between IS and FID? paper ~ truncation
Doubts about IS and FID: paper ~ memorization problem
It looks like there is no perfect metric for generative models yet.
- Precision/Recall (2018/2019)
Proposed to complement IS and FID. Both of those summarize a generative model as a single scalar, which is not ideal for measuring 'Fidelity' and 'Diversity' separately.
[paper] [paper2]
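A minimal sketch of a k-nearest-neighbour formulation of precision and recall, assuming both sets are already embedded in a feature space. The helper names and the toy 2-D points are illustrative; the toy "generator" covers only one of two real clusters.

```python
import numpy as np

def pairwise_dist(a, b):
    """Euclidean distances between rows of a (N, D) and rows of b (M, D)."""
    return np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)

def knn_radii(x, k):
    """Distance from each point to its k-th nearest neighbour within x."""
    d = pairwise_dist(x, x)
    # After sorting, column 0 is the zero distance to the point itself.
    return np.sort(d, axis=1)[:, k]

def manifold_coverage(queries, support, k):
    """Fraction of queries that fall inside at least one k-NN ball around a support point."""
    radii = knn_radii(support, k)
    d = pairwise_dist(queries, support)
    return float(np.mean((d <= radii[None, :]).any(axis=1)))

def precision_recall(real, fake, k=3):
    precision = manifold_coverage(fake, real, k)  # fidelity: fakes on the real manifold
    recall = manifold_coverage(real, fake, k)     # diversity: reals on the fake manifold
    return precision, recall

# Two real clusters along a line; fake samples only cover the first one.
real = np.array([[i, 0.0] for i in range(10)] + [[i + 100.0, 0.0] for i in range(10)])
fake = np.array([[i + 0.5, 0.0] for i in range(9)])
precision, recall = precision_recall(real, fake, k=3)
print(precision, recall)
```

Here every fake point lies near real data, so precision is high, while half of the real points have no fake neighbour, so recall drops. A single scalar would blur these two effects together.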
- Density and Coverage (2020)
Measures the distance between real images and generated images by introducing a manifold estimation procedure.
[paper]
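A minimal sketch of density and coverage, assuming the manifold is estimated with k-nearest-neighbour balls around real samples in some feature space. The function name and the toy points are illustrative only.

```python
import numpy as np

def density_coverage(real, fake, k=5):
    """real: (N, D) and fake: (M, D) feature arrays; returns (density, coverage)."""
    real = np.asarray(real, dtype=np.float64)
    fake = np.asarray(fake, dtype=np.float64)
    d_rr = np.linalg.norm(real[:, None, :] - real[None, :, :], axis=-1)
    # Radius of each real ball = distance to its k-th nearest real neighbour
    # (index 0 after sorting is the point itself).
    radii = np.sort(d_rr, axis=1)[:, k]
    d_rf = np.linalg.norm(real[:, None, :] - fake[None, :, :], axis=-1)  # (N, M)
    inside = d_rf <= radii[:, None]  # inside[i, j]: fake j lies in the ball of real i
    density = inside.sum() / (k * fake.shape[0])  # average ball count per fake, scaled by 1/k
    coverage = inside.any(axis=1).mean()          # fraction of real balls holding a fake
    return float(density), float(coverage)

# Two real clusters along a line; fake samples only cover the first one.
real = np.array([[i, 0.0] for i in range(10)] + [[i + 100.0, 0.0] for i in range(10)])
fake = np.array([[i + 0.5, 0.0] for i in range(9)])
density, coverage = density_coverage(real, fake, k=1)
print(density, coverage)
```

Density counts how many real balls contain each fake sample, so it is not capped at 1. Coverage is the fraction of real samples with at least one fake sample in their neighbourhood, which falls to one half here because the second cluster is never generated.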
- LPIPS (2018)
Proposed for assessing the perceptual similarity of two images using the internal feature embeddings of a deep neural network. [paper] [github]
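A rough sketch of how such a feature-space distance can be computed once per-layer activations are available. Random arrays stand in for network activations and uniform channel weights stand in for learned ones, so both are assumptions; the function name `lpips_like_distance` is hypothetical and this is not the released model.

```python
import numpy as np

def lpips_like_distance(feats_a, feats_b, channel_weights, eps=1e-10):
    """feats_*: lists of (C, H, W) activation maps, one per layer.
    channel_weights: list of non-negative (C,) arrays, one per layer."""
    total = 0.0
    for fa, fb, w in zip(feats_a, feats_b, channel_weights):
        # Unit-normalise each spatial position along the channel axis.
        fa = fa / (np.linalg.norm(fa, axis=0, keepdims=True) + eps)
        fb = fb / (np.linalg.norm(fb, axis=0, keepdims=True) + eps)
        sq_diff = (fa - fb) ** 2
        # Weighted sum over channels gives an (H, W) map; average it spatially.
        per_position = np.tensordot(w, sq_diff, axes=([0], [0]))
        total += per_position.mean()
    return float(total)

rng = np.random.default_rng(0)
shapes = [(4, 8, 8), (6, 4, 4)]                      # two hypothetical layers
feats_a = [rng.normal(size=s) for s in shapes]       # placeholder activations, image A
feats_b = [rng.normal(size=s) for s in shapes]       # placeholder activations, image B
weights = [np.ones(s[0]) for s in shapes]            # placeholder channel weights

d_same = lpips_like_distance(feats_a, feats_a, weights)
d_ab = lpips_like_distance(feats_a, feats_b, weights)
d_ba = lpips_like_distance(feats_b, feats_a, weights)
print(d_same, d_ab)
```

The distance is zero for identical activations and symmetric in its two inputs. In practice the activations would come from a pretrained network and the channel weights would be learned from human similarity judgments.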
- Open Images
The Open Images dataset includes 9M images annotated with 36M image-level labels, 15.8M bounding boxes, 2.8M instance segmentations, and 391k visual relationships.
- Places
The Places dataset is proposed for scene recognition and contains more than 2.5 million images covering more than 205 scene categories with more than 5,000 images per category.