Knowledge Vault 5/29 - CVPR 2017
Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network
Christian Ledig, Lucas Theis, Ferenc Huszár, Jose Caballero, Andrew Cunningham, Alejandro Acosta, Andrew Aitken, Alykhan Tejani, Johannes Totz, Zehan Wang, Wenzhe Shi
< Resume Image >

Concept Graph & Resume using Claude 3 Opus | ChatGPT-4o | Llama 3:

graph LR
  classDef superresolution fill:#f9d4d4, font-weight:bold, font-size:14px
  classDef learning fill:#d4f9d4, font-weight:bold, font-size:14px
  classDef applications fill:#d4d4f9, font-weight:bold, font-size:14px
  classDef srresnet fill:#f9f9d4, font-weight:bold, font-size:14px
  classDef srgan fill:#f9d4f9, font-weight:bold, font-size:14px
  classDef evaluation fill:#d4f9f9, font-weight:bold, font-size:14px
  classDef limitations fill:#f9d4d4, font-weight:bold, font-size:14px
  classDef future fill:#d4f9d4, font-weight:bold, font-size:14px
  A[Photo-Realistic Single Image<br>Super-Resolution Using a<br>Generative Adversarial Network] --> B[Photo-realistic single image super-resolution 1]
  B --> C[Increase resolution, add texture 2]
  B --> D[Applications: satellite, media, medical, surveillance 3]
  A --> E[SRResNet: deep residual CNN 4]
  E --> F[Trained on ImageNet 5]
  E --> G[Deeper network, residual blocks, skip connections 6]
  E --> H[Efficient sub-pixel convolution upscaling 7]
  E --> I[More residual blocks improve PSNR 8]
  E --> J[Improves bicubic but lacks perceptual quality 9]
  E --> K[Regression to mean with MSE loss 10]
  A --> L[SRGAN: GAN-based approach 11]
  L --> M[Related work: feature, adversarial losses 12]
  L --> N[Perceptual loss: MSE in VGG space 13]
  L --> O[Adversarial loss: discriminator distinguishes real/fake 14]
  O --> P[Generator fools discriminator, reconstructs details 15]
  O --> Q[Discriminator based on modified VGG 16]
  O --> R[Minimax optimization of cross-entropy 17]
  L --> S[Adversarial loss preserves manifold 18]
  L --> T[Content loss allows texture freedom 19]
  L --> U[SRGAN adds texture, convincing results 20]
  A --> V[Evaluation using MOS test 21]
  V --> W[SRGAN outperforms in MOS 22]
  V --> X[SRResNet excels in PSNR, SRGAN in perception 23]
  V --> Y[SRGAN performs well at higher scales 24]
  A --> Z[Limitations: text, numbers difficult 25]
  Z --> AA[Importance of diverse training data 26]
  A --> AB[Interest in improved GAN training 27]
  A --> AC[Need for perceptual quality metrics 28]
  A --> AD[Acknowledgments to co-authors 29]
  A --> AE[Invitation to poster session 30]
  class A,B,C superresolution
  class E,F,G,H,I,J,K srresnet
  class L,M,N,O,P,Q,R,S,T,U srgan
  class D applications
  class V,W,X,Y evaluation
  class Z,AA limitations
  class AB,AC future

Resume:

1.- Photo-realistic single image super-resolution using deep learning

2.- Increase spatial resolution and add fine texture detail

3.- Applications in satellite imaging, media content, medical imaging, surveillance

4.- SRResNet: deep residual CNN optimized for PSNR

5.- Trained on 350,000 ImageNet images

6.- Deeper network with identical residual blocks and skip connections

7.- Efficient sub-pixel convolution for upscaling (see the block sketch after this list)

8.- More residual blocks improve PSNR

9.- SRResNet improves upon bicubic interpolation but lacks perceptual quality

10.- Pixel-wise MSE loss causes regression to the mean: overly smooth, averaged textures

11.- SRGAN: GAN-based approach to overcome limitations of MSE

12.- Related work: feature space losses and adversarial losses

13.- Perceptual loss functions: MSE in VGG feature space (see the loss sketch after this list)

14.- Adversarial loss: discriminator network to distinguish real/fake images

15.- Generator trained to fool discriminator by reconstructing realistic details (training losses sketched after this list)

16.- Discriminator architecture based on VGG with modifications

17.- Minimax optimization of cross-entropy loss (written out after this list)

18.- Adversarial loss pulls reconstructions back to natural image manifold

19.- Content loss in feature space allows more freedom for texture details

20.- SRGAN adds fine texture details, perceptually convincing results

21.- Evaluation using Mean Opinion Score (MOS) test with human raters

22.- SRGAN outperforms SRResNet and reference methods in MOS

23.- SRResNet excels in PSNR, but SRGAN provides superior perceptual quality

24.- SRGAN performs well for higher upscaling factors (8x, 16x)

25.- Limitations: Difficulty reconstructing text and numbers

26.- Importance of training data diversity

27.- Interest in improved GAN training techniques

28.- Need for better objective functions capturing perceptual quality

29.- Acknowledgments to co-authors, particularly Wenzhe Shi

30.- Invitation to poster session for further discussion
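
To make items 6 and 7 concrete, here is a minimal PyTorch-style sketch (PyTorch is my assumption; the talk does not name a framework) of the two generator building blocks: a residual block with two 3x3 convolutions, batch normalization, PReLU and an identity skip, and a sub-pixel convolution upscaler that rearranges learned channels into a 2x larger feature map. Layer ordering and channel counts follow the paper's architecture figure; treat it as an illustrative sketch, not the authors' implementation.

import torch.nn as nn

class ResidualBlock(nn.Module):
    """SRResNet-style residual block: conv-BN-PReLU-conv-BN plus an identity skip connection."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
            nn.PReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(channels),
        )

    def forward(self, x):
        return x + self.body(x)  # skip connection keeps gradients flowing through deep stacks

class SubPixelUpscale(nn.Module):
    """Efficient sub-pixel convolution: learn 4*C channels, then PixelShuffle them into a 2x larger map."""
    def __init__(self, channels: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels * 4, kernel_size=3, padding=1),
            nn.PixelShuffle(2),  # (N, 4C, H, W) -> (N, C, 2H, 2W)
            nn.PReLU(),
        )

    def forward(self, x):
        return self.body(x)

Stacking more ResidualBlock modules is what item 8 refers to, and chaining two SubPixelUpscale modules gives 4x upscaling (more for the higher factors mentioned in item 24).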
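Items 10 and 13 contrast the two content objectives. Below is a hedged sketch of both in the same PyTorch setting: plain pixel-wise MSE, which maximizes PSNR but averages competing textures, and MSE computed in the feature space of a pretrained VGG19. Truncating at the activation of the last convolution (index 35 of torchvision's vgg19.features) and feeding ImageNet-normalized tensors are my assumptions; the paper compares several VGG layers.

import torch.nn.functional as F
from torchvision.models import vgg19

# Frozen, truncated VGG19 used purely as a fixed feature extractor.
# Inputs are expected to be ImageNet-normalized (N, 3, H, W) tensors.
vgg_features = vgg19(weights="IMAGENET1K_V1").features[:36].eval()
for p in vgg_features.parameters():
    p.requires_grad_(False)

def pixel_mse_loss(sr, hr):
    """Item 10: pixel-wise MSE. High PSNR, but plausible textures get averaged away."""
    return F.mse_loss(sr, hr)

def vgg_content_loss(sr, hr):
    """Item 13: MSE between VGG feature maps of the super-resolved and ground-truth images."""
    return F.mse_loss(vgg_features(sr), vgg_features(hr))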
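Item 17 is the standard GAN minimax game over the discriminator's cross-entropy; in the paper's notation it reads approximately as:

\min_{\theta_G} \max_{\theta_D} \;
\mathbb{E}_{I^{HR} \sim p_{\mathrm{train}}(I^{HR})}\!\left[\log D_{\theta_D}(I^{HR})\right]
+ \mathbb{E}_{I^{LR} \sim p_{G}(I^{LR})}\!\left[\log\left(1 - D_{\theta_D}\big(G_{\theta_G}(I^{LR})\big)\right)\right]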
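In training terms (items 14, 15 and 18), the discriminator minimizes binary cross-entropy on real-versus-generated labels, while the generator minimizes -log D(G(I^LR)) plus the content loss; the paper weights the adversarial term by 10^-3. The helper names below are illustrative, not taken from any released code.

import torch
import torch.nn as nn

bce = nn.BCEWithLogitsLoss()  # binary cross-entropy on raw (pre-sigmoid) discriminator outputs

def discriminator_loss(real_logits, fake_logits):
    """Discriminator learns to score ground-truth HR images as 1 and generated SR images as 0."""
    return (bce(real_logits, torch.ones_like(real_logits))
            + bce(fake_logits, torch.zeros_like(fake_logits)))

def generator_adversarial_loss(fake_logits):
    """Generator is rewarded when the discriminator believes its outputs are real: -log D(G(I_LR))."""
    return bce(fake_logits, torch.ones_like(fake_logits))

def generator_total_loss(content_loss, adversarial_loss, adv_weight=1e-3):
    """Perceptual objective: content term (pixel-wise or VGG) plus a lightly weighted adversarial term."""
    return content_loss + adv_weight * adversarial_loss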

Knowledge Vault built by David Vivancos 2024