Optimized GAN-Based Pipeline For High-Quality Face Restoration From CCTV Images
Team:
1. 1MS21AI056 - Tanishka Deep
2. 1MS21AI015 - Chitransh Srivastava
3. 1MS21AI011 - Siddharth Bhetariya
4. 1MS21AI054 - Sujal Prakash Singh
Guide:
Dr. Meeradevi
Associate Professor
Artificial Intelligence & Machine Learning
Agenda
➢ Abstract
➢ Introduction
➢ Objectives
➢ Problem Statement
➢ Current Methodology
➢ Proposed Methodology
➢ Literature Survey/Related Work
➢ Design
➢ System Architecture
➢ Implementation
➢ Results
➢ Conclusion
➢ References
16/05/2024 2
Abstract
The proposed multi-stage deep learning approach targets low-quality images. The first stage performs noise reduction and
image enhancement, using ESRGAN to suppress noise and improve overall picture quality.
Face recognition systems typically fail when faces are of poor quality, blurred, or noisy, which makes it difficult to tell
whether a face belongs to the individual under review, so identification and activity analysis systems may be ineffective.
Traditional algorithms still struggle with these conditions, which leads to missed opportunities in law enforcement,
security, and surveillance operations and reduces the effectiveness of video surveillance systems. Location, limited
capture angles, and failure to reproduce identity traits are the major flaws of this method.
The current methodology uses methods such as Nearest Neighbour Interpolation, DFDNet, and GFPGAN to enhance facial
details in surveillance footage, restoring high-quality facial features from low-resolution images through a three-step
process: first, preprocessing removes noise and artifacts; next, face restoration applies the methods above to enhance
details; finally, the output images are generated, yielding clearer and improved images.
Keywords: Facial Reconstruction, Facial Recognition, Enhanced Super Resolution Generative Adversarial Network,
Progressive Growing Generative Adversarial Network, Image Enhancement.
Introduction
• The project aims to improve facial recognition on poor-quality, low-resolution
CCTV images.
• By utilizing cutting-edge models like StyleGAN and Enhanced Super-Resolution GAN
(ESRGAN), the framework systematically improves image quality and reconstructs facial
features.
• This effort addresses the urgent need for dependable surveillance systems, particularly in
cases where existing facial recognition technologies struggle with low-quality footage.
• The project seeks to set innovative benchmarks in facial recognition technology,
contributing to more reliable and precise surveillance solutions for critical purposes.
Objectives
● Building a novel framework that uses deep learning models such as ESRGAN and StyleGAN for each
facial reconstruction step.
● Enhancing clarity so that the reconstructed faces are significantly more detailed and clearer than
the original input.
● Designing and implementing a user-friendly workflow to facilitate easy interaction with the
system.
● Conducting a comparative analysis of the proposed framework against existing facial
reconstruction approaches to evaluate its performance.
Problem Statement
Problem Description:
● Facial recognition systems rely on clear visuals but struggle with poor image quality, pixelation, and blur.
● Traditional algorithms fail to adapt, hindering identification and activity analysis in CCTV footage.
● This limitation leads to missed opportunities in law enforcement, security, and surveillance operations, reducing
the effectiveness of video surveillance systems.
Current Methodology:
● Methods such as GFPGAN, DFDNet, and Nearest Neighbour Interpolation can be used to enhance facial details in
surveillance footage. They restore high-quality facial features from low-resolution images through:
○ Image Preprocessing: Removing noise and artifacts.
○ Face Restoration: Using mentioned methods to restore and enhance details.
○ Output Generation: Producing clearer, improved images.
Limitations: Current methods struggle with severely degraded images and may preserve identity details inconsistently.
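As a point of reference for the simplest baseline named above, nearest neighbour interpolation can be sketched in a few lines. This is a minimal, standard-library illustration on a tiny grayscale image stored as nested lists; the function name and the sample values are hypothetical and are not taken from the project code.

```python
def nearest_neighbour_upscale(image, scale):
    """Upscale a 2-D grayscale image (list of rows) by an integer factor.

    Every output pixel copies the source pixel it maps back to, so no new
    detail is created; edges simply become blockier.
    """
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    height, width = len(image), len(image[0])
    return [
        [image[y // scale][x // scale] for x in range(width * scale)]
        for y in range(height * scale)
    ]


# Hypothetical 2x2 input, upscaled by a factor of 2.
low_res = [[10, 200],
           [90, 40]]
upscaled = nearest_neighbour_upscale(low_res, 2)
for row in upscaled:
    print(row)
```

Because each source pixel is only replicated, this baseline cannot recover facial detail that is missing from the input, which is consistent with the limitation stated above.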
Proposed Method
Our proposed method improves facial recognition by using ESRGAN and StyleGAN2 to enhance and reconstruct
faces and generate multiple angles. Steps include:
Methodology
● Stage 1: Denoising and Enhancement - Improving the quality of CCTV images that suffer from
low resolution, distortion, and noise by applying the Enhanced Super-Resolution GAN
(ESRGAN) model.
● Stage 2: Face Reconstruction - Using StyleGAN2 for facial reconstruction. StyleGAN2's fine-grained control
over facial attributes allows accurate reconstruction or refinement of specific features within the
enhanced frames.
Literature Survey
| Title, Author and Year | Technique used | Result | Remarks |
|---|---|---|---|
| Li, Wenbo, et al. "Best-buddy gans for highly detailed image super-resolution." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 36. No. 2. 2022. | Best-buddy GANs | Enhanced super-resolution performance for highly detailed images. | Novel approach with GANs targeting image details. |
| Shi, Yue, et al. "A latent encoder coupled generative adversarial network (le-gan) for efficient hyperspectral image super-resolution." IEEE Transactions on Geoscience and Remote Sensing 60 (2022): 1-19. | Latent Encoder Coupled GAN (LE-GAN) | Efficient hyperspectral image super-resolution. | Improved performance on complex hyperspectral data. |
| Li, Maomao, et al. "E4S: Fine-grained Face Swapping via Editing With Regional GAN Inversion." arXiv preprint arXiv:2310.15081 (2023). | Fine-grained Face Swapping with Regional GAN Inversion (E4S) | Advanced accuracy in fine-grained face swapping. | Focuses on high-detail and regional accuracy. |
| Shen, Ziyi, et al. "Exploiting semantics for face image deblurring." International Journal of Computer Vision 128.7 (2020): 1829-1846. | Semantic exploitation for deblurring | Improved face image clarity in deblurring tasks. | Uses semantic understanding to enhance results. |
| Abdal, Rameen, Yipeng Qin, and Peter Wonka. "Image2stylegan: How to embed images into the stylegan latent space?." Proceedings of the IEEE/CVF international conference on computer vision. 2020. | Embedding images into StyleGAN latent space | Demonstrated successful embedding for manipulation. | Breakthrough in image manipulation using StyleGAN. |
| Lu, Wanglong, et al. "Do Inpainting Yourself: Generative Facial Inpainting Guided by Exemplars." arXiv preprint arXiv:2202.06358 (2022). | Generative Facial Inpainting with Exemplar Guidance | Improved results in facial inpainting tasks. | Exemplar use enhances the inpainting quality. |
| Zhou, Shangchen, et al. "Towards robust blind face restoration with codebook lookup transformer." Advances in Neural Information Processing Systems 35 (2022): 30599-30611. | Codebook Lookup Transformer for blind face restoration | Increased robustness in blind face restoration. | Novel transformer-based approach for restoration. |
| Wang, Xintao, et al. "Towards real-world blind face restoration with generative facial prior." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2021. | Generative Facial Prior for Real-World Blind Face Restoration | Enhanced face restoration in practical applications. | Combines generative models with practical needs. |
| Lee, Dongyeun, et al. "Fix the noise: Disentangling source feature for transfer learning of StyleGAN." arXiv preprint arXiv:2204.14079 (2022). | Disentangling source features in StyleGAN for noise reduction | Effective noise reduction in StyleGAN-based transfer learning. | Addresses specific noise issues in transfer learning. |
| Jiang, Junjun, et al. "Deep learning-based face super-resolution: A survey." ACM Computing Surveys (CSUR) 55.1 (2021): 1-36. | Survey on deep learning for face super-resolution | Comprehensive overview and analysis of current techniques. | Valuable insights into trends and challenges in the field. |
| Demiray, Bekir Z., Muhammed Sit, and Ibrahim Demir. "D-SRGAN: DEM super-resolution with generative adversarial networks." SN Computer Science 2 (2021): 1-11. | D-SRGAN for DEM super-resolution | Enhanced resolution in digital elevation models (DEM). | Applies GANs to a novel area of geographical imaging. |
| Karras, Tero, Samuli Laine, and Timo Aila. "A style-based generator architecture for generative adversarial networks." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019. | Style-based generator architecture in GANs | Improved generative performance and flexibility. | Influential in the evolution of GAN architectures. |
| Mathai, Joe, Iacopo Masi, and Wael AbdAlmageed. "Does generative face completion help face recognition?." 2019 International Conference on Biometrics (ICB). IEEE, 2019. | Generative face completion for enhanced face recognition | Positive impact on face recognition accuracy. | Explores synergy between generation and recognition tasks. |
| Kalarot, Ratheesh, Tao Li, and Fatih Porikli. "Component attention guided face super-resolution network: Cagface." Proceedings of the IEEE/CVF winter conference on applications of computer vision. 2020. | Component Attention Guided Network (CAGFace) for face super-resolution | Improved super-resolution focusing on facial components. | Enhanced attention mechanism improves detail retrieval. |
| Brock, Andrew, Jeff Donahue, and Karen Simonyan. "Large scale GAN training for high fidelity natural image synthesis." arXiv preprint arXiv:1809.11096 (2020). | Large scale training of GANs for high fidelity image synthesis | Achieved high fidelity in natural image synthesis. | Pioneering work on scaling up GAN training. |
Design
Fig. 1: Design
3. Converting to Latent Vector Space with HyperStyle:
● Input: High-resolution image from ESRGAN.
● Image Embedding:
○ Utilize HyperStyle to embed the high-resolution image into StyleGAN's latent space.
● Latent Vector Generation:
○ HyperStyle processes the image to find the corresponding style vectors mapped to the input image.
● Output: Style vectors representing the facial features and attributes of the upscaled image.
Implementation
Dataset Description:
The dataset used for training the ESRGAN consists of high-quality, high-resolution images that serve as the ground
truth for training and testing.
Datasets Used:
• DIV2K Dataset: High-quality, high-resolution images used for training the ESRGAN model.
• Synthetic Face Dataset: Synthetically generated dataset used to enhance the training process and improve
model robustness.
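Since the high-resolution images act as ground truth, a super-resolution model needs a matching low-resolution input for each of them. The slides do not say how these inputs were produced, so the sketch below is an assumption: it derives the low-resolution image by averaging pixel blocks of the high-resolution one. The original ESRGAN setup uses bicubic downsampling; block averaging is swapped in here only to keep the example free of external libraries. The names `box_downsample` and `make_training_pair` are hypothetical.

```python
def box_downsample(image, factor):
    """Average each factor x factor block of a 2-D grayscale image."""
    height, width = len(image), len(image[0])
    if height % factor or width % factor:
        raise ValueError("image dimensions must be divisible by factor")
    result = []
    for y in range(0, height, factor):
        row = []
        for x in range(0, width, factor):
            block = [image[y + dy][x + dx]
                     for dy in range(factor) for dx in range(factor)]
            row.append(sum(block) / len(block))
        result.append(row)
    return result


def make_training_pair(hr_image, factor=4):
    """Return (low-resolution input, high-resolution ground truth)."""
    return box_downsample(hr_image, factor), hr_image


# Hypothetical 4x4 "high-resolution" image, downsampled by a factor of 2.
hr_image = [[0, 0, 100, 100],
            [0, 0, 100, 100],
            [50, 50, 200, 200],
            [50, 50, 200, 200]]
lr_image, ground_truth = make_training_pair(hr_image, factor=2)
print(lr_image)
```

During training, the model output for `lr_image` would be compared against `ground_truth`; a real pipeline would also add noise or compression to mimic CCTV degradation, which is not shown here.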
Super-Resolution Enhancement:
Pseudocode
Function PreprocessImage(image_path):
    image = ReadImage(image_path)
    preprocessed_image = BasicPreprocessing(image)
    Return preprocessed_image

Function InitializeESRGAN(weights_path):
    model = LoadESRGANModel()
    LoadPretrainedWeights(model, weights_path)
    Return model

Function UpscaleImage(model, image):
    high_res_image = model.Upscale(image)
    Return high_res_image

Function SampleLatentVector():
    z = SampleFromGaussian()
    w = TransformToIntermediateLatentSpace(z)
    Return w

Function EmbedImageIntoStyleGAN2WithHyperStyle(high_res_image):
    w = SampleLatentVector()
    optimized_w = HyperStyleOptimize(w, high_res_image)
    Return optimized_w

Function MainWorkflow(image_path, weights_path, attributes_to_change):
    # Step 1: Upscaling with ESRGAN
    preprocessed_image = PreprocessImage(image_path)
    esrgan_model = InitializeESRGAN(weights_path)
    high_res_image = UpscaleImage(esrgan_model, preprocessed_image)

    # Step 2: Embedding into StyleGAN2 Latent Space
    w = EmbedImageIntoStyleGAN2WithHyperStyle(high_res_image)
    final_image = ManipulateLatentVector(w, attributes_to_change)
    Return final_image
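The pseudocode calls a latent manipulation step without defining it. One common way to edit an attribute in a StyleGAN-style latent space is to move the latent vector along a direction associated with that attribute. The sketch below assumes this linear-edit approach; the slides do not confirm that the project uses it, and the function name, the three-dimensional toy vectors, and the `pose_direction` values are hypothetical. Decoding the edited vector back into an image would require the StyleGAN2 generator, which is outside this example.

```python
def manipulate_latent_vector(w, direction, strength):
    """Shift latent vector w along an attribute direction.

    Computes w + strength * direction element by element. A positive
    strength moves toward the attribute, a negative one away from it.
    """
    if len(w) != len(direction):
        raise ValueError("w and direction must have the same length")
    return [wi + strength * di for wi, di in zip(w, direction)]


# Toy example: real StyleGAN2 latents are much longer (typically 512 values).
w = [0.5, -1.0, 2.0]
pose_direction = [1.0, 0.0, -1.0]   # hypothetical "head pose" direction
edited_w = manipulate_latent_vector(w, pose_direction, strength=0.5)
print(edited_w)
```

Sweeping `strength` over a range of values and decoding each result is one way the multiple viewing angles mentioned in the proposed method could be generated, under the assumption above.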
Results
Fig. 4: Results (low-resolution input image)
Fig. 5: Results
Fig. 6: Results
Fig. 7: Results (proposed model)
Fig. 8: SSIM for Super-Resolved Images
Fig. 9: PSNR for Super-Resolved Images
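For readers unfamiliar with the two metrics plotted above, the sketch below shows how they are defined. PSNR follows the standard formula 10 * log10(MAX^2 / MSE). The SSIM function is a simplified, single-window variant computed over the whole image; standard SSIM averages the same expression over local windows, so its values will generally differ. How the slides' figures were actually computed is not stated, and the function names and sample images here are hypothetical.

```python
import math


def _flatten(image):
    return [float(p) for row in image for p in row]


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    ref, res = _flatten(reference), _flatten(restored)
    mse = sum((a - b) ** 2 for a, b in zip(ref, res)) / len(ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def global_ssim(reference, restored, max_value=255.0):
    """Simplified SSIM using one window that covers the whole image."""
    x, y = _flatten(reference), _flatten(restored)
    n = len(x)
    mu_x, mu_y = sum(x) / n, sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    c1 = (0.01 * max_value) ** 2  # stabilising constants from the SSIM definition
    c2 = (0.03 * max_value) ** 2
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    )


# Hypothetical ground truth and restored images.
ground_truth = [[100, 100], [100, 100]]
restored = [[110, 90], [110, 90]]
print(psnr(ground_truth, restored))
print(global_ssim(ground_truth, restored))
```

Higher values are better for both metrics; an exact reconstruction gives infinite PSNR and an SSIM of 1.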
References
[1] Li, Wenbo, et al. "Best-buddy gans for highly detailed image super-resolution." Proceedings of the AAAI Conference on Artificial Intelligence. Vol. 36. No. 2. 2022.
[2] Shi, Yue, et al. "A latent encoder coupled generative adversarial network (le-gan) for efficient hyperspectral image super-resolution." IEEE Transactions on Geoscience and Remote Sensing 60 (2022): 1-19.
[3] Li, Maomao, et al. "E4S: Fine-grained Face Swapping via Editing With Regional GAN Inversion." arXiv preprint arXiv:2310.15081 (2023).
[4] Shen, Ziyi, et al. "Exploiting semantics for face image deblurring." International Journal of Computer Vision 128.7 (2020): 1829-1846.
[5] Abdal, Rameen, Yipeng Qin, and Peter Wonka. "Image2stylegan: How to embed images into the stylegan latent space?." Proceedings of the IEEE/CVF international conference on computer vision. 2020.
[6] Lu, Wanglong, et al. "Do Inpainting Yourself: Generative Facial Inpainting Guided by Exemplars." arXiv preprint arXiv:2202.06358 (2022).
[7] Zhou, Shangchen, et al. "Towards robust blind face restoration with codebook lookup transformer." Advances in Neural Information Processing Systems 35 (2022): 30599-30611.
[8] Wang, Xintao, et al. "Towards real-world blind face restoration with generative facial prior." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2021.
[9] Lee, Dongyeun, et al. "Fix the noise: Disentangling source feature for transfer learning of StyleGAN." arXiv preprint arXiv:2204.14079 (2022).
[10] Jiang, Junjun, et al. "Deep learning-based face super-resolution: A survey." ACM Computing Surveys (CSUR) 55.1 (2021): 1-36.
[11] Demiray, Bekir Z., Muhammed Sit, and Ibrahim Demir. "D-SRGAN: DEM super-resolution with generative adversarial networks." SN Computer Science 2 (2021): 1-11.
[12] Karras, Tero, Samuli Laine, and Timo Aila. "A style-based generator architecture for generative adversarial networks." Proceedings of the IEEE/CVF conference on computer vision and pattern recognition. 2019.
[13] Mathai, Joe, Iacopo Masi, and Wael AbdAlmageed. "Does generative face completion help face recognition?." 2019 International Conference on Biometrics (ICB). IEEE, 2019.
[14] Kalarot, Ratheesh, Tao Li, and Fatih Porikli. "Component attention guided face super-resolution network: Cagface." Proceedings of the IEEE/CVF winter conference on applications of computer vision. 2020.
[15] Brock, Andrew, Jeff Donahue, and Karen Simonyan. "Large scale GAN training for high fidelity natural image synthesis." arXiv preprint arXiv:1809.11096 (2020).
Thank You