Enhancing the Performance of Generative Adversarial Networks with Identity Blocks and Revised Loss Function to Improve Training Stability

Document Type: Original Article

Authors

1 Department of Computer Science, Faculty of Computers and Information, Kafr El-Sheikh University, Kafr El-Sheikh 33511, Egypt

2 Department of Computer Science, Faculty of Computers and Information, Menoufia University, Menoufia 32511, Egypt

Abstract

Generative adversarial networks (GANs) are a powerful class of deep learning models for synthesizing realistic images; however, they are difficult to train and prone to instability and mode collapse. This paper presents a modified model, the Identity Generative Adversarial Network (IGAN), to address these challenges. IGAN introduces three modifications to DCGAN: a non-linear identity block that eases the fitting of complex data and reduces training time; a modified loss function that applies label smoothing to the standard GAN loss; and minibatch training, which uses other examples from the same minibatch as side information to improve the quality and variety of generated images. IGAN was evaluated against other state-of-the-art generative models using the inception score (IS) and Fréchet inception distance (FID) on the CelebA and stacked MNIST datasets. In the experiments, IGAN outperformed the other models in convergence speed, stability, and diversity of results. Specifically, within 200 epochs, IGAN achieved an IS of 13.6 and an FID of 46.2. In addition, mode collapse was compared across generative models on the stacked MNIST dataset: IGAN produced all the modes, whereas the other models failed to do so. These results indicate that the modifications implemented in IGAN can significantly enhance the performance of GANs in synthesizing realistic images, yielding more stable, higher-quality, and more diverse output.
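
As a rough illustration of the label-smoothing idea mentioned in the abstract, the following minimal sketch computes a discriminator loss in which the target for real images is softened. The abstract does not give the smoothing value or say whether smoothing is applied to one or both classes, so the one-sided form, the value 0.9, and the function names `bce` and `discriminator_loss` are assumptions made for illustration only, not the authors' implementation.

```python
import math


def bce(prediction, target, eps=1e-12):
    """Binary cross-entropy for one probability and one (possibly soft) target."""
    # Clamp the probability so that log() never receives 0.
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(real_probs, fake_probs, real_label=0.9):
    """Mean discriminator loss with one-sided label smoothing.

    real_probs: discriminator outputs D(x) on real images, each in (0, 1)
    fake_probs: discriminator outputs D(G(z)) on generated images
    real_label: softened target for real images (0.9 is an assumed value;
                1.0 recovers the standard, unsmoothed GAN loss)
    """
    real_term = sum(bce(p, real_label) for p in real_probs) / len(real_probs)
    # Targets for generated images are left at 0 (one-sided smoothing).
    fake_term = sum(bce(p, 0.0) for p in fake_probs) / len(fake_probs)
    return real_term + fake_term
```

With a softened target, the loss on real images is lowest when the discriminator outputs about 0.9 rather than values approaching 1, which discourages overconfident predictions. Under the standard target of 1.0, pushing the output toward 1 always lowers the loss.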

Keywords