Deep Convolutional Generative Adversarial Networks

Deep Convolutional Generative Adversarial Networks — DCGAN is one of the earliest algorithms for generative AI on image data. In this article, we break down the steps involved and provide a clear explanation of the algorithm.

Introduction

Imagine that you want to learn how to write the letters of a new language from scratch, but there is no source to learn from. Learning would be impossible then, right?

What if there is a teacher who provides you with continuous feedback? You make a rough, almost random drawing, and the teacher gives you feedback that nudges you closer to the actual letter without telling you exactly what to draw. You then draw the letters again, this time combining the teacher's feedback with what you learned from your initial attempt. You will surely improve, although the pace of improvement will depend on various factors. This cycle continues until eventually the teacher has no feedback left, because by then you will have learned how to draw the letters.

Generative Adversarial Networks work on a very similar principle: a generator generates images, and a discriminator discriminates between real and fake images. In the example above, think of your drawings as the fakes, while the actual letters are the real images. Once the teacher can no longer differentiate between the fake letters (drawn by you) and the real letters, you have learned how to draw them.

Dataset

The MNIST dataset is one of the go-to datasets for deep learning problems due to its simplicity, and we will use it in this explanation.

Preparing the data involves the following steps:

  1. Importing all the required libraries.
  2. Loading the MNIST data.
  3. Rescaling the images, then shuffling and batching them for training the DCGAN model.

The third step is important for stable and efficient training.
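The steps above can be sketched as follows. This is a minimal version based on the TensorFlow DCGAN tutorial; the BUFFER_SIZE and BATCH_SIZE values are common choices, not requirements:

```python
import tensorflow as tf

# Hyperparameters for shuffling and batching (assumed values)
BUFFER_SIZE = 60000
BATCH_SIZE = 256

# Load MNIST; we only need the training images, not the labels
(train_images, _), (_, _) = tf.keras.datasets.mnist.load_data()

# Add a channel dimension and rescale pixels from [0, 255] to [-1, 1],
# matching the tanh output range of the generator
train_images = train_images.reshape(-1, 28, 28, 1).astype('float32')
train_images = (train_images - 127.5) / 127.5

# Shuffle and batch the dataset for training
train_dataset = (tf.data.Dataset.from_tensor_slices(train_images)
                 .shuffle(BUFFER_SIZE)
                 .batch(BATCH_SIZE))
```

Rescaling to [-1, 1] matters because the generator's final tanh activation produces values in that range, so real and fake images live on the same scale.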

Generator
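The snippet below assumes a make_generator_model helper. Here is a minimal sketch of one, following the architecture used in the TensorFlow DCGAN tutorial; the exact filter counts and kernel sizes are one common choice, not the only one:

```python
import tensorflow as tf
from tensorflow.keras import layers

def make_generator_model():
    """Map a 100-dimensional noise vector to a 28x28x1 image."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(100,)),
        # Project the noise into a small spatial feature map
        layers.Dense(7 * 7 * 256, use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Reshape((7, 7, 256)),
        # Upsample 7x7 -> 7x7 -> 14x14 -> 28x28 with transposed convolutions
        layers.Conv2DTranspose(128, (5, 5), strides=(1, 1),
                               padding='same', use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        layers.Conv2DTranspose(64, (5, 5), strides=(2, 2),
                               padding='same', use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(),
        # tanh keeps outputs in [-1, 1], matching the rescaled training data
        layers.Conv2DTranspose(1, (5, 5), strides=(2, 2),
                               padding='same', use_bias=False,
                               activation='tanh'),
    ])
    return model
```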

# Build the generator model using the function defined above
generator = make_generator_model()

# Create random noise by sampling from a normal
# distribution, with a shape of [1, 100].
noise = tf.random.normal([1, 100])

# Generate an image by passing the random noise through the
# generator. training=False runs layers such as batch
# normalization in inference mode; parameters are only updated
# during the training loop, not here.
generated_image = generator(noise, training=False)

# Display the generated image using Matplotlib.
# It is a grayscale image, so we use the 'gray' colormap.
plt.imshow(generated_image[0, :, :, 0], cmap='gray')

The generator model starts from a random input and produces a 28x28 image. Note, however, that its parameters need to be optimised to get the desired result: the layers.Dense and layers.Conv2DTranspose layers hold those trainable parameters. In our case, the desired result is MNIST-quality digit images.

For this optimisation, we need the discriminator model. The discriminator takes a 28x28 image as input and outputs a score for how likely the image is to be a real digit. As you might have guessed, mapping an image to such a score again requires parameters, and those parameters also need to be optimised.

As we will see in the next step, both models are trained together, alternating updates over the number of epochs we define.

Discriminator
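The snippet below assumes a make_discriminator_model helper. Here is a minimal sketch, again following the TensorFlow DCGAN tutorial's architecture; the layer sizes are a common choice rather than a requirement:

```python
import tensorflow as tf
from tensorflow.keras import layers

def make_discriminator_model():
    """Map a 28x28x1 image to a single real/fake logit."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28, 1)),
        # Downsample 28x28 -> 14x14 -> 7x7 with strided convolutions
        layers.Conv2D(64, (5, 5), strides=(2, 2), padding='same'),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        layers.Conv2D(128, (5, 5), strides=(2, 2), padding='same'),
        layers.LeakyReLU(),
        layers.Dropout(0.3),
        # A single raw logit; higher means "more likely real"
        layers.Flatten(),
        layers.Dense(1),
    ])
    return model
```

Note that the final Dense layer has no activation: it outputs a raw logit, which is why the loss function later uses from_logits=True.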

# Decision-making step: score the generated image
discriminator = make_discriminator_model()
decision = discriminator(generated_image)
print(decision)

Optimisers & Loss functions for Generator and Discriminator training

Before we proceed with training, we need to define the loss functions that measure how well the generator and discriminator perform their respective tasks. For the sake of clarity:

  1. The discriminator's loss function uses both real MNIST images and images created by the generator.
  2. The generator's loss function is based purely on the images it creates.

In both cases we are solving a classification problem, so we use the cross-entropy loss. During training, we combine this loss with gradient descent to update the parameters.
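Under these definitions, the losses and optimisers can be written as below. from_logits=True is used because the discriminator outputs raw logits; the Adam learning rate of 1e-4 follows the TensorFlow tutorial and is an assumption here:

```python
import tensorflow as tf

# Binary cross-entropy on raw logits (no sigmoid in the discriminator)
cross_entropy = tf.keras.losses.BinaryCrossentropy(from_logits=True)

def discriminator_loss(real_output, fake_output):
    # Real images should be classified as 1, generated images as 0
    real_loss = cross_entropy(tf.ones_like(real_output), real_output)
    fake_loss = cross_entropy(tf.zeros_like(fake_output), fake_output)
    return real_loss + fake_loss

def generator_loss(fake_output):
    # The generator wants the discriminator to label its images as real (1)
    return cross_entropy(tf.ones_like(fake_output), fake_output)

# One optimiser per model, since the two parameter sets
# are updated separately
generator_optimizer = tf.keras.optimizers.Adam(1e-4)
discriminator_optimizer = tf.keras.optimizers.Adam(1e-4)
```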

Training

In each training step:

  1. We first calculate the loss for a batch of images.
  2. We then calculate the gradients of the loss with respect to the model parameters.
  3. Next, the optimiser updates the parameters using the gradients from the previous step.
  4. We run this training step once per batch in each epoch, so the total number of training steps equals epochs * number of batches in the dataset.
  5. For illustration purposes, we also save images at intermediate steps, which shows how the generator improves over time.
We execute the training and save a GIF of the images created at each step to illustrate the progress.

Conclusion

In the article above, we went through a simple tutorial implementing a deep convolutional generative adversarial network, in which the generator and discriminator provide continuous feedback that steadily improves the generator. DCGAN is one of the earliest GAN models and offers good insight into how generative AI on image data works.

References

  1. https://www.tensorflow.org/tutorials/generative/dcgan
  2. https://towardsai.net/p/l/generative-ai-gans
  3. https://en.wikipedia.org/wiki/MNIST_database
  4. https://arxiv.org/pdf/1511.06434.pdf

If you found the explanation helpful, follow me for more content! Feel free to leave comments with any questions or suggestions you might have.

You can also check out my other articles on data science and computing on Medium. If you like my work and want to contribute to my journey, you can always buy me a coffee :)
