What is a Generative Adversarial Network?

by Pelican Press

A Generative Adversarial Network (GAN) is a type of machine learning model that’s used to generate fake data that resembles real data.

Since GANs were introduced in 2014 in Ian Goodfellow’s ‘Generative Adversarial Networks’ paper, progress has been rapid, and their output has come to resemble the training dataset ever more realistically.

GANs have several applications across different industries. These days they are used to create all kinds of content including text, images, audio, and even video. They can do this either through text-based prompts, or by modifying existing content. You can sample some of their creations through projects like thispersondoesnotexist.com, which generates photos of random human faces that don’t belong to any real person.

In addition to creating content, GANs can also be used to edit images. They can, for instance, help convert a low-resolution image to a high-resolution one or add color to a black-and-white image.

“A big topic in Artificial Intelligence (AI) these days is synthetic data, which is fake data that can be used to train AI models when you don’t have enough real data,” says Stefan Leichenauer, VP of Engineering, SandboxAQ. He says a GAN can be used to create a set of synthetic data that resembles the real data, which can then be used to train another AI model.

Key components of a GAN

A GAN is called adversarial because it trains two different neural networks and pits them against each other in a zero-sum game. One network generates new data from an input, typically random noise, and tries to make it pass for real. The other network tries to predict whether a given sample belongs to the original dataset or was generated.
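The zero-sum game can be written as a single objective. The form below is the value function from the original 2014 paper, where D(x) is the discriminator’s estimated probability that x is real, G(z) is the generator’s output for a noise input z, p_data is the real data distribution, and p_z is the noise distribution. The discriminator tries to make V as large as possible, while the generator tries to make it as small as possible.

```latex
\min_G \max_D V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_z(z)}\left[\log\left(1 - D(G(z))\right)\right]
```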

The two neural networks that make up a GAN are referred to as the generator and the discriminator. For image data, the discriminator is typically implemented as a convolutional neural network (CNN), and the generator as a deconvolutional neural network, which works roughly like a CNN run in reverse.

Generator: This network takes an input, and generates new data samples that attempt to mimic the training data.

The generator acts as the creative force behind the GAN. It takes a random collection of numbers that initially hold no meaning. Through its internal layers, the generator transforms these numbers into data that resembles the data it has been trained on. This could be a realistic image, a snippet of music, or even a piece of text.

Experts often think of the generator as a talented artist who’s constantly experimenting with different combinations to create something new.
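As a rough illustration of the generator’s job, the sketch below maps a list of random numbers to one output value. It is a toy, not a real GAN generator: the single linear layer, the function name generator, and the values of weights and bias are all assumptions made for this example. A real generator would have many layers and learn its parameters during training.

```python
import random


def generator(z, weights, bias):
    """Map a noise vector z to one output value using a single linear layer."""
    return sum(w * zi for w, zi in zip(weights, z)) + bias


rng = random.Random(0)

# The input: random numbers that hold no meaning on their own.
z = [rng.gauss(0.0, 1.0) for _ in range(4)]

# Hypothetical parameters; in a real GAN these are learned, not hand-picked.
weights = [0.5, -0.2, 0.1, 0.3]
bias = 1.0

sample = generator(z, weights, bias)
print(sample)
```

Feeding in a different noise vector gives a different sample, which is how a trained generator produces variety.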

Discriminator: This network evaluates the generated data samples and predicts whether they are real or fake, based on their similarity to the training data.

This network plays a role similar to that of a discerning critic. It receives two types of data as input: real data samples, such as real images of birds, and the data produced by the generator. The discriminator’s job is to analyze both and determine which samples are real and which are fake.

If the generator is the artist, think of the discriminator as a seasoned art expert who scrutinizes every detail to distinguish genuine creations from cleverly crafted forgeries.
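In code, a discriminator is a binary classifier that outputs a probability. The sketch below is a deliberately tiny stand-in that scores a single number with a logistic function. The function name discriminator and the values of weight and bias are assumptions for illustration only; a real discriminator is a deep network whose parameters are learned.

```python
import math


def discriminator(x, weight, bias):
    """Return the estimated probability (between 0 and 1) that sample x is real."""
    return 1.0 / (1.0 + math.exp(-(weight * x + bias)))


# Hypothetical parameters; a trained discriminator would have learned these.
weight, bias = 2.0, -4.0

real_score = discriminator(3.0, weight, bias)  # a sample resembling the real data
fake_score = discriminator(0.5, weight, bias)  # a poor forgery

print(real_score, fake_score)
```

With these made-up parameters, the first sample scores above 0.5 (judged real) and the second scores below 0.5 (judged fake).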

How do GANs work?

Leichenauer explains that the generator and discriminator are trained in tandem and both improve in a kind of arms race against each other.

The generator produces samples, and the discriminator evaluates them. The generator adjusts its output to produce samples that are more likely to fool the discriminator, while the discriminator becomes more skilled at distinguishing between real and synthetic samples.

“In its quest to fool the discriminator, the generator very quickly learns to create data that is very challenging for a human to distinguish from real data,” says Leichenauer.

While the generator is trained to produce false data, the discriminator network is taught to distinguish between the generator’s synthetic data and the real examples. If the discriminator can quickly recognize the fake data that the generator produces, the generator suffers a penalty.
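The penalty can be made concrete with a loss function. The sketch below uses binary cross-entropy, a common choice, and assumes the discriminator outputs a probability that a sample is real. The function names are hypothetical, and the generator loss shown is the so-called non-saturating variant rather than the only possible formulation.

```python
import math


def discriminator_loss(d_real, d_fake):
    """Low when real samples score near 1 and generated samples score near 0."""
    return -(math.log(d_real) + math.log(1.0 - d_fake))


def generator_loss(d_fake):
    """Low when the discriminator scores generated samples near 1 (it is fooled)."""
    return -math.log(d_fake)


# d_fake is the discriminator's score for a generated sample.
caught = generator_loss(0.1)   # the fake was easily recognized
fooled = generator_loss(0.9)   # the fake passed as real

print(caught, fooled)
```

The generator’s loss is larger when its sample is caught than when it fools the discriminator, which is the penalty that drives it to improve.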

Another crucial aspect of GANs is a technique called backpropagation. This is the process that powers how the generator learns from the discriminator’s feedback. Backpropagation essentially allows the errors identified by the discriminator to be propagated backward through the generator’s layers. Based on these errors, the weights and biases in the generator’s network are adjusted. This in turn helps the generator produce more realistic data points in the next iteration.
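To see backpropagation at work, the sketch below performs one generator update by hand. Everything in it is an assumption chosen to keep the arithmetic visible: the generator is G(z) = a * z + b, the discriminator is a fixed logistic function D(x) = sigmoid(w * x + c), and the generator loss is -log D(G(z)). Real GANs rely on a framework’s automatic differentiation instead of hand-written gradients.

```python
import math


def sigmoid(u):
    return 1.0 / (1.0 + math.exp(-u))


# Hypothetical toy networks, for illustration only.
w, c = 2.0, -4.0       # discriminator D(x) = sigmoid(w * x + c), held fixed here
a, b = 1.0, 0.0        # generator G(z) = a * z + b
z = 0.5                # one noise input
learning_rate = 0.1

# Forward pass: noise -> generated sample -> discriminator score.
x = a * z + b
score_before = sigmoid(w * x + c)

# Backward pass for the generator loss L = -log D(G(z)).
grad_x = -(1.0 - score_before) * w   # dL/dx: the error passed back from the discriminator
grad_a = grad_x * z                  # dL/da, by the chain rule
grad_b = grad_x                      # dL/db, by the chain rule

# Gradient descent step: adjust the generator's weight and bias.
a -= learning_rate * grad_a
b -= learning_rate * grad_b

score_after = sigmoid(w * (a * z + b) + c)
print(score_before, score_after)
```

After the update, the same noise input produces a sample that the discriminator rates as slightly more likely to be real.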

As the feedback loop between the adversarial networks continues, the generator begins to produce higher-quality and more believable output, and the discriminator becomes better at flagging data that has been artificially created.

In principle, the training process is over once the discriminator can no longer tell synthesized data from real data and is reduced to guessing.
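One way to make this stopping point concrete comes from the original GAN paper: for a fixed generator, the best possible discriminator outputs p_data(x) / (p_data(x) + p_g(x)), where p_g is the distribution of generated samples. The sketch below evaluates that formula on a made-up three-value dataset. The distributions and the function name optimal_discriminator are assumptions for illustration.

```python
def optimal_discriminator(p_data, p_gen):
    """Best achievable probability-of-real for each value x, given both distributions."""
    return {x: p_data[x] / (p_data[x] + p_gen[x]) for x in p_data}


# Hypothetical distributions over three possible values.
p_data = {"a": 0.5, "b": 0.3, "c": 0.2}

early_gen = {"a": 0.1, "b": 0.1, "c": 0.8}   # generator still far from the real data
final_gen = dict(p_data)                     # generator matches the real data exactly

early_scores = optimal_discriminator(p_data, early_gen)
final_scores = optimal_discriminator(p_data, final_gen)

print(early_scores)
print(final_scores)
```

Early on, the discriminator can confidently separate real from fake. Once the generator matches the data distribution, every score is 0.5, so even a perfect discriminator is guessing.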

Types of GANs

GANs come in many forms and can be used for various tasks, based on how the generator and discriminator interact with each other. A vanilla GAN is the simplest form: a generator and a discriminator trained against each other with no additional inputs.

Then there’s the Conditional GAN (cGAN). Explaining its use, Leichenauer says that if you wanted to create a model that generates pictures of cats, you could use a GAN. Similarly, if you wanted to generate pictures of dogs, you could use a second GAN.

Or you could have a single model that’s capable of doing both using a cGAN, which can accept a label (“cat” or “dog”) as part of the input and use that when it generates the image.
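A common way to feed the label to the generator is to encode it as a one-hot vector and append it to the noise input. The sketch below shows only that input-building step. The names LABELS, one_hot, and conditional_input are hypothetical, and real cGANs may inject the label in other ways, such as through learned embeddings.

```python
import random

LABELS = ["cat", "dog"]


def one_hot(label):
    """Encode a label as a vector with 1.0 in the position of that label."""
    return [1.0 if label == name else 0.0 for name in LABELS]


def conditional_input(noise, label):
    """Build the generator input: noise values followed by the encoded label."""
    return noise + one_hot(label)


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(3)]

cat_input = conditional_input(noise, "cat")
dog_input = conditional_input(noise, "dog")

print(cat_input)
print(dog_input)
```

The two inputs share the same noise and differ only in the label portion, which is what lets a single trained model switch between cats and dogs.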

Another type of GAN Leichenauer talks about is the CycleGAN, which learns how to change one type of data into another. He says a CycleGAN, for instance, might learn how to turn a photograph into a pencil drawing and vice-versa.
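CycleGAN trains two generators, one for each direction, and adds a cycle-consistency loss that penalizes a round trip for failing to return the original input. The sketch below computes that loss as a mean absolute difference. The two translation functions are trivial stand-ins invented for this example (real ones are neural networks), and the images are short lists of numbers.

```python
def photo_to_sketch(photo):
    """Hypothetical stand-in for the photo-to-drawing generator."""
    return [2.0 * v for v in photo]


def sketch_to_photo(sketch):
    """Hypothetical stand-in for the drawing-to-photo generator."""
    return [0.5 * v for v in sketch]


def imperfect_sketch_to_photo(sketch):
    """A reverse mapping that does not quite undo photo_to_sketch."""
    return [0.4 * v for v in sketch]


def cycle_loss(x, forward, backward):
    """Mean absolute difference between x and its round-trip reconstruction."""
    reconstructed = backward(forward(x))
    return sum(abs(a - b) for a, b in zip(x, reconstructed)) / len(x)


photo = [0.2, 0.4, 0.8]
perfect = cycle_loss(photo, photo_to_sketch, sketch_to_photo)
imperfect = cycle_loss(photo, photo_to_sketch, imperfect_sketch_to_photo)

print(perfect, imperfect)
```

When the two mappings undo each other, the loss is zero. Any mismatch raises it, which pushes the pair of generators to stay consistent with each other.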

Two other types of GANs that are useful for image work are deep convolutional GANs (DCGANs), which use deep convolutional neural networks to generate images, and super-resolution GANs (SRGANs), which focus on upscaling low-resolution images to high resolution.

Researchers have also crafted various kinds of GAN architectures to generate music that captures the essence of human compositions. There are also GANs that can mimic human movement and behavior. These are used for producing video and, more nefariously, deepfakes.


