GauGAN, peanutsGAN, and DataPaint
A few months back, NVIDIA showed off a product called GauGAN. Using a pair of neural networks (called a generative adversarial network, or GAN), they trained a system on hundreds of thousands of outdoor images. Armed with this trained network, they created a graphical user interface, or GUI. Users draw “segmentation maps”: they simply sketch out the merest indication of a landscape and have it recreated with the AI’s best guess.
As shown in the video below, its guesses were pretty damn good.
This is a strong indication of the direction the artistic use case for machine learning will go. It’s going to be like wielding a super pen. We will be able to indicate where the tree goes, but the look and style of said tree will probably be generated on the spot.
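To make “indicate where the tree goes” a little more concrete: a segmentation map is essentially a grid where every cell holds a label instead of a color. Here is a minimal sketch in plain Python. The label ids (`SKY`, `TREE`, `GRASS`) and the helper names are my own assumptions for illustration, not GauGAN’s actual label set or API.

```python
# A toy segmentation map: a grid of class labels rather than pixel colors.
# The label ids below are hypothetical, chosen only for this illustration.
SKY, TREE, GRASS = 0, 1, 2


def blank_map(width, height, fill):
    """Return a height x width grid filled with one label."""
    return [[fill for _ in range(width)] for _ in range(height)]


def paint_rect(seg_map, label, top, left, bottom, right):
    """Paint `label` into rows top..bottom-1 and columns left..right-1."""
    for row in range(top, bottom):
        for col in range(left, right):
            seg_map[row][col] = label


# Start with an all-sky canvas, then rough in some grass and a tree.
seg_map = blank_map(width=8, height=6, fill=SKY)
paint_rect(seg_map, GRASS, top=4, left=0, bottom=6, right=8)
paint_rect(seg_map, TREE, top=1, left=5, bottom=4, right=7)

# Print the map with one character per label: '.' sky, 'T' tree, '#' grass.
for row in seg_map:
    print("".join(".T#"[label] for label in row))
```

The artist only decides which label goes where. Turning each labeled region into a convincing sky, tree, or patch of grass is the network’s job.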
Very soon, anyone who can squiggle a few marks (and has access to an enormous visual data set) will make near replicas of some of our greatest visual works. Here are a couple of fun ideas I had.
peanutsGAN
According to Wikipedia:
Charles M. Schulz created a total of 17,897 Peanuts strips of which there are 15,391 daily strips and 2,506 Sunday strips.
That’s a lot of drawings of our favorite bald blockhead that could theoretically be fed to a generative adversarial network.
The average Disney film is about 80 minutes. At 24 frames per second, that’s 115,200 frames. There are at least 10 hand-drawn animated classics that I can think of, plus weeks’ worth of shorts, books, and loads and loads of marketing materials. If someone were looking to extract the style of a Disney film, there is plenty to train on.
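Here is a quick back-of-the-envelope check of those numbers, as a sketch. The 80-minute runtime and the count of 10 films are the rough figures from this post, not measured data, and a frame is not always a unique drawing, so treat the totals as an upper bound.

```python
# Rough size of the potential training sets discussed in this post.
FRAMES_PER_SECOND = 24
FILM_MINUTES = 80        # rough average runtime, per the post
FILM_COUNT = 10          # "at least 10" hand-drawn classics, per the post

# Minutes -> seconds -> frames.
frames_per_film = FILM_MINUTES * 60 * FRAMES_PER_SECOND
frames_all_films = FILM_COUNT * frames_per_film

# Peanuts strip counts as quoted from Wikipedia above.
DAILY_STRIPS = 15_391
SUNDAY_STRIPS = 2_506
total_strips = DAILY_STRIPS + SUNDAY_STRIPS

print(f"Frames in one film:  {frames_per_film:,}")
print(f"Frames in ten films: {frames_all_films:,}")
print(f"Peanuts strips:      {total_strips:,}")
```

Even if only a fraction of those frames are distinct drawings, a single film still offers more images than the entire run of Peanuts strips.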
Where’s the DataPaint?
If these nets are coming with GUIs that allow artists to wield them, then it raises the question:
Where can we get large, non-licensed data sets of imagery?
I typed “Batman” into Google Images. With the millions of “like feature” results that came back, I’m sure a network can learn exactly what he looks like. But is this legal under our current copyright laws?
Will our aggregate online imagery eventually be scraped and squished into gigantic mathematical systems? And in the short term, what’s going to happen when networks like this get into Photoshop? Or Unity?
As news begins to trickle in on the merger of machine learning and animation graphics, I will be sure to keep tabs on the development.
Please consider subscribing if this is of interest to you. I welcome thoughts and feedback.
Thanks for reading, we’ll see you next week.
Links & Reference:
StyleGANS explained and NVIDIA’s novel architecture for Generative Adversarial Networks at Towards Data Science:
Google QuickDraw: https://quickdraw.withgoogle.com/
Nvidia AI turns sketches into photorealistic Landscapes in Seconds:
Search Google Images for Batman: