Design patterns for character animation are about to get really complex
Story Telling Moments
Character animation is hard. And with real time systems, it’s going to get a lot harder.
To plan out a character’s performance, an animator uses design strategies to construct the motion. Since the days of Walt Disney and his Nine Old Men, animators have relied on a method of quantifying a character’s actions into storytelling moments. These moments are often referred to as “Keys.”
By drawing a handful of story moments and then “popping” between them, the animator can explore the timing and readability of the shot. Below is an example from Richard Williams’ The Animator’s Survival Kit, showing the key drawings of a character walking to a chalkboard and starting to write. The story of the performance can be conveyed in three simple moments.
Determining these keys can be very difficult. Considering that difficulty, and the energy required to flesh out the action, it’s astounding that we use the resulting sequence of frames only once. Entire movies (which are massive undertakings) are animated once, and then thrown away! The energy the animator puts in is equivalent to the visible experience the audience gets out of it.
This linear output looks something like this:
While the art form is beautiful in this sequential, linear logic, it is highly inefficient.
In a real time system, such as a game engine, the character’s actions are reusable. These actions can change based on the dynamic nature of the environment. Simply thinking of a character in terms of linear keys is too limiting. We need a way to quantify a character performance beyond its single use.
Finite State Machines
A state machine is a mathematical design pattern in which an entity exists in bracketed conceptual moments, called “states.” States are an architecture that allows a predetermined series of actions to be triggered, provided conditions are met.
For example, a character entity in an engine may be in a state of “walking” until it is confronted with a street to cross, at which point it will change its state to “wait for the light.” States aren’t just visibly physical, like walking or jumping. Characters can be in a state of hunger, or a state of anger, or a state of existential crisis. When you begin to imagine states for characters, you start to understand how a character performs outside of linear time. Designing keys in this mindset might look something more like this:
Video games already do this in a limited capacity to satisfy the requirements of a character’s actions during game play. A character will begin in an “idle” state, and when user input commands it to run left, it will change its state to “run left.” By changing its state, the engine knows to play an animation clip of the character running to the left.
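That idle/run pattern is easy to sketch. Here is a minimal state machine in Python; the state names and input events are hypothetical, for illustration only:

```python
# Minimal finite state machine sketch for a character controller.
# State names and input events are hypothetical, for illustration.

class CharacterFSM:
    # transitions: (current_state, input_event) -> next_state
    TRANSITIONS = {
        ("idle", "press_left"): "run_left",
        ("run_left", "release"): "idle",
        ("idle", "press_jump"): "jump",
        ("jump", "land"): "idle",
    }

    def __init__(self):
        self.state = "idle"

    def handle(self, event):
        # Only change state if a transition is defined; otherwise stay put.
        self.state = self.TRANSITIONS.get((self.state, event), self.state)
        return self.state

fsm = CharacterFSM()
fsm.handle("press_left")  # now in "run_left": engine plays the run-left clip
fsm.handle("release")     # back to "idle"
```

An engine’s state machine UI is essentially a graphical editor for a transition table like this, with each state pointing at an animation clip.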
Increasingly, game engines are providing UI systems that allow for the development and design of character state machines. The example below is taken from Unity’s state machine, which allows you to import animation clips and arrange them into a pattern that triggers at run time.
My hunch is that, as real time systems become more and more integral to the animation production process, character work will become increasingly reliant on the development of complex state machines. These massive state machines will drive not only the actions of the characters, but their motivations as well.
Thanks for Reading. See you next week!
Here are some references to keep you going on Animation Keys and State Machines.
I began this newsletter to begin a conversation with the computer graphics industry. Should you have thoughts or comments, please feel free to reach out. I can be found on twitter @nyewarburton.
Real Time animation production should start with a change in mindset
Computer graphics can make gorgeous, high resolution stuff. However, that isn’t always the point.
The graphics of the game industry evolved on a track parallel to the 3D techniques of the movie business. Instead of focusing entirely on high resolution images for the screen, game developers focused on reusing things, packing them tightly, and limiting color palettes. To play in real time, the content needed to be optimized. Because of this focus, games have always been seen as less graphically impressive.
That’s because most in the movie industry don’t understand the real art of game design.
Instead of thinking:
“How do I make this really high quality?”
Start thinking:
“What’s the most efficient thing I can build to get the most use out of it?”
A Space Chicken Showed me the Way
In 2012, I was an animator, but a novice game designer. After three failed attempts at building a mobile game, I decided to simplify my learning process and rip off what everyone else was doing at that time — build an endless runner.
Roping in some development help, the result was Commander Cluck, a demo of a running and jumping space chicken with a single-touch mechanic. This was a triumph for me as my first game that actually worked. What threw me about the development process, however, was learning and seeing the potential of something called procedural generation.
I had started to write out unique levels by hand but, after watching talks and reading from the independent gaming world, I decided to try something new.
I divided a level into seven “chunks” of content, making a single level with seven variable chunks. I made four different background sets of pieces. Then, I tied the content variables and the speed to the performance of the player. At run time, the chunks were randomly selected and placed based on the changing variables.
The result was a game that generated its levels and dynamically adjusted the difficulty.
A fairly simple thing to uncover for most college level game developers, but for me it was like figuring out my first animated walk cycle. I remember my mind exploding at the possibilities.
Because of this odd space chicken game, I had learned the value of reusing things mathematically.
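The chunk-based generator can be sketched in a few lines. This is a sketch, not the actual Commander Cluck code; all names and numbers here are illustrative:

```python
import random

# Sketch of a chunk-based level generator: seven level chunks, four
# background sets, speed tied to player performance. Illustrative only.

CHUNKS = [f"chunk_{i}" for i in range(7)]
BACKGROUNDS = [f"background_{i}" for i in range(4)]

def build_level(player_score, length=10, rng=random):
    # Difficulty: speed scales with how well the player is doing.
    speed = 1.0 + 0.1 * player_score
    return {
        "background": rng.choice(BACKGROUNDS),
        "speed": speed,
        # Select chunks randomly at run time instead of authoring by hand.
        "chunks": [rng.choice(CHUNKS) for _ in range(length)],
    }

level = build_level(player_score=5)
```

Seven chunks and four backgrounds yield far more level variety than any set of hand-authored levels of the same asset budget, which is the whole point of reusing things mathematically.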
Leverage to Infinity and Beyond
Engines are collections of workflows (tools) and reusable elements (assets) that the game industry has standardized to leverage these kinds of opportunities. Every engine comes with the ability to generate levels, set up UI, and create a player controller, and many have things like gravity, starter templates, or scoring systems.
All of these developments let you get up to speed and experiment with game content much faster. As you continue to develop your processes, engines allow you to build more tools for duplicating work, offsetting it, and (most importantly) enhancing it. The better the infrastructure below you gets, the faster you can improvise.
Experimenting with content in engines is about efficiently leveraging optimized content. This is the genesis of creating compelling procedurally generated content, a subject I will be speaking about at length in this Nytrogen newsletter.
Animators should begin to internalize this optimization & leverage mindset and think of an animated story not as a linear progression, but as a collection of animated pieces. These pieces can be reused and assembled in mathematical ways that I’ll soon be discussing right here.
This represents part-time work over the last two weeks. Some of my experiments, with hypotheses, are below. It’s super fun!
Incredible nyeGAN
Hypothesis/Purpose:
“I wonder if I could make one of those ‘smooshy-things’ I see on twitter?”
Approach:
I made 3 minutes of funny faces at the camera, using OBS Studio to capture the video. I exported the video at 30 frames per second into After Effects, then exported the frames into a “data” folder in my AE project structure.
I uploaded the frames to Runway’s StyleGAN2 training with the Faces dataset. I let it run for 5,000 steps, which took about 5 hours on their servers.
I then edited the first moments of The Incredibles and output the frames from After Effects. There were about 1,200 frames; I roughly edited out some of the parts where Mr. Incredible wasn’t directly facing the camera.
I fed it into my nyeGAN using the training section. I let it cook for about 2 hours, but stopped it when I saw that Mr. Incredible was “overwriting,” rather than “blending” with, my original frames.
Data:
Footage A: 3 minutes of me making funny faces to the camera
Conclusion:
A lot of similar data, like my funny-faces video, doesn’t really make for interesting content. I was also surprised that the two datasets didn’t mix more.
For my next GAN experimentation, I want to scrape some datasets of like things and actually think about the diversity and make up.
Some Pumped Up Motion Tests
Hypothesis/Purpose:
I want to see what comes out of the motion models.
I’ve seen work online using DensePose and First Order Motion Model that I wanted to replicate. Eventually, I want to use something like pix2pix to “puppet” characters.
Approach:
Using the awesome work of Marquese Scott (Twitter: https://twitter.com/officialwhzgud), I ripped his “Pumped Up” video from youtube using 4K Video Downloader. I exported a section of frames from After Effects and ran it into a Workspace in Runway.
I also took imagery of Goofy from the internet and painted over a single frame of the video to test First Order Motion with a full body, as well as Liquid Warping, which seemed “worth a shot.”
Conclusion:
OpenPifPaf tracked well to the video. And even though PoseNet doesn’t render video out of Runway, I found its data (JSON) was a bit better. First Order Motion must be “tuned” in Runway for faces, and didn’t quite work for the full body. I like working with PoseNet, though I’d love for it to render out of Runway.
My most successful takeaway was the PoseNet export. The time code and positions are normalized between 0 and 1. Within that range, it creates a series of 17 positions using the X and Y data.
How do I get that data into an animation program?
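One route is a small conversion script. This is a sketch assuming a simplified layout of the exported data (one list of 17 normalized x/y pairs per frame), not Runway’s exact JSON schema: denormalize the 0–1 positions into a composition’s pixel space and build one keyframe track per joint.

```python
# Sketch: convert normalized PoseNet-style keypoints (0-1 range) into
# pixel-space keyframe tracks for an animation package. The input layout
# here is a simplified assumption, not Runway's exact export format.

def to_keyframes(frames, comp_width=512, comp_height=512, fps=30):
    """frames: list of frames, each a list of 17 (x, y) pairs in 0-1 space."""
    keyframes = []  # one list of (time, x_px, y_px) tuples per joint
    for joint in range(17):
        track = []
        for i, pose in enumerate(frames):
            x, y = pose[joint]
            # Denormalize into composition pixels; time code from frame rate.
            track.append((i / fps, x * comp_width, y * comp_height))
        keyframes.append(track)
    return keyframes

# Two fake frames: every joint at center, then moved right and down.
fake = [[(0.5, 0.5)] * 17, [(0.25, 1.0)] * 17]
tracks = to_keyframes(fake)
```

From there, each track could be pasted onto a layer position in After Effects via a script, or written to a format an animation package imports.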
BalletDancin’
Hypothesis/Purpose:
Can I normalize erratic movement? Can I render Runway output into my After Effects pipeline? If I can get the X and Y positions, what ways can I get Z-axis data?
Approach:
I found a ballet clip on youtube, “The Top 15 Female Ballet Dancers.” I wanted to isolate an individual who was moving around the stage. I used basic tracking in After Effects, and adjusted some keys to track her to a red cross hair in a 512 x 512 area. Basically, I centered her in the area to be normalized.
I then ran it through PifPaf and DenseDepth. My purpose with DenseDepth was to see if I could get any sort of Z data.
Conclusion:
The pipe works on the Runway side. I need to figure out how to get the data into some animation software.
Spiderman and his amazing GANs
Hello World!
Purpose:
StyleGANs (with Picasso or Van Gogh) are kind of the “Hello World” of machine learning art. Runway makes them easy to try.
Approach:
I ripped three ’80s cartoon openings from youtube. I then chose preset StyleGANs in Runway and fired them through. They took about 10 to 15 minutes apiece.
Conclusion:
A necessary exercise to get into the software, and a great way to understand what the models are doing behind the scenes. Here are a couple of images from this little exercise that I felt were sort of successful. The other two cartoon opens are on the nytrogen youtube.
This motion that you see of Andrew, squishing around above, is what’s called a latent space walk of a generative adversarial network, or GAN. GANs are super duper powerful sponges of data points, plotted in 512-dimensional space. Recall that we fail to think well in 3-dimensional space most of the time, so 512 dimensions is way out there. (Don’t think about it too long, or blood will shoot out your nose.)
Effectively, by “spatially” moving from one Andrew-Yang-head data point in 512-dimensional space, along a Yang vector, to another Andrew-Yang-head data point, the image morphs. It’s how we will move virtual characters soon enough.
This particular flavor of GAN is called StyleGAN2, from NVIDIA. I lovingly trained it with 6,000 head shots from Andrew Yang interviews. All I needed to do to collect this data was scrape his youtube channel and then batch export the clips as individual frames. Using the python tool autocrop, I was able to very quickly amass 15,000 frames of Andrew Yang from the chest up. I culled it down to about 6,000, throwing away images that didn’t fit nicely in the box with most of his face facing the screen.
In prototype versions of the GANs I’ve made, I discovered the color palette was all over the place. On this GAN, I effectively had to “normalize,” or limit, the range of the colors. After experimenting, I stumbled on a look and batch-processed the training frames with a nintendo gameboy color filter.
Gameboy colors because I love Super Mario Land.
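The filter boils down to snapping every pixel to a small fixed palette. A rough sketch of the idea follows; the palette values are illustrative Game Boy greens, not my exact recipe:

```python
# Sketch of "normalizing" a training set's colors: map every pixel to the
# nearest color in a fixed four-color Game Boy-style palette.
# Palette values are illustrative, not the exact filter I used.

GAMEBOY_PALETTE = [
    (15, 56, 15),    # darkest green
    (48, 98, 48),
    (139, 172, 15),
    (155, 188, 15),  # lightest green
]

def nearest(color, palette=GAMEBOY_PALETTE):
    # Squared Euclidean distance in RGB is enough for a rough filter.
    return min(palette, key=lambda p: sum((a - b) ** 2 for a, b in zip(color, p)))

def filter_frame(pixels):
    """pixels: list of (r, g, b) tuples for one frame."""
    return [nearest(px) for px in pixels]

frame = [(0, 0, 0), (255, 255, 255), (100, 120, 20)]
filtered = filter_frame(frame)
```

Collapsing thousands of frames onto the same four colors removes palette variation as a variable, so the GAN spends its capacity learning shape and motion instead of color noise.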
I did my GAN development with RunwayML, a nifty little program that allows me to focus on running models and not drinking my face off when my python dependencies don’t install.
OK, So why the YangGAN?
Jokes aside…
Generative Adversarial Networks, Computer Vision, and networks of computers collectively rendering will revolutionize computer graphics. We will create near-reality very soon, and in doing so, destroy the need for human labor under the current methods of generating value. We should be talking openly about the dangers of Artificial Intelligence and economic collapse.
I believe Universal Basic Income is the most realistic thing we can collectively do as a country to save ourselves.
Our life expectancy is dropping, we’re fighting our neighbors, we are letting our worst self consume us. We actually need to do something.
I’m asking you to please investigate universal basic income. Andrew’s organization is the best one I see going right now. #yanggang baby.
This is a bit heady, but Ian Goodfellow is the guy who more or less put a generator and a discriminator together to create the concept of a Generative Adversarial Network. https://www.youtube.com/watch?v=Z6rxF…
You should download and play with models yourself with RunwayML: http://runwayml.com
Also, I stole the “blood shoot out of your nose” bit from Lewis Black.
More of my Machine Learning work will eventually be at http://nytrogen.ai
AI is rapidly advancing into computer graphics. It is moving far faster than I imagined it would.
I believe AI should be taught and experimented with in the educational process. For more of this thinking, please read my post here.
Otherwise, have fun exploring some of the nonsense below.
Stable Diffusion and Dreamstudio
I have been using Stable Diffusion fairly obsessively since its release in the fall of 2022. Below are some samples of my work as of November 2022.
Imagery and Photography Experiments
Self Portraiture with Stable Diffusion / Dreamstudio
Science Fiction, Mechs, and Technology Experimentation
ShatnerGAN
For whatever reason, I spent a week ripping Captain Kirk shots from the original Star Trek series and training StyleGAN2 from NVIDIA. This experiment had 15,000 close-ups of James T. Kirk, and the model was trained for 5,000 steps. Software: 4K Video Downloader, Autocrop, and RunwayML
Put a GAN on it! – Stealing Beyonce’s Motion Data
During my class with Derek Shultz, I used Beyonce’s “Put a Ring on It” to experiment with a number of models that existed within the AI model bridge software RunwayML. This was the first time I integrated machine learning models into my After Effects workflow. While it’s essentially pure fun, it allowed me to experiment with a number of the models and get a sense of their capabilities. Software: Runway, After Effects, Photoshop
YangGAN: Andrew Yang’s Essence in AI #yanggang #humanityforward
After my Captain Kirk GAN, I decided to try another human-trained GAN. I found a clip of Andrew Yang speaking about the advancement of AI journalists, and was inspired to match the audio with a latent space walk of a trained GAN. It was trained on about 4,000 images of Andrew Yang that I scraped from various interviews. The heads were cropped and run through an image adjustment recipe I developed with Python and ImageMagick. I trained the GAN in Runway using StyleGAN2. Software: 4K Video Downloader, Python: Autocrop and ImageMagick, and RunwayML
Ride the Train! – Experiments with Image Segmentation
This was an experiment playing with image mapping segmentation. I had seen a number of experiments with image mapping, but little using it as a renderer. I used shaders in Maya that were matched to the image mapping setup in RunwayML, rendered each layer through Runway, and composited it in After Effects. The technology is far from functional, but the promise is there. Software: RunwayML, Autodesk Maya, After Effects
Machine Learning Motion Model Experiments
My primary interest in machine learning is experimenting with animation data and motion. These were some experiments I ran to see which motion model got which result. My takeaway was that the clips needed to be “normalized” to get a good read, which is why I created a template to track the video. Software: 4K Video Downloader, Autocrop, and RunwayML
Fun with CheapFakes
This is a fun model and easy to use. I scraped some Arnold Schwarzenegger clips from youtube, and had a friend, Daron Jennings, improvise some clips. It was simply a matter of running the model with the appropriate components, and then compositing it in After Effects. It might be something fun to use in the future. Software: Wav2Lip, After Effects
This is a three part series on Blockchains and Animation Production. The Nytrogen Newsletter follows my thoughts on the evolution of real time production in computer graphics.
Part 01: F***** on the First One
Agents and Lawyers, Oh My…
When I was 29, I sold a show idea to a television network. I was simply a guy who enjoyed animating things.
I was working on a studio lot on a film. I used the internal studio index to find the people to pitch to, and set myself up to make a deal with this mighty corporation’s animation development department.
My writing partner tactfully let the world know that we had set up a deal, and suddenly, a lawyer and an agent magically arrived. They told me, as experts, they would take care of the business dealings. I was removed from negotiations, and kept on the sidelines.
Three months later, the lawyer placed a stack of papers on the desk in front of me. It outlined a deal where I needed to work my ass off for imaginary outcomes, under which the network would own every character, joke, technical solution and story point until (should the show be green lit) the second season. And then, myself, my writing partner, and our production team would receive only a small percentage.
“I’m being f*****.” I said out loud.
My lawyer nodded. “This is a network deal, you always get f***** on the first one.”
Intellectual property is the lifeblood of entertainment, and it is systematically controlled through corporate legal systems. Whether intellectual property began as an ethical concept is immaterial; it has evolved into a mechanism for large, expensive legal teams to steal from artists.
Set the Ideas Free
The internet has proliferated ideas that, only when compounded (as in, smashed together), create magical things. Going viral comes from modulation. An idea proliferates because people change it and make it their own.
It may be possible for a single idea to resonate with others, but encapsulating that idea into a box and stamping ownership on it will limit its ability to evolve.
This means, in the networked world, we all need to let go of our ideas.
Give them away.
The value we will gain from our collective creative network will outperform the gain we will get from a traditional, legally structured, intellectual property system.
I must admit I still have trouble accepting this line of thought. It is counter to what business and law tell you about how you should conduct your business dealings. The reality is that our business dealings have become unacceptably corroded. Corporate interest has become too powerful.
No artist should ever be “f***** on the first one.”
Is there another way?
However, there still needs to be a system.
Artists need a way to be paid: compensated for their efforts and rewarded for their devotion to their craft. The ideas, the rates, and the work will need to be protected by more than something as clumsy as a legal team.
This is why I believe animators need to know about the development of blockchains. It may just be a way for artists to work together without a system composed of lawyers and agents. Decentralized computing is pretty complex stuff, but I believe it is important for artists and creators to understand. I will begin to share my thoughts on blockchains, as they pertain to computer graphics, next week.
Part 02: The Lawyer in the Database
Blockwhat?
I’ve read hundreds of self-proclaimed “simple explanations” of blockchains. They usually begin by talking about Bitcoin and Satoshi Nakamoto. These explanations tend to get a bit “mathy” and spend a lot of time on game-theoretical problems of trust and governance. It’s no wonder Silicon Valley types love to tweet about this stuff.
For our understanding of blockchains as they pertain to computer graphics, I’d like to clear your mind of any preconceived ideas about decentralization or smart contracts and just focus on two simple words:
Copy and Paste
Because we all work on computers, every photoshop file, maya file, python script, or word document contains our ideas, our designs, and our stories. If I like something I create, then I feel it has value. If I can make a copy of it, I have, in essence, reduced the scarcity (and the real world value) of the idea. Simply put, it’s no longer unique.
In economics, the concept that an idea (or artwork) can be duplicated digitally like this, with no additional labor, is called “zero marginal cost.” When the cost of making an infinite amount of something is the same as making one of something, it begins to challenge conventional capitalist thinking. This is why I believe our value systems need a realignment to reflect network value rather than individual value.
And the data we create should belong to … us.
So, just what are Blockchains?
Blockchains are trusted networks. They are trusted because no one owns them, and, if built correctly, they provide the necessary ethical practices that define the collective. The nodes on the network support each other, not by deciding to, but by being incentivized to.
As opposed to using a centralized company or service to mediate conflict, the governance of blockchains is designed for the benefit of the collective. Those that follow the practices that enable the trust of the network are rewarded with tokens. In a decentralized, or blockchain-enabled, network, the collective protects its own data.
I saw Zavain Dar give a fairly clear presentation of blockchain economics at the Blockstack Summit. (below)
I like to say that a blockchain creates its own “lawyer in a database.”
Mistaking cryptocurrency as simply a currency to be bought and traded misses the point of blockchains. All too often we worry about the price of bitcoin and ethereum instead of understanding the value they really have. The network is built as a means for collective groups to come to a consensus. How we track and pay for value (cryptographic tokens) is merely a device for execution.
By using blockchains to track intellectual property, we enable the collective to protect its own value. Things can be copied and pasted only if conditions are met. The collective ethics, or “governance,” can be programmed into the chain, so we don’t need an agent or lawyer to “handle” it for us.
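As a toy illustration (plain Python, not a real smart-contract language like Solidity), the “lawyer in a database” is just a rule that gates copying and records the result. Every name and number here is hypothetical:

```python
# Toy illustration of governance programmed into a chain: an asset can be
# copied only if the requester meets the creator's rate, and every copy is
# appended to a shared ledger. Real chains enforce this with consensus
# across many nodes; this is a single-process sketch of the rule itself.

ledger = []  # append-only record of every permitted copy

def copy_asset(asset, requester, payment):
    # The "lawyer": conditions are checked in code, not negotiated by an agent.
    if payment < asset["rate"]:
        return None  # condition not met: no copy is made
    ledger.append({"asset": asset["name"], "to": requester, "paid": payment})
    return dict(asset)  # the copy itself

design = {"name": "walk_cycle_v1", "rate": 10}
copy_asset(design, "studio_a", payment=10)  # allowed, recorded on the ledger
copy_asset(design, "studio_b", payment=3)   # rejected, nothing recorded
```

The point is not the ten lines of logic but where they live: on a chain, no single party can quietly rewrite the rule or the ledger.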
Next week, I will define how I think a blockchain network could work for a computer graphics collective. It isn’t so much about the technical components as the concept that artistic collectives can govern themselves more ethically and efficiently than the centralized corporate system we have now.
Part 03: Blockchain Guilds
Associations of Craftsmen
In the medieval era, artists and builders formed group associations to protect their “tricks of the trade.” As opposed to artists who were owned by monarchies and religious organizations, the founders of these organizations, or “guilds,” were independent masters who cultivated apprenticeship programs, created documentation, and standardized methods for protecting intellectual property. These collectives were the precursor to universities.
The industrial era saw the rise of organized labor unions to protect workers from centralized interests. Many of these initiatives were centralized themselves and were ineffective, either being too small to combat large corporate power or (worse) becoming big interests themselves.
Computer graphics artists have no protection now. They are scattered around the globe, unorganized, and often misvalued for their contribution to the craft.
My hope is a new form of guild will arise, thanks to some fancy computer science.
Incentivizing Distributed Networks
The company Otoy has a unique vision for computer graphics. As one of the leaders in rendering with their Octane system, they have begun to think about how distributed rendering would benefit everyone.
They have proposed a blockchain solution aptly named “Render Token” that compensates users on a network who contribute their processing power. By assigning compensation to the donation of GPUs, it allows users to govern how their contribution can be used and (potentially) assign an accurate market value to it.
From their website:
Ethereum’s widespread adoption was the key to realizing our vision. Instead of GPUs being used to only mine currencies, we use their intrinsic function to render and leverage the features of the blockchain: network, incentives and trust.
Otoy has gained a small amount of traction in the entertainment community (mainly among motion graphics artists), with support from Bad Robot leader J.J. Abrams himself.
They are not the only ones who see this blockchain benefit.
The network Golem also provides a token for processing power contributions. A new chain called Helium rewards users for buying and maintaining independent wireless network hardware. Others, like IPFS (the InterPlanetary File System) and Storj, pay tokens in exchange for storage space. Whether these networks use large crypto-networks like ethereum, or develop their own side chains to scale, the truth about blockchains is starting to become clear.
Blockchains are coming, and the effect will be incredible, and entirely disruptive.
When networks of creatives can share their processing power and their storage, and give their intellectual property openly, with full knowledge that they will be accurately compensated for their contributions, it suddenly diminishes the need for a centralized company or organization.
Decentralized Guilds
My hope is that computer graphics artists will begin to form guilds on the blockchain to protect their intellectual property, gain access to shared assets, and get paid (fairly) for their contributions. Should this actually work, artists will flock to the networks with the most ethical governance, and thus create global network value.
The governance of these networks will be like the guilds of the medieval era, but instead of being confined to a city or area, they will propagate to wherever the network will reach. (Everywhere on the planet.)
Blockchains are still very much in their infancy. It is still uncertain whether projects like Bitcoin, Ethereum or even Otoy will truly scale, but now that the idea has been set free in our collective consciousness, it is only a matter of time before some form of decentralized network becomes a reality. I’m hopeful the technology will rise to the need.
The surprising effects of open source computer graphics development
Ever Knock Over an Ant Hill?
I’d like to bring up a comedy routine from one of my favorites, Brian Regan.
Do you ever knock over an ant hill? Ever notice how they just start building it again?
You’d think there would be at least one of the ants who’d go:
“OH MAN!!!!! I DON’T BELIEVE THIS!!!!!”
We are that one angry ant, and that’s why it’s funny. We care about the things we build, and we get upset when someone knocks the whole thing down. Ant behavior seems counter to who we are. Instead of a single controlling interest, a collective hive mind just builds, without any drive but the creation of the ant hill itself.
I think this is the perfect analogy to think about open source. Brian Regan is also hilarious.
Open source?
In my day job, I use pretty fancy pieces of software to do computer graphics. These days, it’s mainly Autodesk’s Maya, Adobe’s After Effects, and the super duper Unreal Engine from Epic. I’m amazed at the advances these pieces of software make every year.
However, when a community rallies around a free piece of software, the effects can be even more astounding.
Blender is an open source 3D package and production suite which, for free, allows for the creation of models, rigs, animation, textures, compositing and editing!
Every major part of the animation pipeline has an independent group of developers solving a critical production problem. The community also shares videos about how to build things, provides plug-ins and updates, and contributes to a seemingly infinite number of chat rooms, websites and documentation.
Projects like Blender, the Godot engine, Open Broadcaster Software, and the painting application Krita are part of a growing world of open source computer graphics software. Essentially, a quality graphics pipeline can be created with software that has no license fees.
At its core, an open source project stays independent and free, which allows others to adopt it more readily. When community pain points are discovered, the users themselves can simply take it upon themselves to fix them.
This is key.
See, if I want an update to the Unreal Engine, I have to wait for the developer, Epic, to get around to it. (Here’s the roadmap: https://trello.com/b/TTAVI7Ny/ue4-roadmap) Even if there are hundreds of world class developers working on the problem, because the system is closed there are only a (relatively) small number of people working on it.
I have been informed by an Unreal expert that the above is not true. Unreal provides a semi-open source license that allows non-Epic developers to contribute to the code.
k. Back to the Rambles.
For an open source project that I use, there are usually communities working on the same problem sets I have. The bigger and more active that community becomes, the more powerful the tool becomes. The users aren’t boxed out of the development in order to be monetized. The users (and the knowledge they have) become part of the development process itself.
Below is a visualization of Python. You can see how its development twists and turns with the needs of the community. What closed company development pipeline would ever create a library like this?
Open Source for the Ecosystem
For the time being, the software packages and systems I use in my graphics work are closed. I work in companies, and business models are tied to a mechanism for controlling scarcity. Most software-focused companies will continue to license, use subscriptions, or sell SaaS, because that’s how you make 20th century money.
What I wonder is:
How long will these closed systems be able to maintain their lead on the rest of the pack?
How can a localized graphics pipeline compete with an effectively infinite group of user-developers and an ever increasing collection of models, animation and art? Yes, it’s true that perhaps our graphics ecosystem will be controlled by Epic, or a titan like Amazon or Microsoft Azure.
It may also be possible that people will want a free ecosystem, filled with free software, where the value comes from the singular hive mind that is set on building with it.
Thanks for reading. We’ll see you next week.
Reference and Links:
Software –
Autodesk Maya: http://autodesk.com
Adobe After Effects: http://adobe.com
Epic Unreal Engine: http://unrealengine.com
Blender: http://blender.org
Godot: http://godotengine.org
OBS: https://obsproject.com/
Krita: https://krita.org/en/
Reading –
Yochai Benkler, The Wealth of Networks: http://www.benkler.org/Benkler_Wealth_Of_Networks.pdf
The Agile Manifesto: http://agilemanifesto.org/
Comedy –
Brian Regan Official Site: http://brianregan.com/
And I found his “Ant” routine here: http://inviewmedia.org/index.php/media-gallery/1408-brian-regan-ants-fishing?category_id=12
In short, I devoured it. Genghis Khan was a singular genius.
Never have I enjoyed a strategy book quite like this one, and it inspired me to reflect. Below, I distilled what I saw as Genghis Khan’s Five Rules of Conquest… because one never knows when they will need to efficiently conquer all of Asia.
Rule One: Recruit Ability Not Inheritance
Armies in the 12th century, both tribal and aristocratic, recruited based on family ties. If you were a clan on the steppe, you trusted your brother to join you. In aristocracy, only sons of royalty could run armies. It was pretty tough for an outsider without blood relations to rise in the ranks.
Khan created a conceptual brotherhood out of skilled alliances. Productivity, not blood relation, was the metric. By devising ceremonies where non-related warriors could become “blood brothers,” Khan created a mechanism where he could befriend the best and have them swear devotion.
To secure this ceremonial brotherhood, he developed a revenue share, where the most loyal raiders would get a take from a conquest. If a member died in battle, Khan compensated their widow and child. The brotherhood was effectively a prototype for workers’ comp and life insurance.
As opposed to the leadership of the royal armies, who cared nothing for the rank and file, every Mongolian soldier was part of Genghis Khan’s elite family.
Also of note: aristocrats tended to hold each other captive (and alive) after battles. They gave each other a form of specialized “rich person” treatment. Khan was especially antagonistic to this mentality. He never wore fancy clothes, abstained from anything remotely like royalty, and ruthlessly axed anyone who he felt was bestowed an aristocratic title.
Rule Two: It’s the Loot, Not the Kills
Genghis Khan was a glorified thief turned conqueror. He was in it for the loot. He was not driven by a concept of honor, and he rejected most of the aristocratic ideals of chivalry. If Genghis Khan could have knocked over a city without killing anyone, he would have.
Khan built the profit motive into every strategic decision he made. From the deployment of forces, to the determining of targets, to the distribution of the take, Khan built his pipeline to maximize revenue.
When taking a city, he forbade his armies to loot until complete and total victory was attained. Looting was not haphazard, but a focused and organized activity. When you design a horde for the steal, they naturally become efficient and ruthless killers.
Rule Three: Base 10… and on Horseback
Royal armies generally massed tens of thousands of foot soldiers into enormous rows. These slow-moving armies were difficult to organize and command, and were often filled with unproven, low-quality soldiers. Lords of these armies would recruit “able-bodied” men and demand loyalty and discipline. Many did not want to be there.
Khan ceremonially inducted quality fighters into the brotherhood. Then, he put these elite raiders on horseback. The Mongolian riders were always a fraction of the size of the armies they faced, but on horseback, and with high-caliber skill, they cut through the opposing soldiers like butter.
Khan also organized his army to scale – he used base ten for his entire organization. There were ten soldiers to a unit, ten units within a hundred, and ten of those in a thousand. Khan was playing a real-time strategy game, and he optimized the management of battalions with easy-to-use math.
Instead of rows of soldiers like his adversaries used, he structured his armies into concentric circles of waves. They swarmed, surrounded, and nibbled armies down to size before the final kill. They also traveled impressively light.
The hordes didn’t ride with large support teams or heavy structures in tow. Small units of engineers focused on practical structures like bridges, and catapults were built at the site of combat. The armies camped in the mountains, distributed their communications, and rationed dry meats to cut down on visible campfires. They were lean, agile, fast and incredibly deadly.
Rule Four: Join Us or Die
Genghis Khan didn’t take prisoners. He killed off aristocrats and enemy leaders, but he had respect for a great warrior no matter what camp they came from. Khan offered the defeated a chance to join the ranks, and if they accepted, they were assimilated.
Base ten, while being an efficient system of governance, also allowed new recruits to be mixed into units to learn the new ways. A ten-man unit could split, and each half could ingest and train five new recruits. Eventually, brothers stood by one another regardless of their country, religion, or caste. Hindu, Christian, Indian, Chinese – it didn’t matter what they believed or where they came from.
Rule Five: Legend before Victory
In one particular case, early in Genghis Khan’s ascent to power, he needed a well-guarded city to fall. Despite the skilled warriors he had amassed, he could not confront this city’s army of 20,000 men. It would have been suicide to attempt a direct attack.
Khan laid siege to the surrounding countryside. He burned villages. He would systematically capture a group and execute all but one terrified survivor, left to tell the tale. A sustained campaign sent a singular message to those 20,000 men behind the city walls:
“The Mongols are coming.”
When it came time to attack the city, many of the 20,000 men fled to the hills in terror. Knowing that any man left alive could mount a retaliation, Khan sent his raiders into the hills to execute the fleeing army.
The stories people told of Genghis Khan and the Mongol hordes were a more effective conqueror than the raiders themselves. The legend traveled to Europe and the Middle East long before the Mongols actually arrived for conquest. Fear and propaganda are weapons best used before arrival… and Genghis Khan liked it that way.
a fun summer read
The Roman Empire took 400 years to amass a fraction of the territory that Genghis Khan did in a mere fifteen years. He was a revolutionary conqueror who rewrote the rules of warfare and leadership, based on a fanatical devotion to ruthless efficiency, an eye for talent, and an algorithmic approach to strategy. The book continues the story with his descendants, who pushed the empire on for another 100 years, but they all pale next to the vision and leadership of their father, who changed the world.
The Nytrogen newsletter follows the disruptions happening to the computer graphics industry. Each week, I send out my thoughts on the technology, workflow, and artistry in the evolution of real time animation production.
I hadn’t heard of this virtual being summit until recently, so I was a bit bummed I missed it. It’s a concept that I’ve been thinking about for a while now, and I would have loved to hear what others in the space are working on.
The propagation of voice recognition, chatbots, micro-functions, cloud computing, and the like has enabled an ecosystem for virtual characters to thrive. Yet, we barely have them among us. I think it’s because the design of a virtual character straddles two broad development areas.
I will call each of these development areas “Intent” and “Action.”
Intent
Intent is essentially “chatbot developer speak” for the user’s purpose in interacting with the virtual character.
When I initiate an interaction with a human, or a bot, I generally have a motivation for why I am speaking with them. I want to ask them the weather forecast for tomorrow, or ask for help on my calculus homework, or whether they prefer the Rock to Stone Cold.
In order for a virtual being to react to a user properly, it first has to learn what the user wants. The development of intent detection is a landscape filled with Machine Learning nuts who will show off their fancy computer vision algorithms and face detection classifiers. Intent can also be broken up into context, and given long and short term properties. The subject can get complicated very quickly.
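As a toy illustration of the idea (not any particular product’s API – the intent names and keywords below are all hypothetical), intent detection at its simplest can be a keyword lookup that maps a user’s utterance to an intent label:

```python
# Minimal sketch of intent detection via keyword matching.
# All intent labels and keywords are hypothetical examples; real systems
# would use trained classifiers rather than a lookup table.

INTENT_KEYWORDS = {
    "get_weather": ["weather", "forecast", "rain", "sunny"],
    "homework_help": ["calculus", "homework", "derivative"],
    "small_talk": ["prefer", "rock", "stone cold"],
}

def detect_intent(utterance: str) -> str:
    """Return the first intent whose keywords appear in the utterance."""
    text = utterance.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(word in text for word in keywords):
            return intent
    return "unknown"
```

A production system would also extract entities and track context across turns, but the core contract is the same: utterance in, intent label out.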
Action
On the other side of the system, an action is an event (or function) that is triggered once the intent has been classified by the virtual character. The action is powered by entities, or recognized components in the user’s communication.
In the example above, the action, motivated by intent, would trigger a series of functions that move the character to say “hello.” Actions have as much complexity as Intent. In the short term, these actions will be “canned,” using pre-animated pieces of content. Undoubtedly, these actions will become increasingly generated in real time, allowing the characters to produce the corresponding performance on the fly.
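The intent-to-action bridge can be sketched as a dispatch table (all function and intent names here are hypothetical – in the short term each action might just play a canned, pre-animated clip):

```python
# Hypothetical sketch: map a classified intent to a character action.
# "Canned" actions are stand-ins for triggering pre-animated content.

def play_canned_clip(clip_name: str) -> str:
    """Stand-in for playing a pre-animated piece of content."""
    return f"playing clip: {clip_name}"

# Dispatch table: intent label -> action function taking extracted entities.
ACTIONS = {
    "greet": lambda entities: play_canned_clip("wave_hello"),
    "get_weather": lambda entities: play_canned_clip("look_at_sky"),
}

def perform(intent: str, entities: dict) -> str:
    """Trigger the action registered for an intent, or an idle fallback."""
    action = ACTIONS.get(intent, lambda e: play_canned_clip("idle"))
    return action(entities)
```

Swapping the canned-clip functions for a real-time generation backend would change the table’s values, not the overall shape of the system.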
Bridging the Worlds
Around 2 minutes and 30 seconds into the video at the top of the post, Edward Saatchi (the CEO of Fable) says that there is a division between the AI community and the filmmaker community. This is true, and the split between Intent and Action cleanly maps onto the knowledge chasm that will need to be traversed in order to see the mass propagation of virtual characters. Animators are going to have to learn how to check their code into GitHub, and engineers are going to have to learn to carry a sketchbook. Once we have both sides communicating, the rest should come with time and focus.
Let’s leave the ethics of all of it for another time!
That’s it for this time. Thanks for reading.
Reference and Links:
Fable Studio: https://fable-studio.com/
Oculus Story Studio: https://www.oculus.com/story-studio/
Artists are going to be chaining networks together to do some interesting stuff…
My hope was to write some initial findings on my machine learning experiments this week. But I generated so many thoughts and sketches (30 pages worth) that it became a mass (and mess) of information. In short, I am in a bit of shock.
It’s hard for me to process the reality of what I am seeing as I study machine learning.
I am excited to parse some of the information, but first I felt it necessary to define my excitement about it. I see machine learning models as the opportunity to create the ultimate graphics hack.
Previs Hackers
Previsualization is a bit different from the rest of animation production. I have a previs friend who likes to say that we are “the first into combat”.
When a big budget movie fires up, a previs artist has to figure out what the hell the damn thing is. A good storyboard can fix story problems, but discussions about “how to build it” happen during previsualization. These days, major budgetary and creative conversations are dependent on previsualization.
Therefore, since it’s throwaway work for problem solving, it’s ok to “cheat” in previs. By cheat, I mean change the scale, use zoetrope cards for effects, fake the camera rack focus with a Gaussian blur in After Effects. Large production pipelines blow up when they are dependent on badly scaled, messy assets. Perfection hates rampant creativity, so previs is often pushed to the sidelines.
One of my favorite things about being a previs artist is creative “cheating,” or non-standardized problem solving. The job really gets good when you stumble on these little hacks. I love to take a break with a fellow artist and go for coffee, or a beer, or a walk around the block. If you tell them what you are working on, the good ones will tell you how they would solve the problem.
“You know what you should do?” they might say in the Starbucks line, “I’d render the character on green, and then make an offset layer in comp for their position.”
“Dude, why worry about the ramp while you are blocking the action?” they say while deciding what kind of beans to get in the Chipotle line. “You should animate normally and time remap it.”
I call this “graphics hacking.”
It’s a mindset where it’s ok to cheat, to reuse things, to break them. Anything goes, to get the shot. And the graphics hacking conversations are the ones I live for.
Previs with Networks?
I’ve started to imagine the graphics hacking conversations that artists will be having a few years from now when they use the things I am now discovering. I think the conversations will be incredibly different.
“You know what you should try?” they might say while checking their bitcoin account on their phone. “Maybe train a network to classify and remove all the leaves.”
“You should try scraping the color sets from that aerial photography set,” they might say as they enter the automated Uber, “and then I’d use that new style GAN to make it look washed out.”
I am about halfway through an online class on machine learning. I am about a quarter of the way through a book on the subject. This past weekend, I created my first neural network hack. In a matter of minutes, I generated a cat on Runway ML!
If these networks can do what I think they can do, and if we can get the data right, the ability to make unimaginable things a reality will be pretty insane. It’s an art form based on smashing large portions of data together to yield a fake reality.
Again, I’m reeling. These networks will be the ultimate graphics hack.
I am falling down the rabbit hole. Anyone not taking this technology seriously is in for a shock. I hope to start aggregating my thoughts into work flows in future posts.
As always, please feel free to comment or reach out with thoughts. Thanks for reading, see you next week.
Reference:
I’m having an absolute blast learning from @genekogan: http://genekogan.com/
ITP@NYU / Machine Learning for Artists: https://ml4a.github.io/classes/
I’m primarily in Runway ML: https://runwayml.com/
Towards Data Science – Creating Art with GANs: https://towardsdatascience.com/gangogh-creating-art-with-gans-8d087d8f74a1