Discovering Disco Diffusion and Prompt Design

DALL·E 2 is a generative image model by OpenAI. It’s a transformative step in computer graphics.

DALL·E 2 is in its early invite stage, which means that (as of July 2022) I don’t have access. (Hurry up, OpenAI, please?) Driven by a need to understand these models, I soon stumbled upon a model called “Disco Diffusion.” While not as powerful as DALL·E, Disco Diffusion is indicative of the future of generative media, and it’s enormously fun to play with as well. Everything here was created with this Disco Diffusion Google Colab notebook and a free account.

https://colab.research.google.com/github/alembics/disco-diffusion/blob/main/Disco_Diffusion.ipynb

My first animation test using Warp in Disco Diffusion

AI Comics

As with most things, I start by clearing my creative blocks with comics. So, to find my footing, I thought about trying to get an AI to generate its own comics.

Since the joke of a comic is somewhat separate from the art, I “prompted” new sentences with the AI-powered InferKit. This led me to iterate on the writing until I felt the AI had the joke it wanted to tell.

I took the response from InferKit and fed it into a freely available model called Craiyon (previously DALL·E mini), using other tags for cartoonists and comics to shape the image.

text_prompts = {
    0: [
        # Subject
        "--- Output copied from InferKit sentence",
        # Description
        "two character", "single panel", "cross hatching",
        # Artist
        "gary larson", "new yorker",
    ]
}
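
A prompt list like this is easy to break by dropping a comma between sections, because Python silently joins adjacent string literals into one string. A minimal sketch of one way around that is to keep each section in its own list and combine them in code. The helper name `build_text_prompts` is hypothetical and not part of Disco Diffusion; the only thing assumed about the notebook is the `text_prompts` shape shown above, a dict keyed by frame number.

```python
def build_text_prompts(subject, description, artist, frame=0):
    """Combine prompt sections into a dict keyed by frame number.

    Hypothetical helper for illustration; not part of the notebook.
    """
    sections = [[subject], description, artist]
    # Flatten in order: subject first, then description, then artist.
    prompts = [phrase for section in sections for phrase in section]
    return {frame: prompts}


text_prompts = build_text_prompts(
    subject="--- Output copied from InferKit sentence",
    description=["two character", "single panel", "cross hatching"],
    artist=["gary larson", "new yorker"],
)
print(text_prompts)
```

Keeping the sections as separate arguments also makes it simple to swap out just the artist tags while leaving the joke itself untouched.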

The first few felt weird. However, liking the possibility, I experimented with a “publishable” version, merged in Photoshop. It felt more authentic and gave structure to the result.

I then tried the same exercise with Disco Diffusion. While the output turned out more smushy, the options for experimentation within the Colab notebook were numerous, and the project being open source inspired a deeper dive. Soon, I think I had the knack of it.

AI Watercolor Paintings

While working, I set up another instance to do some landscape watercolors. I wandered through prompts for different places, like Savannah or Newport, but didn’t see much evidence of the locations. I grew bored of it pretty quickly.

AI SciFi Concept Art

I decided to move toward science fiction concept art, for two reasons. First, why not? Second, I discovered that Disco Diffusion has a model trained on a PulpSciFi data set, which I switched to for a bit.

I experimented with key words like “Ralph McQuarrie” and “Star Wars.”

Looking at other artists who are playing with the model online, I discovered the practice of adding “trending on artstation” to a prompt. It helped the results, but it made me feel a little ill about the artists who are actually trending on ArtStation. It also started to lead me toward a recognizable paradigm, and I felt my prompts being pulled into “cyberpunk” and “mech.”

AI SciFi Space Photography

I then found a really interesting breakthrough: someone who had started listing camera lenses in their prompts.

This inspired me to take the more conceptual ideas I was working with and make them more photographic. I chose prompts that focused on realism, with references like “NASA” or “Star Trek.” I also set detailed instructions for camera lenses, like “27mm” and “tilt shift.”

AI Alien Mech Technology Photography

But eventually I wanted to create something… alien, I guess.

I began to organize my prompts into sections and to tweak individual variables to get closer to what I was looking for.

I experimented with a number of lens and lighting variants, and even tried tilt shift. The best results, according to me, are above.
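
One way to keep experiments like these organized is to hold the subject lines fixed and generate the lens and lighting combinations in code instead of retyping them. This is only a sketch under stated assumptions: the subject lines and lens values are taken from this post, while the lighting phrases and the function name `prompt_variants` are hypothetical placeholders.

```python
from itertools import product

# Fixed subject and style lines (the base prompt used in this post).
BASE = [
    "Mech Suit with military grade weapon systems in an alien world",
    "Sci fi, Iron Man, technology, attack",
    "photography",
]

LENSES = ["27mm", "tilt shift"]               # values mentioned in the post
LIGHTING = ["golden hour", "hard rim light"]  # hypothetical examples


def prompt_variants(base, lenses, lighting):
    """Yield one prompt list per (lens, lighting) combination."""
    for lens, light in product(lenses, lighting):
        yield base + [lens, light]


variants = list(prompt_variants(BASE, LENSES, LIGHTING))
for i, prompts in enumerate(variants):
    # Each variant could be pasted into the notebook as {0: prompts}.
    print(i, prompts[-2:])
```

Because only the last two entries change between variants, it is easier to tell which phrase was responsible when one image comes out better than another.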

Most of the imagery you see came from what is becoming a bit of a “base prompt” for me:

    "Mech Suit with military grade weapon systems in an alien world",
    "Sci fi, Iron Man, technology, attack",
    "photography",
    "27mm",

AI Animation? AI Shorts?

I’ve only just begun to explore!

If you want to see some extraordinary work, you should consider joining the DD Discord.

https://discord.gg/fzevz8Z4

Mind Blown

The implications of the democratization of AI art will be extraordinary. The architecture of CLIP and the diffusion models, the design process of prompt engineering, and the intellectual property implications are probably more than this blog post can cover.

As I wrap my head around how to use these tools, I’ll post updates here.

As always, thanks for reading. Happy prompting!
