Upside Down
AI research in the form of a comic using Midjourney, Diffusion Bee, Astria and Dalle-2
Role
Screenwriter
Storyboarder
Art Director
AI Prompt Engineer
Designer
Duration
12 Weeks
Tools
Midjourney
Astria
Dalle-2
Diffusion Bee
Photoshop
Figma
Team
Solo
Overview
The goal of this project was to create a comic book using AI tools in tandem with my own skills, and to develop a better understanding of how to incorporate AI into a design workflow.
The Problem
Given the daunting potential of AI generators, how might AI realistically be incorporated into a workflow to help convey an idea and speed up the design process?
The Solution
To develop a stance on AI tools and a general understanding of their capabilities as well as their limitations.
The Story
The inspiration for this story was largely pulled from my own experience as a woman working in a male-dominated industry. Being a minority can be a very isolating experience, yet you wouldn’t know it unless you have lived it. AI’s limitless capabilities were an opportunity for me to create a world that flips the usual narrative, in hopes of inspiring empathy and understanding among readers.
Research
“Robots are taking over” and “I’m going to lose my job” are some of the many sentiments you hear at the dawn of artificial intelligence. People are skeptical and wary of AI tools—and for good reason. Not only can they feel like a mockery of creatives who have spent countless hours honing their craft, but the risk of copyright infringement and eyebrow-raising uses of people’s data can make them seem daunting and scary. As with any new technology, though, they can be a double-edged sword.
“We're just seeing the beginning of AI technologies and generative tools. I think we're about to enter a creative renaissance of unlimited potential.”
—Erik Fadiman, Design Professor
Polls
To see what my peers thought about AI, I posted a poll on my Instagram story. Out of 75 respondents, 33% said they were skeptical but open to learning more, while the largest share indicated they were outright dismissive of AI or conflicted about how they felt. Only a sliver was completely open to using generative tools.
AI Tools
I wanted to have a clear understanding of what kinds of generative tools were out there, how they worked and what their capabilities were. The main ones I turned to were Astria, Midjourney, Diffusion-Bee and Dalle-2.
Astria
Text-to-image generator that can create unique models based on photos provided by the user.
Midjourney
Text-to-image generator accessed via Discord, with very high-quality visual outputs.
Diffusion-Bee
Text-to-image Stable Diffusion model that can edit images with in-painting and out-painting tools.
OpenAI
Company behind not only text-to-image models like Dalle-2, but also text-to-text generators like the well-known ChatGPT.
Modeling in Astria Round I
The Astria AI generator is able to create unique models from a set of images that the user provides. As a test run, I trained Astria to create a model based on my features by giving it 30 images of me doing various activities, such as playing the saxophone, hula hooping and posing for the camera.
Modeling in Astria Round II
Although these images accurately depicted my face with only a few minor discrepancies, the way I interacted with my surroundings was highly unrealistic. My next challenge was to train Astria to accurately depict a woman playing the saxophone. I made sure to include clear images of women playing the tenor saxophone in real-life settings, shot from a variety of angles and camera perspectives.
Inputs
Outputs
Diffusion Bee
Some AI programs, such as Diffusion-Bee, are capable of filling in the gaps and expanding an image beyond its original borders, a technique known as “out-painting.” Starting with this image of me created in Astria, I wanted to expand the picture so it included an engaging background.
1
Original image generated in Astria and submitted to Diffusion-Bee.
2
Placing the image in Diffusion-Bee and “painting” with squares containing text prompts that indicate what the background should look like. When placing a square, I always made sure to include part of the existing image to give the generator context.
3
I cropped out the parts that didn’t quite fit and kept the background sections that looked convincing, resulting in an engaging background.
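Under the hood, out-painting tools like Diffusion-Bee work from two inputs: an enlarged canvas with the original image pasted in, and a mask telling the model which pixels to preserve and which to paint. A minimal sketch of that canvas/mask preparation in Python with Pillow (the function name and sizes are my own illustration, not Diffusion-Bee’s API):

```python
from PIL import Image

def make_outpaint_inputs(src, new_w, new_h, offset=(0, 0)):
    """Build the two inputs an out-painting model needs:
    an enlarged canvas with the original pasted in, and a mask
    that is white (255) where new pixels should be generated
    and black (0) where the original must be preserved."""
    canvas = Image.new("RGB", (new_w, new_h), "black")
    canvas.paste(src, offset)

    mask = Image.new("L", (new_w, new_h), 255)
    # Protect the region occupied by the original image.
    mask.paste(0, (offset[0], offset[1],
                   offset[0] + src.width, offset[1] + src.height))
    return canvas, mask

# e.g. widen a 512x512 portrait into a 1024x512 scene,
# keeping the subject anchored on the left:
# canvas, mask = make_outpaint_inputs(portrait, 1024, 512)
```

Stable Diffusion in-painting pipelines take exactly this image-plus-mask pair; Diffusion-Bee simply wraps the same idea in a GUI where the prompt squares define the masked regions.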
The Process
With AI, there is no one-size-fits-all solution. The learning process involved a lot of experimentation, trial and error, and improvisation. This is where my experience as a jazz musician came into play.
Story Development with ChatGPT
I used the text-to-text generator ChatGPT as a tool to get a general outline of my story. Although ChatGPT gave me a workable starting point, it lacked proper points of tension and narrative momentum.
Pixar Story Spine
I iterated on my story by following the “Pixar Story Spine” method. The story spine is essentially a series of written prompts that give a story momentum and proper points of tension and release across the hero’s journey. Where my story didn’t quite fit the spine, I filled in the gaps with my own ideas. This served as a practical way to combine AI tools with my own experience and skill sets in a real-world setting.
Once upon a time...
there was a Saxophonist named Jack.
And every day...
he played and practiced his instrument.
Until one day...
his colleague Skye undermines his debut performance at a prestigious jazz club.
And because of that...
he throws his saxophone in the trash.
And because of that...
he isolates with his camels.
And because of THAT...
he grows bitter and resentful.
Until finally...
his grandmother tells him a parable that changes his perspective.
And ever since then...
he continued to play saxophone and never let other people get the better of him.
Image Style
While figuring out how I wanted my comic to look style-wise, I began to improvise with my text prompts in Midjourney. Each style was unique in its own way, but I decided that a concept art style would convey my story best, with its ability to be hyper-realistic yet painterly and semi-stylized.
Art-Deco Dystopian Style
Paper Cut-Out
Mono Screenprint
Concept Art
Ralph McQuarrie
After researching concept artists, one name came up over and over again: Ralph McQuarrie, the concept artist behind Star Wars and E.T. Although I felt conflicted about creating work in the style of another artist, I was able to create more vivid and convincing artwork when I used a known artist as a takeoff point. Every image generation in the comic explicitly acknowledges the influence of Ralph McQuarrie.
Character Conditioning
Developing consistent characters was another challenge unique to AI. Although I used the same descriptive set of words for each character, the results from Midjourney all came out differently. After doing some research, however, I was able to train Midjourney to produce consistent characters through a process called character conditioning.
Step 1
Enter Prompt
I would ask Midjourney to create a character based on a unique name and description:
“Jack Reeves, man, dirty blond hair, anxious demeanor, blue jeans, in the style of Ralph McQuarrie, concept art, painting, comic book.”
Step 2
Upscale and React
I would ask Midjourney to upscale the image that looked most similar to my prompt, then react with a heart emoji, which tells Midjourney to keep making images similar to the one I reacted to.
Step 3
Request Seed
A seed is a unique number that serves as a digital fingerprint for each image. Similar to the heart reaction, if I wanted the seed number for my favorite image, I responded with the envelope emoji. This tells Midjourney to send me that image’s seed so I can use its pixels as the foundation for the next image.
Step 4
Copy and Paste
I copied the seed number from Midjourney’s reply and pasted it into my next prompt, so that each new generation was built on the pixels of my favorite image.
Step 5
Repeat
Steps 1 through 4 were repeated until I had a character I was happy with. The seed and character description were then saved as an “option set” with the value --Jack. This meant that whenever I wanted to recreate the character, all I had to type for his description was --Jack.
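The conditioning loop above boils down to a handful of Discord commands. A sketch of the sequence (the seed number is a placeholder, and the command syntax follows Midjourney’s Discord interface as I used it, so details may have changed since):

```
/imagine prompt: Jack Reeves, man, dirty blond hair, anxious demeanor,
  blue jeans, in the style of Ralph McQuarrie, concept art, painting,
  comic book

  (react with the heart emoji to steer future images toward a favorite;
   react with the envelope emoji to have the bot DM you its seed number)

/prefer option set option:jack value: dirty blond hair, anxious demeanor,
  blue jeans, in the style of Ralph McQuarrie, concept art --seed 1234

/imagine prompt: Jack practicing saxophone at dawn --jack
```

The saved option expands back into the full description plus the seed, which is what keeps the character’s look stable across new scenes.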
Final Characters
Each character took about 6-8 sessions of conditioning in Midjourney. Each one has a final seed number as well as a custom option set.
Design System
I used Figma to create components and a system of parts for my comic book. This included what I call “vector expressions”, speech bubbles and page layouts that included margins and bleed.
Flipbook
Key Takeaways
-
Never underestimate how long a comic can take, even with AI generators.
-
Planning ahead saves a lot of time down the road.
-
Even though AI is doing some major heavy lifting, you still need to know what you’re doing!
-
AI is a powerful tool that can be used to speed up aspects of the design process and help convey a concept or idea.
-
Although I was able to more or less create generations that matched my storyboard, I had to be open to improvisation and to settling for images that didn't completely match my initial ideas.
Next Steps
The next steps for this project are to show people that new technology doesn’t have to be black and white. There is a workable gray area: we can incorporate these tools into everyday practice to speed up our workflows and get better results faster.