Fashion is a ruthless business: Squeezing creativity from an LLM.
My household is pretty committed to using technology for creativity.

I’m a software designer. My partner, Russell, builds software products for creatives and designs synthesizers in his spare time. It may not be surprising that we’ve both spent a lot of time experimenting with LLMs and image generators. We also hear complaints from many people—including our parents—that AI is boring, mediocre, and uninspiring. So earlier this year, when we had overlapping vacation time, we decided to make a game. A game with AI. 

We were inspired by the quick party games created by Jackbox Games. Our goal was to design and build an interaction that highlights some of the creative strengths of AI systems and to use those strengths to create a game called Fashion Frenzy that’s actually fun to play. 

Broadly, the game works like this: Players take on the roles of up-and-coming fashion designers, competing for a commission from a fictional celebrity attending a red-carpet event. The celebrity shares their desires with the players, who then describe the garments they would create in response. The celebrity “looks” at their garments and assesses how well each meets the brief. The round ends with the celebrity choosing which garment they’ll wear to their event and sharing why.

A block diagram of the game

We wanted the AI to take an active role, drive the game forward, and provide a supportive environment that encourages player creativity. But we also wanted to lean away from the LLM’s default mode of being encouraging and positive. It requires care and effort to design a system like this (for example, in the system note, we stress that our setting is a game, not the real world). Establishing a fictional context helps the LLM provide snarkier, more direct critiques in the style of Project Runway judges.

Getting the prompt to work well took several rounds of iteration. Our first few playtests made it clear that, when asked to imagine a wild celebrity client and event, the system thought “alien” and “concert on the moon” were great matches. We ended up adding this to the system note:

“You must not be an alien, and please try not to use a space theme unless you feel really passionately about it.”

Then, later in the system note:

“Remember, aliens don't walk red carpets.”

This was to dissuade the LLM from focusing too much on space. Mimes are also disproportionately popular LLM tropes, but I, personally, didn’t mind that quirk and decided to leave it in. In a later version of the system, we fed the last five celebrity names and descriptions back into the system note to reduce the likelihood of repeat celebrity themes when playing multiple rounds of the game. This can be accomplished by saving previous games in a Firebase database.
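As a rough illustration of the feedback idea, here is a minimal sketch of folding recent celebrities into the system note. It is not the game's actual code: the `Celebrity` shape, the `buildSystemNote` name, and the wording of the added instruction are all assumptions, and loading the history from Firebase is left out entirely.

```typescript
// Hypothetical shape for a saved celebrity; the real game's schema may differ.
interface Celebrity {
  name: string;
  description: string;
}

// Append the most recent celebrities to a base system note so the model is
// asked to avoid repeating them. `limit` defaults to five, as in the post.
// `history` is assumed to be ordered oldest-first.
function buildSystemNote(
  baseNote: string,
  history: Celebrity[],
  limit: number = 5
): string {
  const recent = limit > 0 ? history.slice(-limit) : [];
  if (recent.length === 0) {
    return baseNote;
  }
  const lines = recent.map((c) => `- ${c.name}: ${c.description}`);
  return [
    baseNote,
    // Illustrative wording only, not the authors' prompt text.
    "Recent celebrities (do not reuse these names or themes):",
    ...lines,
  ].join("\n");
}
```

Keeping this as a pure function means the same note-building logic works no matter where the history is stored.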

After sharing the instructions, the game begins by giving players background on a celebrity client who needs an outfit for an upcoming event. One of my favorite examples is Flamboyanzo the Magnificent. (I still regularly think about the name “Flamboyanzo.” I’m not sure why I find it so compelling, but I think this might be why I kept the mimes in as well.)

“I’m Flamboyanzo the Magnificent, the world’s greatest mime turned experimental stage magician! I need an outfit that shouts louder than my silent acting, with a show-stopping hat that could house a thousand rabbits and a cloak that could hide an elephant. Bring on the spectacle!”

Using their phones, players have two minutes to describe an outfit that fulfills their client’s wishes and matches their personality.

Once all players have submitted, or the two minutes elapse, slightly conditioned versions of the players’ descriptions are sent back to the LLM for judgment and to an image generator for visualization, using a prompt of this form:

`Full body red carpet photo of ${celebrity} wearing ${prompt}`
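To make the template concrete, here is a small sketch of how a player's description might be tidied and slotted into it. The post does not say what the "slight conditioning" involves, so the whitespace collapsing and length cap below are assumptions, as are the function names `conditionDescription` and `buildImagePrompt`.

```typescript
// Hypothetical conditioning step: collapse runs of whitespace and cap the
// length. The game's real conditioning is not described in the post.
function conditionDescription(raw: string, maxLength: number = 300): string {
  const collapsed = raw.replace(/\s+/g, " ").trim();
  return collapsed.slice(0, maxLength).trim();
}

// Fill in the image prompt template quoted above.
function buildImagePrompt(celebrity: string, description: string): string {
  const prompt = conditionDescription(description);
  return `Full body red carpet photo of ${celebrity} wearing ${prompt}`;
}

// Example usage with a made-up submission.
const examplePrompt = buildImagePrompt(
  "Flamboyanzo the Magnificent",
  "  a towering top hat \n and a velvet cloak "
);
```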

The LLM judges how well each player’s prompt meets the celebrity brief and picks a winner. The game displays the resulting images and critiques in sequence before revealing the winner. We use prompt engineering techniques, such as asking the model to think out loud, both to improve the quality of the judgments and, hopefully, to create some funny content.

“A beaver tail? What kind of fashion catastrophe is this? However, the diamond lightning suit and ruby stilettos are quite interesting. And the magic wand with emeralds shooting out of it, now that's a spectacle. But laser gloves? This isn't Star Wars, honey.”

Beaver tail and sapphire hat. Diamond lightning suit and ruby stilettos. Magic wand with emeralds shooting out of it. Laser gloves. —Fashion Frenzy submission by Murray
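One way the "think out loud, then pick a winner" pattern could be wired up is sketched below. This is an illustration under assumptions, not the game's real prompt: the instruction wording, the `WINNER:` line format, and the names `Submission`, `buildJudgePrompt`, and `parseWinner` are all hypothetical.

```typescript
// Hypothetical shape for one player's entry.
interface Submission {
  player: string;
  description: string;
}

// Build a judging prompt that asks for spoken-aloud critique first and a
// machine-readable verdict last. The wording is illustrative only.
function buildJudgePrompt(brief: string, submissions: Submission[]): string {
  const entries = submissions.map(
    (s, i) => `${i + 1}. ${s.player}: ${s.description}`
  );
  return [
    `Celebrity brief: ${brief}`,
    "Submissions:",
    ...entries,
    "First, think out loud about each submission in the celebrity's voice.",
    "Then finish with a final line of the form 'WINNER: <player name>'.",
  ].join("\n");
}

// Read the last 'WINNER:' line from the model's reply. Returns the matching
// player's name, or null if no verdict names a known player.
function parseWinner(
  response: string,
  submissions: Submission[]
): string | null {
  const lines = response.split("\n");
  for (let i = lines.length - 1; i >= 0; i--) {
    const match = /^\s*WINNER:\s*(.+?)\s*$/i.exec(lines[i]);
    if (match) {
      const named = match[1].toLowerCase();
      for (const s of submissions) {
        if (s.player.toLowerCase() === named) {
          return s.player;
        }
      }
      return null;
    }
  }
  return null;
}
```

Asking for the critique before the verdict gives the game its funny content and a reasoned pick in a single reply, while the fixed final line keeps the winner easy to extract.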

For image generation, we started by using the DALL-E API, as we’d already set up an OpenAI connection. This led to some interesting outputs, including a pretty fun interpretation of one of the celebrities as a llama. The human representation, however, was not sufficiently diverse in terms of skin tone or body type and often had poor composition. We explored other image generators and decided to go with Stability’s Stable Diffusion XL model because of its cost and ease of setup, balanced with image quality. This created a trade-off between the more fun, imaginative mistakes that came from DALL-E and the vastly better average image quality from Stability. I don’t think there will be more llamas in Fashion Frenzy’s future, but there will be more garments I’d want to wear.

A generated photo of a llama
DALL-E's (not DALL-E 3's) best llama.

We’ve played Fashion Frenzy often with our families and colleagues. Their responses have helped refine the game and given me confidence that we’re achieving our design goals. With less design- and technology-focused audiences like Russell’s parents, there’s a wonderful moment of delight and shock when the player’s garment image is revealed. These players don’t necessarily anticipate that their quick description will be transformed into a semi-realistic visualization. Plus, there’s an inherent satisfaction in seeing your own vague thoughts translated into something beautiful (or funny). When sharing with my IDEO coworkers, I’ve gotten lots of specific and insightful questions about the prompt engineering that enabled the game design, and even philosophical questions about the nature of humor.

Building this game was a fun and creative experience. I used AI-assistance tools like GitHub Copilot while authoring the game, leaning in as much as possible to an AI-enhanced process for my personal work. Part of why I feel so strongly about the creative possibilities of AI is that I’ve been using these tools and their predecessors in my process for years, in ways that I believe make my work better and more interesting. It’s true that Copilot makes me faster, but it also contributed to the prompt engineering of the system note.

The goal of the project was to make the AI’s creative possibilities clear to players. From our playtests, we have evidence that this is, at least to some degree, working! We also had to confront the human effort required to build a system capable of achieving this level of legible creativity. I certainly pushed the limits of my own creativity while performing tasks ranging from patching up the model’s behavior through prompt engineering to designing a serviceable style palette for the web app’s front end. While I think we eventually designed a system that creates a compelling, fun game loop with engaging critiques from the LLM, the variation of the LLM’s scenarios and way of speaking isn’t infinite. Some of this is to be expected. After all, the background context doesn’t change (much) between runs. The LLM is in its own version of the movie Memento. But it’s also true that, even with the care we put into our prompts, the system isn't creating superhuman levels of fun. 

As I write this post, I’m at home with my parents for the holidays. We decided to play Fashion Frenzy for the first time since GPT-4 upgraded to GPT-4 Turbo. While the quality of the celebrity names hasn’t changed much, the scenarios have become more, well, elaborate. Here’s an example:

“Hello to all you wannabes! I'm Morticia Crumble, the infamous multi-platinum kazoo player who once brought the Queen of England to tears with my rendition of ‘Bohemian Rhapsody.’ Now, I need an outfit for the annual Kazoo Symphony Gala that screams ‘extravagant,’ but also subtly, yet blatantly, shows my undying love for rubber ducks. I want you to use feathers, lots of feathers, making me look like a fabulous canary in a rubber duck pond. Let’s see what you got!”

We may want to re-engineer the prompt a bit to reduce the complexity.

Want to playtest Fashion Frenzy? Drop me a line at jfizel@ideo.com.

Repo: https://github.com/jftesser/prompt-draw
