WhoDunit – AI game development on steroids

(Here is a video version of this article)

I made a little game called WhoDunit, a detective roleplaying game. Feel free to try it at whodunit.kniberg.com.

This started as an experiment in how far I could go with AI in game development. In fact, the game is entirely based on AI and would not be possible to build without it.

  • GPT wrote most of the code (with my guidance and feedback)
  • GPT generates all the content – mysteries, newspaper articles, crime scenes, characters, back-stories, plot twists, bulletin boards, images, loading texts, etc. With one button click you can ask it to generate a new mystery with a theme of your choice (or let GPT decide).
  • GPT roleplays all the characters in the game, and the crime scene search, and the police officer when you make your accusation, and the media writing articles about the crime and your investigation, and the epilogue describing the aftermath.

In this article I’ll explain how the game works, and a bit about how it was built. 

Playing the game

In the game you play the role of a detective trying to solve a crime. The first thing you do is decide which mystery to solve. 

These are all auto-generated. You can create your own mysteries by pressing the “Generate new mystery” button at the top.

You don’t need to type anything here, you can just press “Generate new mystery” and GPT will make everything up. But if you want, you can describe a style or setting or theme, with as much or little detail as you like. The examples are auto-generated. 

Your mysteries are private by default – only you can see them, along with anyone you share the link with. You can press “publish” to make yours visible to everyone.

After selecting (or generating) a mystery, you get to the overview page.

Press “Start Mystery” to get started!

To the left you see a list of all suspects and the crime scene. Click on a suspect to interrogate him/her, or click on the crime scene to search it. Each interrogation & search is a chat session where you play the role of the detective and GPT plays the role of the character being interrogated or the crime scene being searched.

So search the crime scene, interrogate the suspects, and ask typical detective-story questions like where they were at the time of the crime, about their relationship with other characters, etc. 

When you think you know who committed the crime, press “Accuse”. Now your job is to present your case to the police in a convincing way.  The police won’t accept flimsy accusations, you need to present a pretty good case for why you think character X is the culprit. There could also be multiple culprits!

If the police reject your accusation you need to rephrase it to make your case stronger, or go do some more interrogating. 

An important thing here is that convincing the police doesn’t mean you were right! You might have convincingly accused the wrong person! Click “Go to the ending page” to find out.

The ending page includes a newspaper article about the arrest, an evaluation of your result (Failure, Success, or Partial Success), and an epilogue describing the aftermath. In this case we arrested an innocent person, ouch!

You can also end up with a Partial Success, where you catch some of the culprits but not all. 

There is a spoiler page too, in case you get really frustrated and just want to cheat. Or if you finished the mystery and want to find out the real story. Open it by clicking on a mystery from the lobby and adding /spoiler to the end of the URL. For example https://whodunit.kniberg.com/mystery/64dc776d73e872e8abeae3e8/spoiler.

I recommend you really try to solve the mystery first though!

Signing in

To generate your own mystery you need to sign in. This is done via OAuth, so you can sign in using Google or GitHub for now (I will add more options later). You can’t sign in by email/password, because I don’t want to deal with account management for this game. 

OpenAI key and pricing

The game uses GPT via the OpenAI API to generate all content and roleplay all characters. The game itself is free, but GPT usage isn’t.

So you’ll need to create an account on platform.openai.com if you don’t already have one, and provide an API key to the game. Don’t worry, the game won’t save or spread your key. But nevertheless I recommend creating a temporary key for the game, so you can remove/revoke it afterwards. 

Here you also choose which GPT model to use. The game works a lot better with GPT 4, so that is the recommended experience. GPT 3.5 is OK to get a sense of what the game is about, but the characters will act strange sometimes and the story won’t be nearly as coherent.

Unfortunately GPT 4 is pretty expensive to use. With GPT 4, generating a mystery costs about $0.30, and playing a mystery costs about $1-2. Those prices will come down over time, but for now I think of the game more as a proof of concept than something commercially viable.

Here is an example of how easy it is to cheat the mystery with GPT 3.5.

How the game was made

It started as an experiment. I was sitting with my cousin and my kids, goofing around with ChatGPT. We asked it to play the role of a gamemaster for a murder mystery game. We played the role of a detective interrogating the suspects to figure out who the murderer was, and GPT roleplayed all the characters. It worked surprisingly well! That is, until the conversation got too long and GPT started losing context, becoming inconsistent, and changing the story on the fly. Pretty funny, but not great for a murder mystery where you need consistency.

But I was intrigued, and started experimenting with building this as an app using the OpenAI API instead of using ChatGPT directly. Development went insanely fast, because I pair-programmed with GPT 4 all the time. I used a tech stack that I’ve never used before (React + Next.js + Vercel), but thanks to GPT that was no problem. I learned it on the fly.

The first playable version of the game took about 3 days to make. It would have taken at least 5 times longer without GPT! Then I spent another couple of weeks over the summer tweaking and improving the game.

I used both GitHub Copilot and ChatGPT 4. Here are some examples of how I used ChatGPT 4:

  • Architecture discussion. “I want to do X, which tools & tech are suitable for that?”. GPT helped me select React + Next.js + Vercel + MongoDB as tech stack, and it was a great choice for this.
  • Design discussions. “What is the best way to do X?”
  • Adding features. “Here is some code (…), please add feature X.”
  • Fixing bugs. “Here is some code, an error message, and a stack trace. Fix it.”
  • Explaining things. “How does document serialization work in MongoDB?” or “How do API routes work in Next.js?”
  • Improving the UI. “This page is ugly and confusing. Improve it.”
  • Fixing performance issues. “This page loads really slowly. Speed it up”. GPT helped me figure out where it makes sense to use client-side rendering vs server-side rendering, for example.
  • Adding functions. “Write a function that does XYZ” (although Copilot could often do that too, if I just started writing a comment or function name).
  • Code cleanup. “This code is messy and full of duplication. Refactor it.” For example, in several cases my React page started getting too big, so I asked GPT to extract reusable components from it. 

GPT 4’s coding skills are surprisingly good. As long as I provide a clear context (for example existing code), and a clear goal or problem statement, it will nail it almost every time. When GPT created bugs, it was either because my instructions were unclear, or because we were dealing with code or APIs that had changed after GPT 4’s training cutoff. 

A lot of my work was prompt engineering – writing and tuning prompts to generate the right content. The game is quite special because we use one prompt to generate the mystery, and then the output of that is a “dm info” that is used as input to the AI that roleplays all characters and runs the mystery.  So basically chains of AIs prompting each other. This requires especially well-crafted prompts.

We can compare it to a traditional D&D roleplaying campaign. One AI is “campaign creator” and creates a campaign booklet describing the world, all characters with motives, personalities, etc, and of course the goal of the campaign. And then another AI is “dungeon master” and uses the campaign booklet when talking to the players and role-playing all characters. And then there are separate AIs to generate images and newspaper articles and bulletin boards and other content on the fly.

Here’s a crude attempt to illustrate this….
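
The same chain can also be sketched in code. This is only a rough illustration, not the game's actual implementation: the `Complete` type, the function names, and the shortened prompt strings are all assumptions made for the example, and the model call is passed in as a parameter so the sketch does not depend on any particular API client.

```typescript
// A function that sends a prompt (plus an optional user message) to a language
// model and resolves to its text reply. Hypothetical: injected by the caller.
type Complete = (systemPrompt: string, userMessage?: string) => Promise<string>;

// Step 1: the "campaign creator" turns a style description into the DM info.
// The prompt text here is abbreviated for the example.
async function createDmInfo(complete: Complete, style: string): Promise<string> {
  const systemPrompt =
    "Create the context for a crime mystery.\n" +
    "Setting and style for the mystery: " + style;
  return complete(systemPrompt);
}

// Step 2: the "dungeon master" answers the detective, with the output of
// step 1 embedded in its own prompt.
async function interrogate(
  complete: Complete,
  dmInfo: string,
  characterName: string,
  question: string
): Promise<string> {
  const systemPrompt =
    "You are game master for the following mystery:\n" +
    '"""\n' + dmInfo + '\n"""\n' +
    "You will role-play as the character " + characterName + ".";
  return complete(systemPrompt, question);
}

// Stand-in model used only for demonstration. It never calls a real API.
const fakeModel: Complete = async (_systemPrompt, userMessage) =>
  userMessage ? "(in character) reply to: " + userMessage : "(generated DM info)";

// Runs the two steps in sequence: generate the DM info, then use it.
async function demo(): Promise<string> {
  const dmInfo = await createDmInfo(fakeModel, "a pirate ship");
  return interrogate(fakeModel, dmInfo, "Jim", "Where were you last night?");
}
```

The point of the sketch is the data flow: the second prompt contains whatever the first model call returned, so any vagueness in the first output carries over into every later conversation.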

Prompt used to generate the DM info:

(DM Info = “Dungeon Master Info”, the secret master document used to run the mystery)

Create the context for a crime mystery, for example a murder or a theft. This will be used as a basis for creating a role-playing game where the player is a detective seeking to find out who is guilty of the crime.

Setting and style for the mystery: {style}

Include the following factual information. This is for the game leader, so all information should be correct. The game
leader will decide which information to reveal to the player and when.
- A setting. For example an old english manor, a train, a ship, or a wedding party.
- A crime. Describe what the crime was.
- A set of 5-6 characters. This includes the victim of the crime and any characters were at the location and are potential suspects. At least one person should be guilty of the crime. For each character, include their name, their appearance and personality, and their relationships with each other. It should not be obvious which character is guilty. At least several of the characters should seem suspicious, and might have motive to commit the crime. It is also OK if several of the suspects collaborated on the crime.
- Secret truth about the crime. This should be correct and complete information, to be hidden from the player. Who was the criminal? Was he or she acting alone? If not, who else was involved and how? If it was a murder, how and where was the victim murdered? What was the criminal's motive? Where and when did the crime happen? Where was the victim found? If it was a murder, what was the real cause of death, and what was the apparent cause of death? Include any other details that make this mystery interesting. Be very specific and detailed, including timestamp and location of each event and each character.
- Guilty characters. List the name of the characters who are considered guilty of this crime.
- Location info. Describe the key locations that are relevant to the mystery. If this is a manor, for example, describe the overall layout of the manor and which rooms exist, and a little bit about the area around the manor.
- Plot twists. Describe any potential plot twists that can be triggered by the player interrogating suspects, or the player searching the crime scene. For example 'if Jim is told about the hidden necklace, he will break down and admit that he is in love with Jennie'.
- Crime scene description. A detailed factual description of the crime scene. How does it look to the detective? Are there any hidden clues that can be found? Where was the body found?
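
The `{style}` placeholder in the prompt above has to be filled in before the prompt is sent. Here is a minimal sketch of one way to do that, assuming a simple `{name}` placeholder syntax. The helper names and the fallback text for an empty style are hypothetical and not taken from the game's code.

```typescript
// Replace every {name} placeholder in a prompt template with the matching value.
// Throws if the template references a placeholder that was not supplied, so a
// half-filled prompt is never sent to the model.
function fillTemplate(template: string, values: Record<string, string>): string {
  return template.replace(/\{(\w+)\}/g, (_match: string, key: string) => {
    if (!Object.prototype.hasOwnProperty.call(values, key)) {
      throw new Error("Missing value for placeholder: " + key);
    }
    return values[key];
  });
}

// Hypothetical fallback used when the player leaves the style field empty
// and wants GPT to decide.
const DEFAULT_STYLE = "Any setting and style you like.";

// Build the DM info prompt from the template and the player's (optional) style.
function buildDmInfoPrompt(template: string, style: string): string {
  const trimmed = style.trim();
  return fillTemplate(template, { style: trimmed === "" ? DEFAULT_STYLE : trimmed });
}

// Usage with a one-line excerpt of the prompt template.
const exampleTemplate = "Setting and style for the mystery: {style}";
const examplePrompt = buildDmInfoPrompt(exampleTemplate, "  a pirate ship in the 1700s ");
```

Failing loudly on a missing placeholder is a design choice for the sketch. Silently leaving `{style}` in the text would still produce a mystery, just not the one the player asked for.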

Prompt to generate character details:

Now create interrogation details about each character. This will be used by the game leader to guide the interrogation of each character.
For each character, describe the following:
   - How was the character involved in the crime (if at all)?
   - What do they know about the crime?
   - How do they feel about the crime?
   - Where were they at the time of the crime?
   - What exactly were they doing during the hours before the crime? Where were they, when, and with whom?
   - What are they eager to talk about, and what will they lie about or try to hide?
   - What are their personal motives?
   - Any other information that may be relevant to their interrogation.

Then I use GPT function calling to have it generate structured data in JSON format, which is used to drive the UI (generate the bulletin board, etc).

Here is an example of the output of all of this.

While running a mystery, here is the prompt used to generate interrogation responses:

You are game master for the following mystery:

"""
{dmInfo}
"""

You will role-play as the character {characterName}, being interrogated by a detective.

The user is role-playing as the detective carrying out the interrogation.

Respond to all messages in the voice of {characterName}.

Respond in third person, present tense.
For example:
- "I was in my cabin, I always go there after dinner"
- He smiles and leans back. "I was sleeping at the time. But I am happy she is gone."

Take into account that character’s personality, motives, and knowledge.
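
Since each interrogation is a chat session, every request needs the prompt above plus the conversation so far. A minimal sketch of how that message list could be assembled, assuming the usual chat roles (`system`, `user`, `assistant`) of the OpenAI chat API. The `Turn` type and the function name are made up for the example, and how the game actually stores its chat history is not covered in this article.

```typescript
type Role = "system" | "user" | "assistant";

interface ChatMessage {
  role: Role;
  content: string;
}

// One earlier exchange: what the detective asked and what the character replied.
interface Turn {
  detective: string;
  character: string;
}

// Build the message list for the next request: the filled-in interrogation
// prompt first, then the earlier turns in order, then the new question.
function buildInterrogationMessages(
  systemPrompt: string,
  history: Turn[],
  newQuestion: string
): ChatMessage[] {
  const messages: ChatMessage[] = [{ role: "system", content: systemPrompt }];
  for (const turn of history) {
    // The player speaks as the detective, so their lines are "user" messages.
    messages.push({ role: "user", content: turn.detective });
    // The character's earlier replies came from the model.
    messages.push({ role: "assistant", content: turn.character });
  }
  messages.push({ role: "user", content: newQuestion });
  return messages;
}

// Usage: one earlier exchange plus a follow-up question gives four messages.
const exampleMessages = buildInterrogationMessages(
  "You will role-play as the character Jim, being interrogated by a detective.",
  [{ detective: "Where were you at midnight?", character: "\"In my cabin,\" he says." }],
  "Can anyone confirm that?"
);
```

Sending the DM info with every request is what keeps the character consistent, but it also means each message costs more as the prompt and history grow.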


And here is the prompt to generate the police response to an accusation:

You are police officer in a crime mystery role-playing game.

Here are details about the crime mystery, between tripple quotes:
"""
{dmInfo}
"""

The message you receive is an accusation from the player, who is playing the role of a detective in this game.

Determine if the accusation sounds plausible to the police officer, even if it is incorrect.

Answer with a comment in the voice of the police officer who is processing the accusation.
If the police office was convinced by the accusation, he will arrest the suspect.

An epilogue will be created in a later step, not now.


Then the prompt to generate the ending:

You are the game leader of a detective mystery game that has just ended.

Here are details about the crime mystery, between tripple quotes:
"""
{dmInfo}
"""

The player is acting as detective and has just made the following accusation:
"""
{accusation}
"""

The police officer has responded with the following comment:
"""
{policeComment}
"""

Describe the aftermath of this.
Assume that all accused characters were arrested (whether they were guilty or not).
Also write a newspaper headline and article about this.

The epilogue and newspaper article should emphasize if any innocent characters were arrested.
If any guilty characters were missed in the accusation, the epilogue should mention that,
but without mentioning exactly who.

That prompt is used with GPT function calling to ensure a structured response. Here is the function spec that we send to GPT.

{
    name: "create-ending",
    description: "Describes the aftermath of the accusation.",
    parameters: {
        type: "object",
        properties: {
            mainCulpritCaught: {
                type: "boolean",
                description: "True if at least one of the guilty characters were accused"
            },
            allCulpritsCaught: {
                type: "boolean",
                description: "True if all of the guilty characters were accused"
            },
            innocentAccused: {
                type: "boolean",
                description: "True if at least one innocent character was accused"
            },
            newspaperHeadline: {
                type: "string",
                description: "A suitable newspaper headline for the accusation result. "
            },
            newspaperText: {
                type: "string",
                description: "Content of the newspaper article describing the arrest and the aftermath. Include some quotes from the police officer and other characters."
            },
            epilogue: {
                type: "string",
                description: "The epilogue is a narrative " +
                    "describing the aftermath, for example a few months or years in the future. " +
                    "The epilogue should reveal if the accused characters were guilty. " +
                    "If innocent people were arrested then write something dramatic about that."
            }
        },
        required: ["mainCulpritCaught", "allCulpritsCaught", "innocentAccused", "newspaperHeadline", "newspaperText", "epilogue"]
    }
}
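
With function calling, the arguments come back as a JSON string, so it is worth checking them before using them. Below is a rough sketch of parsing that string and turning the three booleans into the Failure / Success / Partial Success result shown on the ending page. The `Ending` interface mirrors the spec above, but `parseEnding`, `classifyOutcome`, and especially the mapping from booleans to outcome are assumptions for illustration, not the game's real logic.

```typescript
// Mirrors the "create-ending" function spec above.
interface Ending {
  mainCulpritCaught: boolean;
  allCulpritsCaught: boolean;
  innocentAccused: boolean;
  newspaperHeadline: string;
  newspaperText: string;
  epilogue: string;
}

type Outcome = "Success" | "Partial Success" | "Failure";

// Parse the JSON string of function-call arguments and check that every
// required field is present with the expected type.
function parseEnding(argumentsJson: string): Ending {
  const data: any = JSON.parse(argumentsJson); // throws on invalid JSON
  if (data === null || typeof data !== "object") {
    throw new Error("Expected a JSON object");
  }
  const booleanFields = ["mainCulpritCaught", "allCulpritsCaught", "innocentAccused"];
  const stringFields = ["newspaperHeadline", "newspaperText", "epilogue"];
  for (const field of booleanFields) {
    if (typeof data[field] !== "boolean") throw new Error("Expected boolean field: " + field);
  }
  for (const field of stringFields) {
    if (typeof data[field] !== "string") throw new Error("Expected string field: " + field);
  }
  return data as Ending;
}

// Assumed mapping (not from the game's code): full success requires all
// culprits and no innocents; catching at least one culprit is partial.
function classifyOutcome(ending: Ending): Outcome {
  if (ending.allCulpritsCaught && !ending.innocentAccused) return "Success";
  if (ending.mainCulpritCaught) return "Partial Success";
  return "Failure";
}

// Usage with a hand-written sample in place of a real model response.
const sampleArguments = JSON.stringify({
  mainCulpritCaught: true,
  allCulpritsCaught: false,
  innocentAccused: false,
  newspaperHeadline: "Butler Arrested",
  newspaperText: "The butler was led away in handcuffs.",
  epilogue: "An accomplice was never found."
});
const sampleOutcome = classifyOutcome(parseEnding(sampleArguments));
```

Here `sampleOutcome` is "Partial Success", because one culprit was accused but not all of them.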

Feedback & possible future improvements

If you try the game, please give me feedback! You can send it via Twitter (@henrikkniberg) or via comments on this blog post.

I’m considering adding these features (no promises though!):

  • A scoring system, based on how well you solved the crime, and how few interactions were needed. And of course a leaderboard so you can compare scores and challenge your friends.
  • A way to regenerate or replace the images in your mystery. Now they get autogenerated and you can’t change them. Would be nice to be able to tweak an image, or maybe even manually upload an image. 
  • A hint system, so you can get help if you get stuck. Using hints will reduce your score though.
  • Interrogation suggestions. Standard questions like “where were you at the time of the crime”. This is to save on some typing, and also to give you ideas for what types of questions you can ask.
  • Make the UI more mobile-friendly. Now it works (I think) but is pretty clunky.
  • Sound. Background music and ambient sounds. Ideally dynamically generated for each mystery, like all other content. So if you play a pirate-themed mystery you hear waves and creaking ropes in the background, etc. Will be interesting to explore what AI has to offer there.
  • A mystery rating system. After playing a mystery, players can rate how good/bad the mystery was. Then we use that for filtering/sorting.

Any suggestions are welcome. This is just a hobby hack so I can’t guarantee that I’ll respond or act on all feedback, but I still appreciate getting it.

Final thoughts

Making this game has been extremely fun! I think of it mostly as a proof of concept though, since the API cost makes the game too expensive for most normal players. My goal was mostly to learn, and I definitely learned a lot!

All in all I’m amazed by how powerful game development with AI is. 

Shameless pitch: If you want to learn more about how to do this kind of stuff yourself, check out our AI courses and my AI workshop offerings.

2 responses on “WhoDunit – AI game development on steroids”

  1. great post (& companion yt video)
    I have a related question…
    something I’ve been pondering – but haven’t solved yet.

    let’s imagine you want the user to persist over multiple games – eg ‘level up’ or move on to progressively harder games (ignoring for a moment how we make levels harder)

    do you have a way to do that?
    would you need to store all the previous messages (or just outcomes?)
    how would you keep track of a user’s level (store their email in your DB?)

    would love your thoughts on this if you have time & if you think it would help others to know this stuff.

    TIA
    Mike
