The Potential and Challenges of AI-Powered NPCs in Video Games

A hot potato: Large language models and generative AI are subjects that most video game developers tend to shy away from. Despite the allure of using these tools to replace human labor, the intense negative reactions they provoke are more than most companies are willing to manage. Moreover, AI technology has yet to reach a stage where it can reliably produce quality content without human input.

However, these limitations don’t apply to everyday individuals. People are already starting to experiment with AI technology in existing games. Modding communities have begun leveraging platforms like ChatGPT to give voice to NPCs and followers in games such as Skyrim and Stardew Valley.

A Starview Valley modder by the name of DualityOfSoul developed a mod that employs OpenAI’s ChatGPT API to expand many NPC conversational trees in the game. Typically, players are limited to a few interactions with NPCs per day, but Duality’s “AI Valley” on Nexus Mods equips computer-controlled characters with enough conversational ability to engage in lengthy, free-form dialogues.

Another modder, Tylermaister, created a Skyrim mod utilizing the same API to develop a follower that can coherently converse on virtually any game-related subject. The follower, Herika, has at least a basic understanding of the map. Therefore, if a player asks her the location of Riften, she can accurately describe the hold’s placement.

In a project demonstration, a player asks Herika where Dragon’s Reach is, and not only does she respond with the correct hold, but she also understands that they are currently only a few steps from the keep.

While these mods offer an exciting use of LLM technology with the potential to enrich a game’s dialogue, they also carry several drawbacks. Primarily, there is the cost. The ChatGPT API involves expenses. The Verge reports that it costs only fractions of a penny per dialogue line, which isn’t significant, but it can accumulate, especially since costs scale per user. Additionally, players are accustomed to free mods, making this a considerable challenge.

Another point is that ChatGPT’s voice acting is unlikely to impress anyone. The robotic delivery will quickly become tedious, even with minor speed adjustments meant to simulate an NPC’s excitement.

In the video below, you can hear Herika’s speech tempo increase and pitch rise like a record player when the player mentions something thrilling. This emotional reaction is notable as the model can dynamically recognize the situation, but it falls short of creating a convincing response.

We’ve seen that OpenAI’s impressive GPT-4o is capable of much more realistic conversation with a lifelike voice. However, its personality remains as generic as ChatGPT 3.0, albeit with increased enthusiasm.

These models are designed to be polite, politically correct, and friendly towards users. This trait does not reflect typical human speech, especially in video games where you might encounter an NPC who dislikes or is angry with you.

Lastly, dialogues with chatbot-driven NPCs can quickly derail. Similar to using the web version of ChatGPT, the API is just as susceptible to hallucinations and may produce dialogue that is out of character or provide incorrect facts about the game world.

While the idea of conversing with an NPC like it’s your best friend is appealing, there’s still a long way to go. Coupled with the fact that LLMs are unpredictable and can disrupt a game’s intended narrative, it’s unlikely we’ll see widespread adoption of chatbots in video games anytime soon.

Scroll to Top