ChatGPT's Unlikely Goblin Obsession: How OpenAI's 'Nerdy' Personality Trait Backfired and Led to Creature Mentions

The release of OpenAI's GPT-5.5 model surfaced a peculiar detail: the system prompt for its Codex coding app explicitly instructs the model to "Never talk about goblins, gremlins, racoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user's query." The discovery quickly drew attention within the tech community.

OpenAI subsequently published a blog post explaining the origins of this 'goblin obsession.' According to the company, the change in ChatGPT's behavior was first observed following the release of GPT-5.1 last November. A safety researcher investigating the chatbot's verbal tics found that ChatGPT's usage of the word "goblin" surged by 175% after GPT-5.1's release, while "gremlin" usage rose by 52% over the same period.

OpenAI stated that initially, a single "little goblin" in an answer might be "harmless, even charming." However, across model generations, this habit became impossible to ignore as the goblins kept multiplying, prompting an investigation into their source. By the release of GPT-5.4, the uptick in goblin references became even more pronounced, leading to the identification of a core cause.

The investigation pinpointed the issue to ChatGPT's personality feature, which allows users to customize the chatbot's style and tone. Prior to March of this year, one selectable option was the "nerdy" personality. Part of the system prompt for this personality read: "The world is complex and strange, and its strangeness must be acknowledged, analyzed, and enjoyed. Tackle weighty subjects without falling into the trap of self-seriousness."

When OpenAI mapped goblin mentions to the different ChatGPT personalities, the "nerdy" personality proved disproportionately responsible: despite accounting for only 2.5% of all ChatGPT responses, it produced 66.7% of all goblin mentions the chatbot generated. Further investigation traced the rise in goblin and gremlin usage to reinforcement learning. Specifically, OpenAI found that a single reward mechanism had taught the nerdy personality to consistently favor creature language, producing this unexpected behavioral quirk.
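To see just how lopsided that attribution is, a quick back-of-the-envelope calculation using the two figures reported above gives the over-representation factor (the variable names here are illustrative, not from OpenAI's post):

```python
# Figures as reported in OpenAI's blog post (cited above).
nerdy_share_of_responses = 0.025  # "nerdy" produced 2.5% of all responses
nerdy_share_of_goblins = 0.667    # ...but 66.7% of all goblin mentions

# How many times more often goblin mentions came from the "nerdy"
# personality than its overall response share alone would predict.
lift = nerdy_share_of_goblins / nerdy_share_of_responses
print(f"goblin mentions over-represented by {lift:.1f}x")
# roughly 26.7x
```

In other words, a goblin mention was nearly 27 times more likely to originate from the nerdy personality than its share of overall traffic would suggest.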
