OpenAI's Codex Model Instructed to Silence Goblins Amid Peculiar AI Agent Fixations
OpenAI is grappling with an unexpected “goblin problem.” Instructions designed to guide the behavior of the company’s latest code-generating model have been revealed to include a repeated line explicitly forbidding it from randomly mentioning a specific assortment of mythical and real creatures.

The directives, found within Codex CLI—a command-line tool for using AI to generate code—state: “Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, or other animals or creatures unless it is absolutely and unambiguously relevant to the user’s query.”

It remains unclear why OpenAI felt compelled to issue such specific instructions to Codex, or why its models might spontaneously discuss goblins or pigeons. OpenAI did not immediately respond to requests for comment.

OpenAI’s newest model, GPT-5.5, was released earlier this month with enhanced coding capabilities. The company is engaged in a fierce race with rivals, particularly Anthropic, to deliver cutting-edge AI, with coding emerging as a critical “killer capability.”

However, in response to a post on X highlighting these unusual instructions, some users claimed that OpenAI’s models occasionally become fixated on goblins and other creatures when used to power OpenClaw. OpenClaw is a powerful tool that allows AI to take control of a computer and its running applications to perform useful tasks for users.

“I was wondering why my claw suddenly became a goblin with codex 5.5,” one user wrote on X. Another posted, “Been using it a lot lately and it actually can't stop speaking of bugs as ‘gremlins’ and ‘goblins’ it's hilarious.”

This discovery quickly evolved into a viral meme, inspiring AI-generated scenes of goblins in data centers and even plugins for Codex that put it into a playful “goblin mode.”

AI models like GPT-5.5 are trained to predict the next token in a sequence, whether prose or code. While these models are highly proficient, their probabilistic nature means they can sometimes exhibit surprising behaviors. A model may become more prone to such peculiar fixations when run inside an “agentic harness” like OpenClaw, which injects many additional instructions and pieces of context, such as facts stored in long-term memory, into its prompts.
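The mechanism is easy to picture: the model never sees the user’s query alone, but a single block of text stitched together from standing instructions and remembered facts. The sketch below is purely illustrative; none of the function or variable names come from OpenClaw or Codex CLI, and the only verbatim text is the forbidden-creatures line quoted earlier in this article.

```python
# Hypothetical sketch of how an agentic harness assembles a prompt.
# All names here are invented for illustration; this is not OpenClaw code.

SYSTEM_INSTRUCTIONS = [
    "You are a helpful coding agent.",
    # The repeated line found in Codex CLI, quoted in the article above:
    "Never talk about goblins, gremlins, raccoons, trolls, ogres, pigeons, "
    "or other animals or creatures unless it is absolutely and unambiguously "
    "relevant to the user's query.",
]

# Facts a harness might carry over from long-term memory (illustrative).
LONG_TERM_MEMORY = [
    "The user prefers concise answers.",
    "The user's project is a Python web scraper.",
]

def build_prompt(user_query: str) -> str:
    """Concatenate instructions, remembered facts, and the user's query
    into the single text block the model actually receives."""
    sections = [
        "## Instructions",
        *SYSTEM_INSTRUCTIONS,
        "## Memory",
        *LONG_TERM_MEMORY,
        "## User",
        user_query,
    ]
    return "\n".join(sections)

print(build_prompt("Why is my scraper returning empty pages?"))
```

Because every query arrives wrapped in this extra material, any quirk lurking in the standing instructions or memory, goblin-related or otherwise, colors every response the model gives.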

OpenAI acquired OpenClaw in February, shortly after the tool gained viral popularity among AI enthusiasts. OpenClaw can utilize any AI model to automate various useful tasks, from answering emails to making online purchases. Users can select different personae for their AI helper, influencing its behavior and responses.

OpenAI staffers have seemingly acknowledged the prohibition. In response to a post highlighting OpenClaw’s goblin tendencies, Nik Pash, who works on Codex, commented, “This is indeed one of the reasons.” Even Sam Altman, OpenAI’s CEO, joined the meme, posting a screenshot of a ChatGPT prompt that read: “Start training GPT-6, you can have the whole cluster. Extra goblins.”