Large language models (LLMs) often exhibit valuable self-monitoring signals: they can estimate their likelihood of success before tackling a problem, and judge the correctness of their output afterward. However, these latent metacognitive signals have historically been treated as passive outputs rather than active inference-control mechanisms. In short, LLMs 'know when they know,' but lack the system-level agency to act on this knowledge during reasoning.
To bridge this gap, a recent study introduces a 'metacognitive harness' inspired by the Nelson-Narens model of metacognition from cognitive psychology, which separates monitoring from control. The framework decouples self-monitoring from the core reasoning process. For each problem, the model first generates a pre-solve 'feeling-of-knowing' (FOK) signal, an estimate of how likely it is to succeed, and then, after answering, a post-solve 'judgment-of-learning' (JOL) signal, an estimate of whether the answer it produced is correct. Rather than leaving these signals as passive outputs, the harness turns them into an explicit control interface for dynamic reasoning.
In practice, the harness orchestrates the inference flow by making three decisions: when to trust the current solution, when to trigger a retry guided by compact metacognitive feedback, and when to aggregate multiple attempts via a final consensus mechanism. This enables test-time scaling without any base-model parameter updates or benchmark-specific fine-tuning.
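The trust / retry / aggregate decisions described above can be sketched as a small control loop. This is a minimal illustration under stated assumptions, not the study's implementation: the callables `estimate_fok` and `solve`, the `trust_threshold` and `max_attempts` parameters, the feedback string format, and the JOL-weighted vote used as the consensus step are all hypothetical choices made for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Attempt:
    answer: str
    jol: float  # post-solve confidence in [0, 1] (assumed scale)


def run_harness(
    problem: str,
    estimate_fok: Callable[[str], float],            # hypothetical: pre-solve FOK in [0, 1]
    solve: Callable[[str, str], Tuple[str, float]],  # hypothetical: (problem, feedback) -> (answer, JOL)
    trust_threshold: float = 0.8,                    # assumed cutoff, not from the source
    max_attempts: int = 3,                           # assumed retry budget, not from the source
) -> str:
    """Trust a confident answer, otherwise retry with feedback, then aggregate."""
    fok = estimate_fok(problem)
    attempts: List[Attempt] = []
    feedback = ""  # the first attempt runs without metacognitive feedback

    for _ in range(max_attempts):
        answer, jol = solve(problem, feedback)
        attempts.append(Attempt(answer, jol))

        # Decision 1: trust the current solution when post-solve confidence is high.
        if jol >= trust_threshold:
            return answer

        # Decision 2: retry, passing along a compact summary of the signals.
        feedback = (
            f"Pre-solve FOK was {fok:.2f}; previous answer {answer!r} "
            f"received JOL {jol:.2f}. Reconsider the approach."
        )

    # Decision 3: no attempt was trusted, so aggregate all attempts.
    # A JOL-weighted vote stands in for the consensus mechanism here.
    votes: Counter = Counter()
    for attempt in attempts:
        votes[attempt.answer] += attempt.jol
    return votes.most_common(1)[0][0]
```

In a real deployment, `estimate_fok` and `solve` would wrap calls to the fixed base model; here they are left as plain callables so the control logic can be exercised with stubs. The base model itself is never updated, which matches the claim that the gains come from inference-time control alone.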
The harness was applied to a fixed Claude Sonnet-4.6 base model and evaluated on text, code, and multimodal reasoning benchmarks. It raised pooled accuracy from 48.3% to 56.9%, a gain of 8.6 percentage points, and outperformed the strongest listed leaderboard entries on three benchmarks: HLE-Verified, LiveCodeBench v6, and R-Bench-V. These results suggest that LLMs already possess rich latent metacognitive abilities but need an external harness to put them to use.
[AgentUpdate Depth Analysis] Traditional LLM self-correction often suffers from 'hallucinatory loops' because the model acts as both executor and evaluator within the same reasoning context. By adopting the Nelson-Narens cognitive model, the metacognitive harness decouples the 'monitoring plane' from the 'execution plane.' Compared with existing agentic patterns such as simple loop reflection or multi-agent debate, the FOK/JOL dual-signal control acts like a lightweight cognitive OS kernel: FOK enables proactive 'fail-fast' behavior before any solving effort is spent, and JOL enables targeted retries afterward. For the broader AI agent ecosystem, this is a step toward systematic test-time scaling. It shifts the emphasis from fragile prompt-based reflection to explicit cognitive control protocols, and offers a blueprint for how future agents could manage their own computation budgets in high-stakes environments.
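To make the 'fail-fast' and budget-management idea concrete, here is one way a pre-solve FOK score could be mapped to an attempt budget. This is purely an illustrative policy: the function name `attempt_budget`, both thresholds, and the budget values are assumptions for this sketch and do not come from the study.

```python
def attempt_budget(
    fok: float,
    max_attempts: int = 4,         # assumed ceiling on attempts
    fail_fast_below: float = 0.1,  # assumed cutoff for giving up early
    confident_above: float = 0.8,  # assumed cutoff for a single attempt
) -> int:
    """Map a pre-solve feeling-of-knowing score in [0, 1] to a number of attempts."""
    if not 0.0 <= fok <= 1.0:
        raise ValueError("fok must be in [0, 1]")
    if fok < fail_fast_below:
        # Fail fast: the model expects to fail, so spend nothing here
        # and let the caller skip the problem or escalate it.
        return 0
    if fok >= confident_above:
        # High expected success: one attempt is likely enough.
        return 1
    # Uncertain middle range: this is where extra attempts are most useful.
    return max_attempts
```

Under this policy, compute is concentrated on problems the model considers uncertain but tractable, which is one plausible reading of how an agent might manage its own budget.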