At the 2026 China AIGC Industry Summit, Chao Huang, an assistant professor at the University of Hong Kong (HKU), discussed AI Agent architecture and infrastructure. Huang argued that in the AI-native era, the industry should stop forcing Agents to adapt to human-centric digital interfaces and instead focus on "redesigning the digital world for Agents."
The first step in his team's practice was extreme simplification. In response to complex frameworks like OpenClaw, which launched with 430,000 lines of code, Huang's team open-sourced nanobot, an ultra-lightweight general-purpose Agent framework built around minimalism. The project kept up a 100-day streak of daily iterations and surpassed 200,000 downloads. It was recommended by DeepSeek as one of the top 15 global Agents and ranked fourth on the OpenRouter general Agent leaderboard. Next, the team aims to use nanobot to tackle long-horizon tasks that require cross-ecosystem execution and heterogeneous tool orchestration in dynamic production environments.
To transition Agents from "AI assistants" to "digital workforces," Huang's team proposed CLI-Anything. Huang argues that the Command Line Interface (CLI), rather than the Graphical User Interface (GUI), is the "native tongue" of Agents. Instead of forcing AI to navigate human GUIs at a high visual processing cost, CLI-Anything wraps professional tools, such as 3D modeling and multimedia editing software, in CLIs. This reshapes the Computer Use paradigm by adapting the digital world to speak AI's language.
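The idea of wrapping a tool in a CLI can be illustrated with a minimal Python sketch. It is not CLI-Anything's actual implementation: `run_cli`, `CommandResult`, and `scale_object` are hypothetical names, and a tiny Python subprocess stands in for a real professional tool so that the example runs anywhere. What it shows is the general shape: the Agent issues a structured command and gets back an exit code and machine-readable output instead of pixels.

```python
from __future__ import annotations

import json
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class CommandResult:
    """Structured outcome of one command, easy for an Agent to parse."""
    ok: bool
    exit_code: int
    stdout: str
    stderr: str


def run_cli(argv: list[str], timeout: float = 30.0) -> CommandResult:
    """Run a command and report success or failure without raising."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
    except (FileNotFoundError, subprocess.TimeoutExpired) as exc:
        # A missing binary or a hang becomes a normal, inspectable result.
        return CommandResult(False, -1, "", str(exc))
    return CommandResult(proc.returncode == 0, proc.returncode,
                         proc.stdout.strip(), proc.stderr.strip())


def scale_object(name: str, factor: float) -> CommandResult:
    """Hypothetical wrapped operation; a Python one-liner stands in for the tool."""
    script = (
        "import json, sys; "
        "print(json.dumps({'object': sys.argv[1], 'scale': float(sys.argv[2])}))"
    )
    return run_cli([sys.executable, "-c", script, name, str(factor)])


if __name__ == "__main__":
    result = scale_object("cube", 2.0)
    print(result.ok, json.loads(result.stdout))
```

In this sketch the Agent never inspects a screen: it reads `ok`, `exit_code`, and a JSON payload, which is the kind of compact, semantic feedback the CLI argument relies on.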
The team also investigated Agent self-evolution, contrasting "Internal" core optimization with "External" skill accumulation. Huang advocates the External path, which scales an Agent's utility by expanding its tool ecosystem. To validate this, the team ran an automated scientific experiment in which 8 Agents successfully coordinated 8 H100 GPUs for distributed training. Although the setup proved highly efficient, the experiment also showed that expanding Agent Swarms yields diminishing marginal returns beyond a critical threshold, because coordination overhead surges.
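One way to build intuition for the diminishing returns is a toy model, sketched below in Python. The model is an assumption made for illustration, not something reported from the experiment: each Agent contributes a fixed amount of useful work, while coordination cost is charged once per pair of Agents. The function names and the `coord_cost` value are hypothetical.

```python
def swarm_throughput(n_agents: int, per_agent: float = 1.0,
                     coord_cost: float = 0.05) -> float:
    """Toy model: linear useful work minus a cost for every pair of Agents."""
    pairs = n_agents * (n_agents - 1) / 2
    return n_agents * per_agent - coord_cost * pairs


def marginal_gain(n_agents: int, per_agent: float = 1.0,
                  coord_cost: float = 0.05) -> float:
    """Extra throughput obtained by adding one more Agent to a swarm of n_agents."""
    return (swarm_throughput(n_agents + 1, per_agent, coord_cost)
            - swarm_throughput(n_agents, per_agent, coord_cost))


if __name__ == "__main__":
    for n in (1, 8, 16, 24):
        print(n, round(swarm_throughput(n), 2), round(marginal_gain(n), 2))
```

Under these assumptions, each added Agent contributes less than the previous one, and past a certain swarm size the marginal gain turns negative. Where that threshold actually lies depends on the real coordination mechanism, which this sketch does not model.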
Ultimately, Huang stressed that Agent design boils down to the elegant ReAct (Reasoning and Acting) loop of reasoning, action, and observation. Robust Agents must learn from real failures, optimize token efficiency, and handle errors through graceful degradation. In his view, CLI is a far more cost-effective and precise interface for Computer Use than GUI.
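The loop and the idea of graceful degradation can be sketched in a few lines of Python. This is a simplified illustration rather than nanobot's code: `react_loop`, `scripted_policy`, `fast_sum`, and `safe_sum` are hypothetical names, and a scripted policy stands in for the language model so that the example is deterministic.

```python
from __future__ import annotations

from typing import Callable

Step = tuple[str, str, str]           # (thought, action, observation)
Policy = Callable[[str, list[Step]], tuple[str, str, str]]


def react_loop(policy: Policy, tools: dict[str, Callable[[str], str]],
               task: str, max_steps: int = 5) -> str:
    """Alternate reasoning, action, and observation until the policy finishes."""
    history: list[Step] = []
    for _ in range(max_steps):
        thought, action, arg = policy(task, history)
        if action == "finish":
            return arg
        tool = tools.get(action)
        if tool is None:
            observation = f"error: unknown tool {action!r}"
        else:
            try:
                observation = tool(arg)
            except Exception as exc:
                # Degrade gracefully: the failure becomes an observation
                # the policy can react to, instead of crashing the run.
                observation = f"error: {exc}"
        history.append((thought, f"{action}({arg})", observation))
    return "gave up: step budget exhausted"


def fast_sum(arg: str) -> str:
    raise RuntimeError("backend unavailable")    # simulated tool failure


def safe_sum(arg: str) -> str:
    return str(sum(int(x) for x in arg.split(",")))


def scripted_policy(task: str, history: list[Step]) -> tuple[str, str, str]:
    """Stand-in for a model: try the fast tool, fall back after an error."""
    if not history:
        return ("Try the fast tool first.", "fast_sum", task)
    last_observation = history[-1][2]
    if last_observation.startswith("error"):
        return ("The fast tool failed; fall back.", "safe_sum", task)
    return ("I have the answer.", "finish", last_observation)


if __name__ == "__main__":
    tools = {"fast_sum": fast_sum, "safe_sum": safe_sum}
    print(react_loop(scripted_policy, tools, "1,2,3"))
```

The step budget (`max_steps`) is one simple way to bound token spend in such a loop; a real Agent would typically also compress or truncate the history it feeds back to the model.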
[AgentUpdate Depth Analysis] Professor Huang’s "CLI-Anything" paradigm addresses a critical bottleneck in the current "Computer Use" landscape. Dominant vision-based GUI approaches, such as Anthropic’s Computer Use API, suffer from high token consumption, latency, and instability when UI layouts change. Shifting to CLI-based interfaces replaces complex visual recognition with structured, semantic commands. This drastically reduces the computational footprint and makes execution far more deterministic. In the long run, this signals a major shift in software development: SaaS applications will evolve from being purely "human-centric" to "Agent-first." Lightweight, machine-readable interfaces, built as baseline infrastructure, will be the catalyst that moves AI Agents from conversational copilots to fully autonomous digital workers.