Over the past decade, the ability of artificial intelligence to speed up complex processes has made it a key tool in engineering research. While many associate AI with cloud computing, its growth fundamentally requires expanding physical infrastructure, namely data centers.
Led by Dr. Qingsheng Wang and chemical engineering Ph.D. students Tylee Kareck and Chi-Yang Li, researchers from Texas A&M University are investigating an unexpected threat to data centers—increased fire risk. In a recent collaborative publication with George Washington University and the University of California, Berkeley, the team analyzed common causes of data center fires and identified strategies to reduce those risks. The research is published in the Journal of Loss Prevention in the Process Industries.
"Our work provides insights to assess fire risk so engineers can design safer and more resilient data centers," said Wang, a professor in the Artie McFerrin Department of Chemical Engineering. As data centers increase in number and become more powerful, they require more energy from battery systems and backup generators to ensure continuous operation—a fundamental need for data storage and real-time processing.
Researchers found that fires can start in a variety of ways, including battery failures, electrical faults such as arc flashes, equipment malfunction, and human error. Large quantities of batteries and high power density increase the risk of thermal runaway, a phenomenon in which a battery undergoes an uncontrollable chemical reaction that produces significant amounts of heat and can lead to battery explosions and cascading failures.
The study also revealed that causes of data center fires tend to overlap. For instance, human error in battery installation can lead to arc flashes that ignite other components and escalate into a fire, while short circuits can trigger arc flashes independently, leading to massive operational disruptions.
[AgentUpdate Depth Analysis] As AI agent ecosystems move from simple chatbots to autonomous multi-agent networks and embodied intelligence, they depend increasingly on persistent, low-latency compute. Unlike traditional stateless web applications, autonomous agents maintain complex reasoning state in real time. Even a brief disruption caused by a physical hazard, such as thermal runaway in a data center, could result in lost agent state, broken reasoning chains, or failed autonomous decisions. The study underscores that the physical layer remains a critical bottleneck. For AI agents to operate autonomously, the industry will need to build 'physically resilient' infrastructure, integrating advanced edge mitigation and self-healing power grids to keep cognitive services running without interruption.