Issue 16 | Exploring Multi-Agent Architecture: Flat vs Hierarchical
Should an AI Content Agency adopt a flat discussion system or a top-down pyramid reporting structure?
Welcome back to the "LangGraph Multi-Agent Masterclass". I am your instructor.
Over the past 15 episodes, our "AI Content Agency" has grown from scratch and now boasts four capable key players: Planner, Researcher, Writer, and Editor. Previously, we had them complete viral articles one after another through a simple Linear Flow relay.
However, class, real-world business is rarely this idealized. Yesterday, the agency received a massive order: "Please write an in-depth industry analysis on Apple's newly released Vision Pro, simultaneously output a Xiaohongshu (RED) promotional copy, and provide corresponding Twitter promotional threads."
If we still use the previous "relay race" model, the Researcher finishes gathering data and throws it to the Writer, and the Writer is instantly confused: "Should I write the in-depth analysis first, or the Xiaohongshu post?" The entire workflow completely stalls.
When business complexity rises exponentially, the linear flow is doomed to fail, and refactoring the multi-agent architecture can no longer be postponed. Today, we will deeply explore and implement the two core schools of multi-agent architecture: Flat and Hierarchical. We will introduce a true "manager" mechanism to the agency, transforming your Agent team from a "ragtag crew" into a "regular army".
Pay attention and focus, it's about to get brain-burning!
🎯 Learning Objectives for this Episode
- Cognitive Upgrade (Theory): Deeply understand the underlying logic and applicable scenarios of Flat (flat network) and Hierarchical (hierarchical pyramid) architectures.
- Architecture Refactoring (Practice): Master the standard design pattern of introducing a Supervisor (manager/router) node in LangGraph.
- Practical Implementation (Application): Use Python + LangGraph to refactor our Content Agency, implementing a complex workflow coordinated and dispatched by the Planner, where everyone performs their own duties.
- State Management (Depth): Solve the Context Bloat problem caused by frequent multi-agent interactions.
📖 Principle Analysis: Ragtag Crew vs. Modern Enterprise
In a Multi-Agent System (MAS), the collaborative topology between Agents determines the system's upper limit. Let's look at the two most classic architectures.
1. Flat Architecture (Peer-to-Peer Architecture)
Imagine a startup: the boss, product manager, and developers sit together. When a problem arises, everyone discusses it all at once in a group chat.
- Mechanism: Agents can communicate directly with each other, or speak freely through a shared Blackboard.
- Pros: Extremely low communication costs and high flexibility. Suitable for brainstorming, murder mystery games, debates, and other scenarios requiring divergent thinking.
- Cons: Loss of control. When tasks are complex, it is extremely easy to fall into an infinite loop of "bickering" (e.g., the Writer feels the data is insufficient and sends it back to the Researcher; the Researcher feels the Writer doesn't understand the technology and sends it back again).
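To make the Flat mechanism concrete, here is a minimal plain-Python sketch of the shared Blackboard idea. The `Blackboard` class and agent functions are illustrative stand-ins of my own, not a LangGraph API:

```python
# Minimal sketch of a Flat architecture: agents share a "blackboard"
# and react to each other's posts directly, with no central router.

class Blackboard:
    """Shared memory every agent can read and write."""
    def __init__(self):
        self.posts = []  # (author, content) tuples, visible to everyone

    def post(self, author: str, content: str):
        self.posts.append((author, content))

def researcher(board: Blackboard):
    board.post("Researcher", "Fact: Vision Pro uses micro-OLED displays.")

def writer(board: Blackboard):
    # The Writer reads the Researcher's post directly -- peer to peer.
    facts = [c for a, c in board.posts if a == "Researcher"]
    board.post("Writer", f"Draft based on {len(facts)} fact(s).")

board = Blackboard()
researcher(board)
writer(board)
# Note: nothing here stops Writer and Researcher from bouncing work
# back and forth forever -- exactly the "bickering" failure mode above.
```

The absence of any gatekeeper is what makes this topology both cheap and dangerous.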
2. Hierarchical Architecture (Supervisor Architecture)
As the company grows, a "bureaucracy" must be introduced. We need a foreman (Supervisor).
- Mechanism: Establish a centralized Supervisor (which is the Planner in our project). All Worker Agents (Researcher, Writer, Editor) cannot communicate directly with each other. After a Worker completes a task, they must report the results to the Supervisor, who decides whether the next step is to hand it over to another Worker or to finish the task.
- Pros: Orderly and logically rigorous, highly suitable for goal-oriented complex task decomposition.
- Cons: The Supervisor becomes a performance Bottleneck. If the manager is a "fool" (poorly written Prompt or weak LLM capabilities), the entire team is paralyzed.
For our AI Content Agency, facing multi-channel, multi-format content generation demands, Hierarchical is the only antidote.
The diagram below intuitively illustrates the architectural direction we are refactoring today:
graph TD
subgraph "❌ Past Pain Points: Flat/Linear Architecture (Ragtag Crew)"
R1[Researcher] -->|Data| W1[Writer]
W1 -->|Draft| E1[Editor]
E1 -->|Send back for revision?| W1
style R1 fill:#ffcccc,stroke:#333,stroke-width:2px
style W1 fill:#ffcccc,stroke:#333,stroke-width:2px
style E1 fill:#ffcccc,stroke:#333,stroke-width:2px
end
subgraph "✅ Today's Refactoring: Hierarchical Architecture (Regular Army)"
User((Client Request)) --> S{"Planner<br/>(Supervisor)"}
S -->|1. Assign Research Task| R2[Researcher]
R2 -.->|2. Submit Report| S
S -->|3. Assign Writing Task| W2[Writer]
W2 -.->|4. Submit Draft| S
S -->|5. Assign Editing Task| E2[Editor]
E2 -.->|6. Submit Final Draft| S
S -->|7. All Tasks Completed| FINISH(((Deliver Output)))
style S fill:#ff9900,stroke:#333,stroke-width:4px,color:#fff
style R2 fill:#cce5ff,stroke:#333,stroke-width:2px
style W2 fill:#cce5ff,stroke:#333,stroke-width:2px
style E2 fill:#cce5ff,stroke:#333,stroke-width:2px
style FINISH fill:#ccffcc,stroke:#333,stroke-width:2px
end

Make sense? In the new architecture, the Planner becomes the Routing Hub of the entire Graph. It doesn't do the dirty work; it only does two things: Think and Delegate.
💻 Practical Code Drill
Without further ado, let's jump straight into the code. We will use LangGraph's StateGraph to build this hierarchical network. To ensure stable output of routing decisions, we will arm our Supervisor with OpenAI's Structured Output feature.
1. Environment Setup and State Definition
First, we need to define the "shared ledger" of the entire agency: AgencyState.
import operator
from typing import Annotated, Sequence, TypedDict, List
from langchain_core.messages import BaseMessage, HumanMessage, AIMessage
from langchain_openai import ChatOpenAI
from pydantic import BaseModel, Field
from langgraph.graph import StateGraph, START, END
# 1. Define the system state (The State of the Agency)
# Inherits from TypedDict; operator.add appends each node's new messages to the shared history
class AgencyState(TypedDict):
# Records all historical conversations and work results
messages: Annotated[Sequence[BaseMessage], operator.add]
# Records who should take over the next task
next_agent: str
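Why `operator.add`? Each node returns only its delta, e.g. `{"messages": [new_message]}`, and LangGraph merges that delta into the channel using the reducer declared in `Annotated`. The merge itself is ordinary list concatenation, which you can check standalone (plain strings stand in for `BaseMessage` objects here):

```python
import operator

# Each node returns a partial update: {"messages": [<new message>]}.
# LangGraph applies the Annotated reducer (operator.add) to merge it
# into the existing channel value -- i.e., plain list concatenation.
existing = ["user: write a tweet"]            # state before the node runs
node_update = ["Researcher: here are facts"]  # what the node returned

merged = operator.add(existing, node_update)  # what LangGraph stores
print(merged)
# -> ['user: write a tweet', 'Researcher: here are facts']
```

This is also why the history only ever grows, which sets up the Context Bloat pitfall we tackle later.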
2. Building the Core Brain: Supervisor (Planner)
This is the soul code of this episode. We will force the LLM to output a specific JSON structure, telling LangGraph where to go next.
# 2. Define the routing structure of the Supervisor (Structured Output)
# Our agency currently has three workers
MEMBERS = ["Researcher", "Writer", "Editor"]
class RouterDecision(BaseModel):
"""Supervisor decides who should act next."""
# Force the LLM to only choose from these options, or output FINISH
next_agent: str = Field(
description="The next agent to act. Choose from 'Researcher', 'Writer', 'Editor', or 'FINISH' if the whole project is done."
)
reasoning: str = Field(
description="Brief explanation of why this agent was chosen."
)
# Initialize the smart brain (gpt-4o is recommended, the manager shouldn't be too dumb)
llm = ChatOpenAI(model="gpt-4o", temperature=0)
# Bind the Pydantic model using with_structured_output
supervisor_chain = llm.with_structured_output(RouterDecision)
# Write the logic for the Supervisor node
def supervisor_node(state: AgencyState):
"""
Supervisor Node: Analyzes the current progress and dispatches to the next Agent.
"""
system_prompt = (
"You are the Chief Planner of an elite AI Content Agency.\n"
"Your team members are: {members}.\n"
"Your job is to read the conversation/work history and decide who should act next.\n"
"Rule 1: Always start with 'Researcher' to gather facts.\n"
"Rule 2: Pass to 'Writer' to draft the content based on research.\n"
"Rule 3: Pass to 'Editor' to review and refine the draft.\n"
"Rule 4: If the Editor has approved the final content and all requirements are met, output 'FINISH'.\n"
"DO NOT do the work yourself. Just route it."
).format(members=", ".join(MEMBERS))
messages = [{"role": "system", "content": system_prompt}] + state["messages"]
# Call the LLM to make a decision
print("🧠 [Planner] is thinking...")
decision = supervisor_chain.invoke(messages)
print(f"🎯 [Planner] Decision: Next up is -> {decision.next_agent}. Reason: {decision.reasoning}")
# Key point: Return the updated state, especially the next_agent field
return {"next_agent": decision.next_agent}
3. Defining Worker Nodes
For clarity of demonstration, we use simple Prompts to simulate the work of these three experts. In real business code, you can mount their respective Tools (e.g., mount Tavily search for the Researcher, and a Markdown formatting tool for the Writer).
# 3. Define Worker Nodes
# Helper function: Wrap the Worker's output into an AIMessage and indicate its identity
def worker_node_factory(role_name: str, system_instruction: str):
def node(state: AgencyState):
print(f"🛠️ [{role_name}] is working on the task...")
worker_llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.7)
messages = [{"role": "system", "content": system_instruction}] + state["messages"]
response = worker_llm.invoke(messages)
# Attach the role name to the message so the Supervisor knows who produced it
final_message = AIMessage(
content=response.content,
name=role_name # Mark the source of the message
)
return {"messages": [final_message]}
return node
# Instantiate the three workers
researcher_node = worker_node_factory(
"Researcher",
"You are a meticulous Researcher. Read the request, provide detailed bullet points of facts and data. Do not write the final article."
)
writer_node = worker_node_factory(
"Writer",
"You are an expert Writer. Take the research provided by the Researcher and draft a compelling piece of content based on the user's request."
)
editor_node = worker_node_factory(
"Editor",
"You are a strict Editor. Review the Writer's draft. Fix any typos, improve the tone, and output the FINAL polished version. Explicitly state 'FINAL VERSION APPROVED' at the end."
)
4. Assembling the Graph Network: Building the Pyramid
Now, we need to connect the manager and the workers using LangGraph Edges. This is the core manifestation of the Hierarchical architecture.
# 4. Build LangGraph
workflow = StateGraph(AgencyState)
# Add all nodes
workflow.add_node("Supervisor", supervisor_node)
workflow.add_node("Researcher", researcher_node)
workflow.add_node("Writer", writer_node)
workflow.add_node("Editor", editor_node)
# After all Workers finish their work, they must unconditionally report to the Supervisor
for member in MEMBERS:
workflow.add_edge(member, "Supervisor")
# Define conditional routing logic
def router(state: AgencyState):
# Determine the direction based on the next_agent set by the Supervisor
next_node = state["next_agent"]
if next_node == "FINISH":
return END
return next_node
# The next step for the Supervisor is a Conditional Edge
workflow.add_conditional_edges(
"Supervisor", # Starting point
router, # Routing function
{
"Researcher": "Researcher",
"Writer": "Writer",
"Editor": "Editor",
END: END
}
)
# Set the entry point of the graph: When a task comes in, go to the Planner (Supervisor) first
workflow.add_edge(START, "Supervisor")
# Compile the graph
agency_app = workflow.compile()
5. Simulating the Demo Run
Let's see how this "regular army" handles complex tasks.
# 5. Run the test
if __name__ == "__main__":
task_prompt = "Write a short, engaging tweet about the latest breakthrough in Quantum Computing. Make it accessible to the public."
print(f"🚀 User Request: {task_prompt}\n")
print("-" * 50)
initial_state = {
"messages": [HumanMessage(content=task_prompt)],
"next_agent": ""
}
# Set the recursion limit to prevent infinite loops
config = {"recursion_limit": 15}
for chunk in agency_app.stream(initial_state, config=config):
# Print the name of the currently completed node
if "__end__" not in chunk:
node_name = list(chunk.keys())[0]
print(f"✅ [{node_name}] finished its turn.\n")
print("-" * 50)
print("🎉 Project Completed Successfully!")
Expected Terminal Output Simulation:
🚀 User Request: Write a short, engaging tweet about the latest breakthrough in Quantum Computing...
🧠 [Planner] is thinking...
🎯 [Planner] Decision: Next up is -> Researcher. Reason: Need to gather the latest facts on quantum computing breakthroughs first.
✅ [Supervisor] finished its turn.

🛠️ [Researcher] is working on the task...
✅ [Researcher] finished its turn.

🧠 [Planner] is thinking...
🎯 [Planner] Decision: Next up is -> Writer. Reason: Research is complete, now we need to draft the tweet.
✅ [Supervisor] finished its turn.

... (Until FINISH)
🎉 Project Completed Successfully!
💣 Pitfalls and Avoidance Guide (Hard-Won Production Experience)
As a veteran with 10 years of experience, I must tell you that getting the above Demo running is only the first step. In a real production environment, the Hierarchical architecture has several fatal pitfalls; stepping into them means eternal doom.
Pitfall 1: Supervisor Falls into an "Infinite Loop"
Symptom: The Editor thinks the article is not good enough and sends it back to the Writer; the Writer revises it and sends it back to the Editor, but the Editor is still unsatisfied. The two pass the buck back and forth through the Supervisor, and your API Token balance is instantly wiped out.
Avoidance:
- Hard Block: You must set a recursion_limit (like 15 in the code) in the config passed to .stream() or .invoke().
- Prompt Constraint: Explicitly state in the Supervisor's Prompt: "A maximum of 2 revisions is allowed; exceeding this limit forces a FINISH."
- State Injection: Add a revision_count: int field in AgencyState, increment it by 1 each time it is sent back, and route directly to END when the threshold is reached.
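The State Injection fix can be sketched in a few lines. Everything extra here is an illustrative assumption layered on the episode's state: `MAX_REVISIONS`, the `revision_count` field, and the hard-coded `"__end__"` sentinel (the string LangGraph's `END` constant resolves to):

```python
import operator
from typing import Annotated, Sequence, TypedDict

MAX_REVISIONS = 2  # hypothetical threshold, tune per use case

class AgencyState(TypedDict):
    messages: Annotated[Sequence[str], operator.add]
    next_agent: str
    revision_count: int  # incremented each time the Editor rejects a draft

def router(state: AgencyState) -> str:
    # Hard stop: once the Editor has sent the draft back too often,
    # route to END regardless of what the Supervisor decided.
    if state["revision_count"] >= MAX_REVISIONS:
        return "__end__"
    if state["next_agent"] == "FINISH":
        return "__end__"
    return state["next_agent"]

# The Editor node would return {"revision_count": state["revision_count"] + 1}
# alongside its critique whenever it rejects a draft.
state = {"messages": [], "next_agent": "Writer", "revision_count": 2}
print(router(state))  # the loop is cut even though next_agent says Writer
```

The key design point: the loop budget lives in the State, not in the Prompt, so a confused LLM cannot talk its way past it.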
Pitfall 2: Context Bloat
Symptom: Because all Workers have to report their results to the Supervisor, the messages list will get longer and longer. By the 10th round of interaction, every LLM call will carry tens of thousands of tokens of nonsense.
Avoidance:
Do not mindlessly use operator.add to accumulate all messages. Introduce a State Summarization mechanism. Alternatively, distinguish between a scratchpad (for drafting, cleared regularly) and final_artifacts (final results, keeping only the latest version) in the State.
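One way to implement the scratchpad / final_artifacts split is to give the State separate channels and compress aggressively between phases. A hedged, offline sketch (field names and the `compress` helper are my own; in production the summary step would be an LLM call):

```python
from typing import TypedDict

class LeanState(TypedDict):
    summary: str           # rolling compression of past turns
    scratchpad: list       # working notes, safe to clear between phases
    final_artifacts: dict  # e.g. {"article": "...latest version only..."}

def compress(state: LeanState, new_notes: list) -> LeanState:
    # Stand-in for an LLM summarization call: fold the notes into the
    # summary, then clear the scratchpad so context stops growing.
    summary = state["summary"] + f" +{len(new_notes)} notes folded in."
    return {"summary": summary, "scratchpad": [], "final_artifacts": state["final_artifacts"]}

state: LeanState = {
    "summary": "Task: Vision Pro analysis.",
    "scratchpad": ["n1", "n2"],
    "final_artifacts": {},
}
state = compress(state, state["scratchpad"])
print(state["scratchpad"])  # cleared: each LLM call now carries only the summary
```

Only `summary` and the latest artifact travel into every LLM call; the tens-of-thousands-of-tokens history never does.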
Pitfall 3: Over-engineering
Symptom: The client simply asks to "translate this sentence into English," but you trigger the complete Planner -> Researcher -> Writer -> Editor workflow, taking 30 seconds and costing $0.10.
Avoidance:
There is no absolute good or bad architecture, only suitable or unsuitable ones. If 80% of your business consists of simple tasks, please add a Triage Node in front of the Supervisor. If it's a simple task, directly call a lightweight model for a single output; for complex tasks, then enter the hierarchical network.
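A Triage Node can be as simple as a cheap classification step in front of the Supervisor. In this sketch the keyword heuristic (`SIMPLE_HINTS`) stands in for a real lightweight-LLM classifier, and both route names are illustrative:

```python
# Triage first, escalate only when necessary: simple requests get one
# cheap model call; complex ones enter the full hierarchical pipeline.

SIMPLE_HINTS = ("translate", "summarize", "fix typo")  # illustrative only

def triage(request: str) -> str:
    text = request.lower()
    if any(hint in text for hint in SIMPLE_HINTS):
        return "direct_answer"  # one lightweight LLM call, no Supervisor
    return "supervisor"         # full Planner -> Researcher -> Writer -> Editor

print(triage("Please translate this sentence into English"))    # direct_answer
print(triage("Write an in-depth Vision Pro industry analysis")) # supervisor
```

In LangGraph terms, this would simply be another conditional edge from START, placed before the Supervisor node.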
📝 Episode Summary
Today, we completed an epic refactoring.
We explored the flexibility and chaos of the Flat architecture, as well as the rigor and controllability of the Hierarchical architecture. By introducing LangGraph's add_conditional_edges and the structured output capabilities of large models, we successfully built a Supervisor (Planner) with independent thinking and dispatching abilities.
Your AI Content Agency is no longer a few stragglers working on their own, but a modern AI production assembly line commanded by a brain, where everyone performs their own duties.
A question for you to ponder:
In today's code, the Supervisor can only dispatch to one Agent at a time. If the client requests to "search Baidu and Google simultaneously," can we have two Researchers work in Parallel, and then aggregate the results for the Writer?
This involves LangGraph's advanced Fan-out / Fan-in mechanisms. Don't worry, we will see you in the next episode: "Episode 17 | Parallel Processing and Map-Reduce: Giving Your Team the Power of Cloning"!
Class dismissed! Remember to run the code, class, don't just read without practicing!