Episode 24 | A First Look at LangServe: Deploy Your LangChain App in One Click (EN)

⏱ Est. reading time: 20 min | Updated on 5/7/2026

🎯 Learning Objectives

Welcome, future AI masters, to the fifth stop of the "LangChain Full-Stack Masterclass"! Today, we are going to tackle one of the most exciting and magical concepts in LangChain: Agents. By the end of this session, you will:

  1. Thoroughly understand the core mechanism of Agents: From the LLM's "brain" to the tools' "hands and feet," master how Agents think, decide, and act.
  2. Master the creation and integration of custom tools: Equip your intelligent support copilot with powerful skills like querying orders and modifying profiles, moving beyond mere text generation.
  3. Build a dynamic, problem-solving Support Agent: Transform your copilot from a rigid, script-following bot into a flexible "mastermind" that adapts to user intent.
  4. Identify and avoid common pitfalls in Agent design: Prevent your agent from falling into "hallucinations," infinite loops, or incorrect tool usage, ensuring stable and reliable operation.

📖 Theory & Concepts

Alright folks, buckle up: we're about to dive deep into the world of LangChain Agents.

In previous lessons, we learned how to build Chains to string LLM capabilities together. A Chain is like a pre-configured assembly line: every step, input, and output is clearly defined. This is highly effective for completing clear, fixed tasks.

But wait, is the real world actually like that? Are the questions your support copilot receives always so neat and predictable?

"I want to check the order status of the Bluetooth headphones I bought last week." "I filled in the wrong shipping address, can you help me change it?" "When will this new mouse be back in stock?"

Which of these questions can be solved with a single LLM call or a static Chain? None of them! They require:

  1. Understanding user intent: Are they checking an order, changing an address, or asking about inventory?
  2. Fetching external information: Order status, user addresses, and product inventory live in company databases or third-party APIs.
  3. Executing external actions: Querying databases, calling APIs, updating records.

This is the exact limitation of Chains—they lack the ability to "think" and "act." They can only follow instructions step-by-step. Agents are here to solve exactly this problem!

The Core Philosophy of Agents: LLM "Thinking" and Tool "Acting"

The core idea of an Agent is: Let the Large Language Model (LLM) act as a "reasoning engine" that decides what to do next based on user input and available "tools." Simply put, the LLM plays the role of the brain, while the "tools" act as the hands and feet.

The Agent's workflow is a dynamic, iterative Reasoning-Action Loop:

  1. Observation: The Agent receives user input or the result of the previous tool execution.
  2. Thought: Based on the currently observed information, combined with its assigned "persona" and "toolbox," the LLM reasons and decides on the next action. It thinks: "Which tool should I use? What are the input parameters for this tool?"
  3. Action: Once the LLM decides on an action, it calls one or more tools and provides the corresponding inputs.
  4. Loop: The tool finishes executing and returns a result (Observation). The Agent then re-enters the "Thought" phase, and the cycle repeats until it determines the task is complete or cannot proceed further (a minimal sketch of this loop in plain Python follows below).
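
To make the loop concrete, here is a minimal, framework-free sketch of the control flow in plain Python. It is purely illustrative: llm_decide is a hypothetical stand-in for an LLM call that returns either a tool to invoke or a final answer, and tools is a plain dict of callables. In LangChain, the Agent Executor plays exactly this role for you.

def run_agent(user_input: str, tools: dict, llm_decide, max_steps: int = 5) -> str:
    observation = user_input              # 1. Observation: start from the user's request
    history = []                          # accumulated (decision, observation) pairs
    for _ in range(max_steps):
        decision = llm_decide(observation, history)    # 2. Thought: the LLM picks the next move
        if decision["type"] == "final_answer":
            return decision["content"]                  # the LLM judges the task complete
        tool = tools[decision["tool"]]                  # 3. Action: call the chosen tool...
        observation = tool(decision["tool_input"])      # ...and its result becomes the new Observation
        history.append((decision, observation))         # 4. Loop: think again with the new information
    return "Sorry, I couldn't complete the request within the step limit."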

The ReAct Paradigm: The Golden Rule of Agents

In the world of Agents, there is a highly renowned paradigm called ReAct (Reasoning and Acting), which perfectly illustrates the "Reasoning-Action" loop described above.

The core idea of ReAct is that the LLM shouldn't just generate the final answer. During the generation process, it must clearly demonstrate its "Thought" process, the "Action" it decides to take, and the "Observation" obtained after the action. This process repeats until the LLM believes it can provide the "Final Answer."

graph TD
    A["User Request: Check shipping status for order XYZ"] --> B(Agent Executor)
    B --> C{"LLM (Large Language Model)"}
    C -- "Thought: User wants to check order status, I need an order query tool." --> D[Action: Call Order Query Tool]
    D --> E["Tool: query_order_status(order_id=XYZ)"]
    E -- "Observation: Order status: Shipped, expected delivery tomorrow." --> C
    C -- "Thought: I have the order status, I can reply to the user now." --> F[Action: Final Answer]
    F -- "Final Answer: Your order XYZ has shipped and is expected to arrive tomorrow." --> G(User)
    C -- "If more info is needed or the user asks a follow-up" --> C

ReAct Flowchart Breakdown:

  1. User Request: The Agent receives the original question from the user.
  2. LLM (Thought): The LLM analyzes the user request, combining its system prompt and available tools to reason. It generates a "Thought" text indicating what it understands and what it plans to do next.
  3. Action (Tool Calling): Based on the "Thought," the LLM decides which tool to call and constructs the input parameters. LangChain's Agent Executor parses this Action and actually executes the corresponding tool.
  4. Tool (Execution): The registered tool is called and executes its internal logic (e.g., querying a database, calling an external API).
  5. Observation: The tool finishes executing and returns a result. This result is passed back to the LLM as a new input.
  6. LLM (Thought) & Action (Loop): The LLM thinks again. Based on the new Observation, it decides whether to continue calling other tools or if it's ready to give the final answer. This loop continues until the LLM generates a "Final Answer" Action.
  7. Final Answer: The Agent finally returns this answer to the user.

This iterative ReAct pattern gives the Agent powerful dynamic decision-making capabilities, allowing it to think, try, and observe step-by-step like a human to solve complex problems.
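
To make the "Action" step more tangible, here is a deliberately simplified sketch of how an executor can turn one ReAct-style completion into either a tool call or a final answer. This is only an illustration of the idea; LangChain ships its own, far more robust output parser, and you never have to write this yourself.

import re

def parse_react_step(llm_text: str):
    """Return ('final', answer) or ('action', tool_name, tool_input) for one ReAct-style completion."""
    if "Final Answer:" in llm_text:
        return ("final", llm_text.split("Final Answer:")[-1].strip())
    action = re.search(r"Action:\s*(.+)", llm_text)
    action_input = re.search(r"Action Input:\s*(.+)", llm_text)
    if action and action_input:
        return ("action", action.group(1).strip(), action_input.group(1).strip())
    raise ValueError("Unparseable output; a real executor would feed this error back to the LLM and retry.")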

Key Components of an Agent

A LangChain Agent primarily consists of the following parts:

  • LLM (Large Language Model): This is the "brain" of the Agent, responsible for all reasoning and decision-making.
  • Tools: These are the "hands and feet" of the Agent, encapsulating various external capabilities like querying databases, calling APIs, executing code, etc. Each tool has a description telling the LLM what it does and what its input parameters are.
  • Agent Executor: This is the "dispatch center" of the Agent. It drives the entire "Reasoning-Action" loop. It receives the Action from the LLM, executes the corresponding Tool, and returns the Tool's Observation back to the LLM.
  • Agent Type: LangChain provides several preset Agent types that define how the LLM is prompted and how its output is parsed to extract Thoughts, Actions, and Final Answers. The most common ones are zero-shot-react-description (based on the ReAct paradigm) and OpenAIFunctionsAgent (leveraging OpenAI's function calling capabilities).

In our intelligent support project, the Agent will be the core brain, coordinating various tools to respond to complex user needs.

💻 Practical Code Drill (Application in the Support Copilot Project)

Now, let's ground the Agent theory in our "Intelligent Support Knowledge Base" project.

Imagine that besides answering FAQs based on the knowledge base, users also want our support copilot to handle the following transactions:

  1. Check product inventory: A user asks if a specific product is in stock.
  2. Check order status: A user provides an order ID to track shipping information.

These require calling our backend system's APIs. Alright, let's build these "hands and feet" for our Agent!

We will use Python for the demonstration, as LangChain's Python ecosystem is more mature and widely used.

import os
from dotenv import load_dotenv

# Load environment variables to securely use API keys
load_dotenv()

from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Import necessary modules
from langchain.agents import AgentExecutor, create_react_agent
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

# For demonstration purposes, we will mock some external APIs first
# In a real project, these would be functions calling your backend services or databases

@tool
def search_product_inventory(product_name: str) -> str:
    """
    Query the current inventory level of a product based on its name.
    Input parameters:
    - product_name (str): The name of the product to query.
    Returns:
    - str: A string containing the product's inventory information, e.g., "iPhone 15 Pro Max 当前库存 50 件。"
    """
    print(f"\n--- 调用工具: search_product_inventory(product_name='{product_name}') ---")
    # Mock database query or API call
    inventory_data = {
        "iPhone 15 Pro Max": "50",
        "华为 MateBook X Pro": "20",
        "小米手环 8 Pro": "120",
        "索尼 WH-1000XM5 耳机": "35",
        "戴森吹风机": "缺货",
        "Apple Watch Ultra 2": "15"
    }
    result = inventory_data.get(product_name, "抱歉,未找到该商品或无法获取库存信息。")
    if result == "缺货":
        return f"抱歉,{product_name} 目前缺货。"
    elif "未找到" in result:
        return result
    else:
        return f"{product_name} 当前库存 {result} 件。"

@tool
def get_order_status(order_id: str) -> str:
    """
    Query the current status and shipping information of an order based on the order ID.
    Input parameters:
    - order_id (str): The order number to query.
    Returns:
    - str: A string containing the order status and shipping information.
    """
    print(f"\n--- 调用工具: get_order_status(order_id='{order_id}') ---")
    # Mock external logistics API call
    order_data = {
        "20231026001": "您的订单已于10月26日发货,快递单号SF123456789,预计10月28日送达。",
        "20231025002": "您的订单正在打包中,预计今天下午发出。",
        "20231024003": "抱歉,该订单已取消,请联系客服处理。",
        "20231023004": "您的订单已于10月23日签收,感谢您的购买!",
    }
    return order_data.get(order_id, "抱歉,未找到该订单号或订单信息。")

# Put tools into a list
tools = [search_product_inventory, get_order_status]

# Initialize LLM
# It is recommended to use models that support function calling, such as gpt-3.5-turbo or gpt-4
# Ensure your OPENAI_API_KEY is set in your environment variables
llm = ChatOpenAI(model="gpt-3.5-turbo-1106", temperature=0) # temperature=0 makes the model more "rational"

# 1. Define the Agent's Prompt
# create_react_agent does not build the prompt for you: it expects one that already contains the
# {tools}, {tool_names} and {agent_scratchpad} variables (the community "hwchase17/react" prompt from
# the LangChain Hub is the usual starting point). Here we write our own variant for the support scenario.
prompt_template = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are an intelligent customer support assistant. Your task is to efficiently and accurately "
            "answer user questions and process their requests. You can use the provided tools to fetch "
            "real-time information or execute actions. If an issue cannot be resolved using tools, please "
            "try to provide a friendly and helpful response.\n\n"
            "You have access to the following tools:\n\n{tools}\n\n"
            "Use the following format:\n\n"
            "Question: the user question you must answer\n"
            "Thought: you should always think about what to do\n"
            "Action: the action to take, must be one of [{tool_names}]\n"
            "Action Input: the input to the action\n"
            "Observation: the result of the action\n"
            "... (this Thought/Action/Action Input/Observation can repeat several times)\n"
            "Thought: I now know the final answer\n"
            "Final Answer: the final answer to the user's question",
        ),
        MessagesPlaceholder("chat_history"),  # Reserved for future chat history, not covered in this session
        # {agent_scratchpad} is the "scratchpad" where the executor appends the Thought/Action/Observation steps so far
        ("human", "Question: {input}\nThought:{agent_scratchpad}"),
    ]
)

# 2. Create the Agent
# The create_react_agent function combines the LLM, Tools, and Prompt to create an Agent Runnable
agent = create_react_agent(llm, tools, prompt_template)

# 3. Create the Agent Executor
# The Agent Executor is the engine that actually runs the Agent; it drives the Thought-Action-Observation loop
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True, handle_parsing_errors=True)

print("--- 智能客服 Agent 启动!请开始提问(输入 'exit' 退出)---")

# Mock interaction with the intelligent support copilot
while True:
    user_query = input("\n用户:")
    if user_query.lower() == 'exit':
        print("客服助手:再见!")
        break

    try:
        # Call the Agent Executor to process the user query
        # chat_history is temporarily empty; memory management will be added in future lessons
        response = agent_executor.invoke({"input": user_query, "chat_history": []})
        print(f"客服助手:{response['output']}")
    except Exception as e:
        print(f"客服助手:抱歉,处理您的请求时遇到问题:{e}")
        print("请尝试换一种方式提问,或稍后再试。")

Code Breakdown:

  1. Tool Definition (@tool decorator): We defined the search_product_inventory and get_order_status functions and converted them into Agent-recognizable tools using LangChain's @tool decorator.
    • Key Point: Every tool function must have a clear docstring. The LLM uses this docstring to understand the tool's function, input parameters, and when to use it. This is the foundation of correct Agent decision-making! Parameter type hints are also crucial.
    • Mocking External Systems: In real projects, these functions would call your database or microservice APIs. Here, we used Python dictionaries to mock the data.
  2. LLM Initialization: We use ChatOpenAI as the LLM with temperature=0 so its decisions are more deterministic. A capable chat model (like gpt-3.5-turbo-1106 or gpt-4) is recommended: although this ReAct agent drives tools through plain-text output rather than native function calling, stronger models follow the required format far more reliably.
  3. Prompt Definition (ChatPromptTemplate):
    • We built a ChatPromptTemplate whose system message sets the assistant's persona, exposes the available tools through the {tools} and {tool_names} variables, and spells out the ReAct format (Thought / Action / Action Input / Observation / Final Answer) the model must follow; the human message carries the user input.
    • {agent_scratchpad} is a highly critical variable. The Agent Executor populates it with the history of Thoughts, Actions, and Observations so far, forming the complete ReAct loop; create_react_agent fills in {tools} and {tool_names} for you from the tool list.
    • MessagesPlaceholder("chat_history") is reserved for future memory management; we leave it empty for now.
  4. Creating the Agent (create_react_agent): LangChain provides this convenient function to assemble the LLM, tools, and Prompt into an Agent "Runnable" based on the ReAct paradigm.
  5. Creating the Agent Executor (AgentExecutor): This is the runtime engine for the Agent.
    • agent: The Agent Runnable we just created.
    • tools: The list of tools available to the Agent.
    • verbose=True: This is extremely important! During development and debugging, always set this to True. It prints out the detailed internal Thought, Action, and Observation processes, letting you clearly see how the LLM thinks and calls tools.
    • handle_parsing_errors=True: Allows the Agent to retry or handle situations where it fails to parse the LLM's output.

Execution Demo:

When you run the code above and input the following questions, you will see the Agent's thought process:

用户:苹果手机现在还有货吗? (Are Apple phones still in stock?)

You will see the LLM think: "The user wants to check inventory. I should use the search_product_inventory tool with the parameter product_name='iPhone 15 Pro Max'." The tool is then called and returns the inventory information.

用户:查一下20231026001这个订单的物流 (Check the shipping for order 20231026001)

The LLM thinks: "The user wants to check an order. I should use the get_order_status tool with the parameter order_id='20231026001'." The tool is then called and returns the shipping information.

用户:你好,请问你们公司是做什么的? (Hello, what does your company do?)

The LLM will realize this question doesn't require any tools and will answer it directly.

Through verbose=True, you will witness firsthand how the Agent takes a question, thinks about it, selects the appropriate tool, executes it, thinks again based on the result, and finally provides an answer. It's like giving the LLM "eyes" and "hands," truly bringing it to life!

Pitfalls and How to Avoid Them

Congratulations, you've mastered basic Agent construction! But don't celebrate just yet. While powerful, Agents aren't omnipotent. They can be like "unruly children" requiring careful tuning. Here are some common pitfalls and guidelines to avoid them:

1. Vague or Inaccurate Tool Descriptions

  • Pitfall: The LLM relies entirely on the docstring to understand a tool's function and when to use it. If the docstring is vague, ambiguous, or has inaccurate parameter descriptions, the LLM will likely misuse the tool or ignore it entirely. For example, with a tool like def search(query: str), the LLM won't know if query refers to a product name, an order ID, or something else.
  • How to Avoid:
    • Write tool descriptions like API docs: Detail what the tool does, what problems it solves, its input parameters (types, examples), and what the return value means. A before/after contrast is sketched after this list.
    • Define boundaries: State what the tool cannot do or under what circumstances it shouldn't be used.
    • Use concrete examples: Include usage scenarios in the description to help the LLM understand better.
    • Clear parameter naming: product_name is much clearer than just name.
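
As a concrete contrast (illustrative only, with the function bodies stubbed out), compare a vague tool with one whose docstring reads like API documentation:

from langchain_core.tools import tool

@tool
def search(query: str) -> str:
    """Search."""  # Too vague: search what? a product? an order? what format should query take?
    ...

@tool
def search_product_inventory(product_name: str) -> str:
    """
    Query the current stock level of a single product by its exact name.
    Use this only for inventory questions, never for order or shipping questions.
    Input:
    - product_name (str): the product's full name, e.g. "iPhone 15 Pro Max".
    Returns a sentence describing the stock level, or an apology if the product is unknown.
    """
    ...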

2. Insufficient Prompt Engineering Leading to Persona Collapse

  • Pitfall: If the System Prompt is poorly written, the Agent might forget its role, its goals, or panic when tools can't solve a problem. It might "hallucinate" non-existent tools or use tools inappropriately.
  • How to Avoid:
    • Clarify role and goals: Clearly define the Agent's identity ("You are an intelligent customer support assistant"), its main task ("Efficiently and accurately answer user questions"), and its priorities ("Prioritize using tools to fetch information").
    • Instruct on unknown situations: Tell the Agent how to react when tools fail, e.g., "If an issue cannot be resolved using tools, please try to provide a friendly and helpful response." A fuller system-prompt sketch follows this list.
    • Guide tool usage: Use examples or instructions to gently guide the LLM on which tools to consider in specific scenarios.
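
One way to bake all three guidelines into the system message is sketched below; the exact wording is illustrative, not a recipe:

SUPPORT_SYSTEM_PROMPT = (
    # 1. Role and goals
    "You are an intelligent customer support assistant for an online electronics store. "
    "Answer user questions efficiently and accurately, and prioritize using tools to fetch real information. "
    # 2. How to behave when tools cannot help
    "If no tool can resolve the issue, say so honestly, apologize briefly, and offer to escalate to a human agent. "
    "Never invent order numbers, stock levels, or delivery dates. "
    # 3. Gentle guidance on tool choice
    "Use search_product_inventory for stock questions and get_order_status for order or shipping questions."
)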

3. Improper Tool Granularity

  • Pitfall:
    • Too coarse: A tool does too much, making it hard for the LLM to control precisely. For example, with a handle_customer_request(request_type: str, details: str) tool, the LLM will struggle to construct complex requests correctly within the details string.
    • Too fine: Too many micro-tools overwhelm the LLM during decision-making, increasing cognitive load and cost.
  • How to Avoid:
    • Single Responsibility Principle: Each tool should do one thing and do it well. For instance, search_product_inventory and get_order_status are two separate tools; see the signature sketch after this list.
    • Separate queries from actions: Try to keep read-only tools separate from write/modify tools.
    • Align with user intent: Tool design should map to the high-level intents users are likely to express.
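
The difference in granularity is easiest to see in the tool signatures themselves. The sketch below is illustrative only: one catch-all tool versus two focused ones, with the read-only query kept separate from the write action (update_shipping_address is a hypothetical tool, not part of the drill above):

from langchain_core.tools import tool

@tool
def handle_customer_request(request_type: str, details: str) -> str:
    """Handle any customer request."""  # Too coarse: the LLM has to cram all structure into details
    ...

@tool
def get_order_status(order_id: str) -> str:
    """Query the current status of an order by its order ID."""  # Focused, read-only
    ...

@tool
def update_shipping_address(order_id: str, new_address: str) -> str:
    """Change the shipping address of an order that has not shipped yet."""  # Write action, kept separate
    ...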

4. Infinite Loops or Failure to Conclude

  • Pitfall: The Agent might bounce back and forth in the Thought-Action-Observation loop without converging on a final answer. This usually happens if the LLM misinterprets tool results or if the tool returns insufficient information for the LLM to make the next decision.
  • How to Avoid:
    • Set max_iterations: Configure the max_iterations parameter in AgentExecutor (e.g., max_iterations=10) to prevent infinite loops. Once the limit is reached, the Agent stops and returns an error or partial result (see the sketch after this list).
    • Clear tool outputs: Ensure tool outputs are clear, concise, and unambiguous, directly aiding the LLM's next decision.
    • Error handling: Tools should have robust internal error handling. Even if an external API fails, return a meaningful error string to the LLM rather than throwing a raw exception or returning an empty string.
    • Prompt guidance for conclusion: Explicitly tell the LLM in the prompt when it should generate the Final Answer.
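
Here is a sketch of the iteration cap and the error-handling guideline, reusing the agent and tools names from the drill above; call_unreliable_logistics_api is a deliberately failing stand-in for a flaky backend, used only to demonstrate the fallback path:

from langchain.agents import AgentExecutor
from langchain_core.tools import tool

def call_unreliable_logistics_api(order_id: str) -> str:
    # Stand-in for a real backend call; it always fails here to show the fallback path
    raise TimeoutError("logistics API did not respond")

@tool
def get_order_status(order_id: str) -> str:
    """Query the current status of an order by its order ID."""
    try:
        return call_unreliable_logistics_api(order_id)
    except Exception as exc:
        # Return a meaningful string instead of raising, so the LLM can decide what to do next
        return f"The logistics system is temporarily unavailable ({exc}). Please ask the user to try again later."

agent_executor = AgentExecutor(
    agent=agent,                  # the Agent Runnable built in the drill above
    tools=tools,
    verbose=True,
    handle_parsing_errors=True,
    max_iterations=10,            # hard cap on Thought-Action-Observation rounds
    max_execution_time=60,        # optional wall-clock limit in seconds
)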

5. Performance and Cost Considerations

  • Pitfall: Every Thought is an LLM call, and every Action executes a tool (which might be an external API call). Excessive iterations drastically increase latency and cost.
  • How to Avoid:
    • Optimize Prompts: Streamline prompts to reduce unnecessary LLM thinking.
    • Efficient tools: Ensure tools execute quickly to minimize external API latency.
    • Caching: Implement caching mechanisms for frequently queried, rarely changing data. A minimal example follows this list.
    • Choose the right LLM: For steps that don't require maximum reasoning capabilities, consider using lighter, cheaper models.
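
As a small sketch of the caching idea (in-process only, assuming the underlying data changes rarely), you can memoize the slow lookup behind a tool with functools.lru_cache; for data that must stay fresh, prefer a cache with a TTL:

from functools import lru_cache
from langchain_core.tools import tool

@lru_cache(maxsize=256)
def _lookup_inventory(product_name: str) -> str:
    # Stand-in for a slow database query or external API call; repeated queries now hit the cache
    return {"iPhone 15 Pro Max": "50", "Apple Watch Ultra 2": "15"}.get(product_name, "unknown")

@tool
def search_product_inventory(product_name: str) -> str:
    """Query the current stock level of a product by its exact name."""
    return f"{product_name}: {_lookup_inventory(product_name)} in stock."

For the model-choice lever, nothing special is needed: where deep reasoning isn't required, simply pass a smaller, cheaper model name to ChatOpenAI.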

Tuning an Agent is an iterative process. You need to constantly experiment, observe the verbose=True outputs, and tweak prompts and tool descriptions. Treat it like an apprentice: give it plenty of feedback, and it will get smarter and smarter!

📝 Summary of this Session

Everyone, today we completed a deep dive into the world of Agents! We not only understood the core principles of LangChain Agents—combining LLM "thinking" with tool "acting" to achieve dynamic decision-making—but we also built "hands and feet" for our support copilot, enabling it to check inventory and order status.

Introducing Agents elevates our support copilot from a passive Q&A machine to an "intelligent assistant" capable of proactively solving problems and executing tasks. It is no longer confined to static knowledge base information; it can interact with the outside world, fetch real-time data, and even execute complex operations. This is a crucial step toward building truly production-grade AI applications!

Of course, we also saw the side of Agents that requires fine-tuning. Tool descriptions, prompt design, granularity, and considerations for performance and stability are all challenges we must face and overcome in real-world projects.

Next Time: Although our Agent is powerful, it currently has the memory of a goldfish—it doesn't remember anything. If a user chats with it for a few turns, it forgets what was said earlier. In the next session, we will dive into LangChain's Memory Management. We'll teach you how to give your support copilot "long-term memory" to achieve true multi-turn conversations and take the user experience to the next level! Stay tuned!