Session 04 | Memory System: Giving the Support Bot Contextual Memory
🎯 Learning Objectives for This Session
Hey there, future AI architects! I'm your old friend—a ten-year veteran in the AI space and your most enthusiastic mentor. Today, we officially kick off our hardcore LangChain journey. Don't worry, we'll start from the very basics and guide you step-by-step to build your first production-grade AI application: our "Intelligent Support Knowledge Base."
By the end of this session, you will achieve the following goals:
- Understand LangChain's Core Value: Discover why LangChain is the "Swiss Army knife" for building complex LLM applications and what pain points it solves.
- Master Essential LangChain Components: Get familiar with `LLM` and `PromptTemplate`—the "mouthpiece" and "script" for conversing with large language models.
- Build a Support Copilot Prototype: Get hands-on and build a support agent that receives user queries and generates initial responses, experiencing the thrill of going from zero to one.
- Cultivate an AI Developer Mindset: Learn how to materialize abstract AI capabilities into specific features for our support project, laying the groundwork for more complex functionalities later.
📖 Under the Hood: Core Concepts
Alright, enough talk—let's dive right in.
Imagine you're developing an intelligent support copilot. A user asks, "What's my order status?" or "How do I request a return?" You can't just throw these questions directly at an LLM and expect an accurate answer that complies with your company policies, right? While LLMs are powerful, they lack your specific business context and don't inherently know they are supposed to act as a customer service agent.
This is where LangChain shines. It is not an LLM itself; rather, it's an orchestration framework—a "conductor" that helps you organize LLMs, data sources, external tools, and other "instruments" to play a beautiful symphony together.
In our "Intelligent Support Knowledge Base" project, LangChain's core value lies in:
- Structured Input: User queries can be wildly unpredictable. LangChain helps us "translate" these queries into instructions that the LLM can understand and process effectively.
- Unified Interface: Whether it's OpenAI's GPT series, Google's Gemini, or a locally deployed Llama 2, LangChain provides a unified interface to call them. This allows you to swap models on the fly without refactoring your code (see the sketch after this list).
- Modular Design: It breaks down the development process of LLM applications into independent, reusable components (like `LLM`, `PromptTemplate`, `Chain`, `Memory`, `Tool`, etc.). Like Lego bricks, you can freely combine them to build infinite possibilities.
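To make the "Unified Interface" point concrete, here is a minimal sketch. It assumes the optional `langchain-ollama` package is installed and a local Ollama server is running with a `llama2` model pulled—both assumptions of ours, not requirements of this lesson. The point is that both models expose the same `invoke` method, so swapping providers is a one-line change:

```python
from langchain_openai import ChatOpenAI
# Assumption: the optional langchain-ollama package is installed (pip install langchain-ollama)
from langchain_ollama import ChatOllama

question = "How do I request a return?"

# Same .invoke() interface, different providers: swap the constructor, keep the rest.
openai_llm = ChatOpenAI(model_name="gpt-3.5-turbo")
local_llm = ChatOllama(model="llama2")  # assumes a local Ollama server

for llm in (openai_llm, local_llm):
    print(llm.invoke(question).content)
```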
In this session, we'll focus on two of the most fundamental yet crucial components:
- `LLM` (Large Language Model): This is the cornerstone of your interaction with large models. It encapsulates the calling logic for various LLMs, freeing you from worrying about underlying API request details. You just tell it which model to use and pass in your input.
- `PromptTemplate`: This is where you "write the script" for the LLM. It allows you to define a template with placeholders and dynamically inject content to generate the final prompt sent to the model. This is critical for ensuring the LLM adopts a specific persona and follows a specific output format. In our support scenario, we want the copilot to answer questions in a professional and friendly tone.
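Before we look at the full flow, here is the smallest possible illustration of the `PromptTemplate` idea: a template with a placeholder, filled in at call time. (A minimal sketch; the template wording is ours, purely illustrative.)

```python
from langchain_core.prompts import PromptTemplate

# A template with a {user_question} placeholder that gets filled in at call time.
template = PromptTemplate.from_template(
    "You are a professional support agent. User query: {user_question}"
)
print(template.format(user_question="I have a question about my order"))
# -> You are a professional support agent. User query: I have a question about my order
```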
Now, let's look at a simplified Mermaid diagram to understand how these two components collaborate in our intelligent support project to complete a basic Q&A flow.
```mermaid
graph TD
    A["User Input: 'I have a question about my order'"] --> B["PromptTemplate: Set Support Persona & Query Format"]
    B --> C["Generate Full Prompt: 'You are a professional support agent. Please help the user based on their query. User query: I have a question about my order'"]
    C --> D["LLM: Call Large Model (e.g., gpt-3.5-turbo)"]
    D --> E["LLM Output: 'Hello! What specific information would you like to know about your order?'"]
    E --> F["Support Copilot Replies to User"]
```

Workflow Breakdown:
- User Input: The user asks a question in the support interface.
- PromptTemplate Processing: Our `PromptTemplate` pre-defines the copilot's "persona" (e.g., "You are a professional intelligent support agent. Your duty is to answer user queries about products and services and provide solutions.") along with output format requirements. It then injects the user's specific question into this preset template.
- Generate Full Prompt: After processing by the `PromptTemplate`, a complete, structured prompt containing the persona setup and the specific query is born.
- LLM Invocation: This complete prompt is passed to the `LLM` component. The `LLM` handles communication with the actual large model (e.g., OpenAI's `gpt-3.5-turbo`), sending the request and receiving the response.
- LLM Output: The large model generates a corresponding response based on the prompt it received.
- Copilot Reply: The intelligent support copilot displays the LLM's response to the user.
See that? LangChain is like a skilled chef. It takes the user's disorganized "ingredients" (queries), follows the "recipe" (PromptTemplate), puts them into the "oven" (LLM), and finally serves up a delicious "dish" (the answer). This is vastly superior to just throwing raw meat into the oven!
💻 Hands-on Coding (Application in the Support Project)
Alright, theory is great, but nothing beats getting your hands dirty with code. Let's build the "Hello World" version of our support copilot. We'll demonstrate using Python, as it's the dominant language in the AI space, but I'll also provide a TypeScript example later so you can experience LangChain's cross-language appeal.
Environment Setup
First, you need to install LangChain and the client for your chosen LLM provider. We'll use OpenAI as our example here.
```bash
# Python environment
pip install langchain-openai

# TypeScript / JavaScript environment
npm install langchain @langchain/openai
```
Don't forget to set your OpenAI API Key. The safest way is to set it as an environment variable.
```bash
# Execute in your terminal (macOS/Linux)
export OPENAI_API_KEY="sk-YOUR_OPENAI_API_KEY"
```

```powershell
# Execute in your terminal (Windows PowerShell)
$env:OPENAI_API_KEY="sk-YOUR_OPENAI_API_KEY"
```
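If you prefer not to rely on shell exports during development, a common alternative is the `python-dotenv` package (we'll mention it again in the pitfalls section). A minimal sketch, assuming a `.env` file sits next to your script:

```python
# .env file contents (never commit this file to version control):
# OPENAI_API_KEY=sk-YOUR_OPENAI_API_KEY

from dotenv import load_dotenv  # assumes: pip install python-dotenv

# Reads the .env file and populates os.environ before any LangChain code runs.
load_dotenv()
```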
Python Implementation
We will build a simple `SimpleSupportAgent` class that receives user queries and generates responses using LangChain's `ChatOpenAI` and `ChatPromptTemplate`.
```python
import os

from langchain_openai import ChatOpenAI
from langchain_core.prompts import (
    ChatPromptTemplate,
    HumanMessagePromptTemplate,
    SystemMessagePromptTemplate,
)


class SimpleSupportAgent:
    """
    A basic intelligent support agent used to demonstrate LangChain's LLM and
    PromptTemplate components. It receives user queries and generates responses
    based on a preset support persona.
    """

    def __init__(self, model_name: str = "gpt-3.5-turbo-0125", temperature: float = 0.7):
        """
        Initialize the support agent.

        :param model_name: The name of the OpenAI model to use.
        :param temperature: Sampling temperature (0 to 2 for OpenAI chat models;
                            lower means more deterministic, higher more creative).
        """
        # Fail fast if the API key is missing
        if not os.getenv("OPENAI_API_KEY"):
            raise ValueError(
                "The OPENAI_API_KEY environment variable is not set. "
                "Please set your OpenAI API Key first."
            )

        # Initialize the ChatOpenAI model instance.
        # This is the "brain" of our support copilot, responsible for understanding and generating text.
        self.llm = ChatOpenAI(model_name=model_name, temperature=temperature)

        # Define the system-level prompt template for the support copilot.
        # This is like handing the copilot a "script" that tells it what role to play.
        self.system_prompt = SystemMessagePromptTemplate.from_template(
            "You are a professional intelligent support agent. Your duty is to answer "
            "user questions about products and services and provide solutions. "
            "Your answers should be friendly, clear, accurate, and as concise as possible. "
            "If a question is outside your knowledge scope, politely inform the user."
        )

        # Define the prompt template for user messages.
        # The user's input will be dynamically injected into the full prompt here.
        self.human_prompt = HumanMessagePromptTemplate.from_template("{user_question}")

        # Combine the system prompt and human prompt into a complete chat prompt template.
        # This is the full "script" we show to the LLM: persona setup plus the specific user query.
        self.chat_prompt = ChatPromptTemplate.from_messages([
            self.system_prompt,
            self.human_prompt,
        ])

    def get_response(self, user_question: str) -> str:
        """
        Generate a support response based on the user's query.

        :param user_question: The question asked by the user.
        :return: The support copilot's response.
        """
        print(f"\n--- User question ---\n{user_question}")

        # Format the user query with the chat_prompt template to produce the final prompt.
        # This step injects the user query into our preset "script".
        formatted_prompt = self.chat_prompt.format_messages(user_question=user_question)
        print(f"\n--- Prompt sent to the model (formatted) ---\n{formatted_prompt}")

        # Call the LLM; it generates an answer based on the formatted prompt.
        response = self.llm.invoke(formatted_prompt)

        # Extract the text content from the response object.
        ai_response_content = response.content
        print(f"\n--- Support copilot reply ---\n{ai_response_content}")
        return ai_response_content


# --- Simulate running our support copilot ---
if __name__ == "__main__":
    try:
        # Instantiate our support agent; feel free to try different models or temperatures.
        agent = SimpleSupportAgent(model_name="gpt-3.5-turbo", temperature=0.5)

        # Simulate user queries
        agent.get_response("My order number is 123456789. When will it ship?")
        agent.get_response("What is your company's latest product? What are its features?")
        agent.get_response("How do I request a return? What documents do I need to provide?")
        agent.get_response("Tell me a joke.")  # Test a scenario outside the knowledge scope

    except ValueError as e:
        print(f"Error: {e}")
        print("Please make sure the OPENAI_API_KEY environment variable is set.")
    except Exception as e:
        print(f"An unexpected error occurred: {e}")
```
Code Walkthrough:
- `ChatOpenAI`: We instantiated a `ChatOpenAI` object, specifying the model name (`gpt-3.5-turbo-0125` is a newer version and recommended) and `temperature`. The `temperature` determines the "randomness" or "creativity" of the model's response. For a support scenario, we generally want it to be stable and accurate, so setting it to `0.5` or lower is ideal.
- `SystemMessagePromptTemplate`: This is the key to defining the copilot's "personality." Through a template, we tell the LLM that it is a "professional intelligent support agent" and define its duties and response style. It's like putting a uniform on the agent and clarifying its job description.
- `HumanMessagePromptTemplate`: This is the placeholder for user input. `{user_question}` will be replaced by the user's actual query during invocation.
- `ChatPromptTemplate.from_messages`: This combines the system prompt and the human prompt to form a complete conversational flow. LangChain automatically handles the formatting of these messages to meet the LLM API's requirements.
- `chat_prompt.format_messages`: Inside the `get_response` method, we use this to inject the user's query into the template, generating the final list of `Message` objects to be sent to the LLM.
- `self.llm.invoke(formatted_prompt)`: This is where the actual LLM call happens. The `invoke` method takes the formatted prompt, sends it to the underlying model behind the `ChatOpenAI` instance, and returns the model's response.
- `response.content`: Extracts the actual text content we need from the LLM's response object.
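To see what `format_messages` actually produces, you can run a stripped-down version in a REPL. A minimal sketch; the printed repr is abbreviated here:

```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a professional support agent."),
    ("human", "{user_question}"),
])

# format_messages returns a list of message objects, not a single string.
messages = prompt.format_messages(user_question="Where is my order?")
print(messages)
# -> [SystemMessage(content='You are a professional support agent.'),
#     HumanMessage(content='Where is my order?')]   (repr abbreviated)
```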
Run this code, and you'll see how the support copilot provides appropriate responses to different questions based on the persona you set. Even for questions outside its knowledge scope, it responds politely—this is exactly where the `SystemMessagePromptTemplate` does its magic!
TypeScript / JavaScript Implementation (Optional)
If you are a frontend developer or prefer TypeScript, here is the equivalent implementation:
```typescript
import { ChatOpenAI } from "@langchain/openai";
import {
  ChatPromptTemplate,
  SystemMessagePromptTemplate,
  HumanMessagePromptTemplate,
} from "@langchain/core/prompts";
import { BaseMessage } from "@langchain/core/messages";

// Ensure you have set the OPENAI_API_KEY environment variable
// Example: process.env.OPENAI_API_KEY = "sk-YOUR_OPENAI_API_KEY";

class SimpleSupportAgent {
  private llm: ChatOpenAI;
  private chatPrompt: ChatPromptTemplate;

  constructor(modelName: string = "gpt-3.5-turbo-0125", temperature: number = 0.7) {
    if (!process.env.OPENAI_API_KEY) {
      throw new Error("OPENAI_API_KEY environment variable is not set. Please set your OpenAI API Key.");
    }

    this.llm = new ChatOpenAI({
      modelName: modelName,
      temperature: temperature,
    });

    // Define the system-level prompt template for the support copilot
    const systemPrompt = SystemMessagePromptTemplate.fromTemplate(
      "You are a professional intelligent support agent. Your duty is to answer " +
        "user questions about products and services and provide solutions. " +
        "Your answers should be friendly, clear, accurate, and as concise as possible. " +
        "If a question is outside your knowledge scope, politely inform the user."
    );

    // Define the prompt template for user messages
    const humanPrompt = HumanMessagePromptTemplate.fromTemplate("{user_question}");

    // Combine the system prompt and human prompt into a complete chat prompt template
    this.chatPrompt = ChatPromptTemplate.fromMessages([systemPrompt, humanPrompt]);
  }

  async getResponse(userQuestion: string): Promise<string> {
    console.log(`\n--- User question ---\n${userQuestion}`);

    // Format the user query with the chatPrompt template to produce the final prompt
    const formattedPrompt: BaseMessage[] = await this.chatPrompt.formatMessages({
      user_question: userQuestion,
    });
    console.log(`\n--- Prompt sent to the model (formatted) ---\n`, formattedPrompt);

    // Call the LLM to get the response
    const response = await this.llm.invoke(formattedPrompt);

    // Extract the response content
    const aiResponseContent = response.content;
    console.log(`\n--- Support copilot reply ---\n${aiResponseContent}`);
    return String(aiResponseContent); // Ensure a string type is returned
  }
}

// --- Simulate running our support copilot ---
async function main() {
  try {
    // Instantiate our support agent
    const agent = new SimpleSupportAgent("gpt-3.5-turbo", 0.5);

    // Simulate user queries
    await agent.getResponse("My order number is 123456789. When will it ship?");
    await agent.getResponse("What is your company's latest product? What are its features?");
    await agent.getResponse("How do I request a return? What documents do I need to provide?");
    await agent.getResponse("Tell me a joke."); // Test a scenario outside the knowledge scope
  } catch (e: any) {
    console.error(`Error: ${e.message}`);
    console.error("Please make sure the OPENAI_API_KEY environment variable is set.");
  }
}

main();
```
The logic in the TypeScript code is almost identical to the Python version, differing only in syntax. This highlights the brilliance of LangChain's design: it provides a unified, cross-language abstraction, allowing you to build AI applications with a similar mental model regardless of the language you use.
⚠️ Pitfalls and How to Avoid Them
As a veteran, I've seen too many beginners stumble over these issues, so let me give you a heads-up:
- API Key Configuration: This is the most common and basic error. The `OPENAI_API_KEY` environment variable must be set correctly! If you're running code in an IDE, ensure the IDE's runtime environment actually loads this variable; sometimes a plain `export` in the terminal is not inherited by the process the IDE launches. A convenient approach during development is loading keys from a `.env` file via the `dotenv` library (as sketched in the environment setup above); don't rely on `.env` files in production, where the deployment environment itself should provide the variable.
- Model Selection and Cost: `gpt-3.5-turbo` offers great bang for your buck. If you need peak performance and the latest capabilities, you can try `gpt-4-turbo` or other more powerful models, but keep in mind that more powerful models usually cost more. Starting with `gpt-3.5-turbo` in the early stages of development and upgrading gradually is a wise move.
- Initial Thoughts on Prompt Engineering:
- Clear Persona: What role do you want the AI to play? Support agent, tech expert, sales rep? The clearer you are, the closer the AI's response will align with your expectations.
- Explicit Instructions: What do you want the AI to do? Answer questions, provide advice, summarize? Give explicit instructions.
- Set Boundaries: Tell the AI what it cannot do, or how it should respond in specific situations (e.g., "If the question is outside your knowledge scope, politely inform the user"). This is crucial to prevent the AI from hallucinating or giving irresponsible answers.
- Iterative Optimization: Prompts are rarely perfect on the first try. They require continuous testing and tweaking to achieve optimal results. Treat it as the art of communicating with AI, rather than a one-off task.
- Synchronous vs. Asynchronous Calls: In Python, `llm.invoke()` is a synchronous call. If your application needs to handle high concurrency or you don't want to block the main thread, consider the asynchronous `llm.ainvoke()` instead (see the sketch after this list). TypeScript/JavaScript natively supports async, so `await agent.getResponse()` is already the standard approach there.
- Unpredictability of Output Formats: Even though we set the response style via `SystemMessagePromptTemplate`, an LLM is not a deterministic program; it still retains some degree of freedom. Don't expect it to follow your exact format 100% of the time. If you need strictly structured output, we will learn more advanced techniques later (like using Pydantic output parsers).
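Here is a minimal async sketch for the synchronous-vs-asynchronous point above. It assumes the same `ChatOpenAI` setup as our agent; `answer_many` is a hypothetical helper name of ours, not a LangChain API:

```python
import asyncio

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0.5)

async def answer_many(questions: list[str]) -> list[str]:
    # ainvoke is the async twin of invoke; gather runs the calls concurrently
    # instead of waiting for each response one at a time.
    responses = await asyncio.gather(*(llm.ainvoke(q) for q in questions))
    return [str(r.content) for r in responses]

if __name__ == "__main__":
    answers = asyncio.run(answer_many([
        "Where is my order?",
        "How do I request a return?",
    ]))
    for answer in answers:
        print(answer)
```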
📝 Session Summary
Congratulations on taking your first step into learning LangChain!
In this session, we:
- Gained a deep understanding of LangChain's core value as an LLM orchestration framework, and how it makes developing our "Intelligent Support Knowledge Base" simpler and more efficient.
- Mastered two of the most basic yet important LangChain components: `LLM` (specifically `ChatOpenAI`) and `PromptTemplate`.
- Built a hands-on support copilot prototype capable of receiving user queries and generating responses based on a preset persona.
- Experienced firsthand how to translate abstract AI capabilities into concrete application features through practical coding.
- Learned about potential "gotchas" in the development process and the corresponding best practices to avoid them.
This is just the tip of the iceberg! Our current support copilot is still quite "naive." It has no memory, so every question feels like a first-time encounter. It also lacks external knowledge, relying solely on the LLM's general pre-trained knowledge to respond.
In upcoming lessons, we will gradually add "memory" so it can remember user context; connect it to a "knowledge base" so it can answer company-specific product and service questions; and even equip it with "tools" so it can look up orders, send emails, and more.
Ready? In the next session, we'll dive deep into LangChain's Chain mechanism and start weaving the complex logic of our support copilot! Stay curious, stay hungry!