Issue 04 | LangChain Basics: Building Your First Smart Customer Service Prototype with LLM and PromptTemplate

Updated on 4/18/2026

🎯 Learning Objectives for this Session

Hey there, future AI architects! I'm your mentor for this series, a ten-year veteran of the AI industry. Today we officially kick off this hardcore LangChain journey. Don't be nervous: we will start from the very basics and walk, step by step, through building your first production-grade AI application—our "Smart Customer Service Knowledge Base".

By the end of this session, you will achieve the following objectives:

  1. Understand the Core Value of LangChain: Grasp why LangChain is the "Swiss Army knife" for building complex LLM applications and what pain points it solves.
  2. Master Basic LangChain Components: Become familiar with LLM and PromptTemplate, which act as the "mouth" and "script" for your conversations with large models.
  3. Build a Smart Customer Service Prototype: Build, hands-on, a customer service assistant that can receive user questions and generate initial responses, and experience the thrill of going from zero to one.
  4. Establish a Developer Mindset: Learn how to materialize abstract AI capabilities into specific features within our customer service project, laying the groundwork for more complex functionalities later.

📖 Principle Analysis

Alright, enough chit-chat, let's get straight to business.

Imagine you are developing a smart customer service bot. A user asks: "What is my order status?" or "How do I apply for a return?". You can't just throw the user's question directly at the large model and expect it to immediately provide an accurate answer that complies with company policies, right? Although large models are powerful, they lack your business knowledge and the awareness to play the role of a customer service agent.

This is where LangChain shines. It is not a large model itself, but an orchestration framework—a "conductor" that helps you organize "instruments" like large models, data sources, and external tools to play a beautiful symphony together.

In our "Smart Customer Service Knowledge Base" project, the core value of LangChain lies in:

  1. Structured Input: User questions come in all shapes and sizes. LangChain helps us "translate" these questions into instructions that the large model can understand and process effectively.
  2. Unified Interface: Whether it's OpenAI's GPT series, Google's Gemini, or a locally deployed Llama 2, LangChain can call them using a unified set of interfaces, allowing you to switch at any time without refactoring your code.
  3. Modular Design: It breaks down the development process of large model applications into independent, reusable components (such as LLM, PromptTemplate, Chain, Memory, Tool, etc.). Like Lego bricks, you can freely combine them to build infinite possibilities.
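To make the "unified interface" idea concrete, here is a minimal, self-contained sketch in plain Python. The class and function names are made up for illustration (these are not real LangChain classes); the point is that any object exposing the same `invoke` method can be swapped in without touching the calling code:

```python
# Toy illustration of the "unified interface" idea (not real LangChain
# classes): two fake model backends share the same invoke() signature.

class FakeGPT:
    def invoke(self, prompt: str) -> str:
        return f"[gpt] {prompt}"

class FakeLlama:
    def invoke(self, prompt: str) -> str:
        return f"[llama] {prompt}"

def answer(llm, question: str) -> str:
    # The calling code depends only on invoke(), so switching providers
    # means changing one constructor line, not refactoring this function.
    return llm.invoke(question)

print(answer(FakeGPT(), "Hi"))    # [gpt] Hi
print(answer(FakeLlama(), "Hi"))  # [llama] Hi
```

This is exactly the shape of LangChain's chat model abstraction: `ChatOpenAI`, a Gemini wrapper, or a local Llama wrapper all answer to the same `invoke` call.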

In this session, we will focus on the two most basic yet most important components:

  • LLM (Large Language Model): This is the cornerstone of your interaction with large models. It encapsulates the calling logic for various large models, so you don't have to worry about the underlying API request details. You just need to tell it which model to use and pass in your input.
  • PromptTemplate: This is where you "write the script" for the large model. It allows you to define a template with placeholders and then dynamically fill in the content to generate the final prompt sent to the large model. This is crucial for ensuring the large model plays a specific role and follows a specific output format. Especially in our customer service scenario, we want the assistant to answer questions in a professional and friendly tone.

Now, let's use a simplified Mermaid diagram to understand how these two components collaborate in our smart customer service project to complete a basic Q&A flow.

graph TD
    A["User Input: 'I have a question about my order'"] --> B["PromptTemplate: set the customer service role and question format"]
    B --> C["Complete Prompt: 'You are a professional smart customer service agent. Please provide help based on the user's question. User question: I have a question about my order'"]
    C --> D["LLM: call the large model (e.g., gpt-3.5-turbo)"]
    D --> E["Model Output: 'Hello! What specific information would you like to know about your order?'"]
    E --> F["Smart customer service replies to the user"]

Principle Breakdown:

  1. User Input: The user asks a question on the customer service interface.
  2. PromptTemplate Processing: Our PromptTemplate will pre-define the "persona" of the customer service assistant (e.g., "You are a professional smart customer service assistant. Your duty is to answer user questions about products and services and provide solutions.") as well as output format requirements. Then, it will fill the user's specific question into this preset template.
  3. Generate Complete Prompt: After processing by the PromptTemplate, a structured, complete prompt containing the role setting and the specific question is born.
  4. LLM Invocation: This complete prompt is passed to the LLM component. The LLM is responsible for communicating with the actual large model (e.g., OpenAI's gpt-3.5-turbo), sending the request, and receiving the response.
  5. Large Model Output: The large model generates a corresponding response based on the received prompt.
  6. Smart Customer Service Reply: The smart customer service displays the large model's response to the user.

See that? LangChain is like a skilled chef. It takes the user's disorganized "ingredients" (questions), guides them through a "recipe" (PromptTemplate), puts them into the "oven" (LLM), and finally serves a delicious "dish" (answer). This is much better than just throwing raw meat directly into the oven!
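The six-step flow above can be sketched end to end in a few lines of plain Python. This is a toy re-implementation, not the real LangChain API; `fake_llm` is a stand-in for the actual model call:

```python
# Toy sketch of the Q&A flow: template -> complete prompt -> model -> reply.
# fake_llm stands in for a real model call; everything here is illustrative.

TEMPLATE = (
    "You are a professional smart customer service agent. "
    "Please provide help based on the user's question. "
    "User question: '{user_question}'"
)

def fake_llm(prompt: str) -> str:
    # A real LLM would generate text from the prompt; we return a fixed reply.
    return "Hello! What specific information would you like to know?"

def handle(user_question: str) -> str:
    full_prompt = TEMPLATE.format(user_question=user_question)  # steps 2-3
    return fake_llm(full_prompt)                                # steps 4-6

print(handle("I have a question about my order"))
# Hello! What specific information would you like to know?
```

Swap `TEMPLATE` for a real `PromptTemplate` and `fake_llm` for a real `LLM` component, and you have the architecture of this session's project.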

💻 Practical Code Drill (Specific Application in the Customer Service Project)

Alright, no matter how wonderfully the principles are explained, nothing beats writing code with your own hands. Now, let's build the "Hello World" version of our smart customer service. We will use Python for the demonstration because it is more popular in the AI field, but I will also provide a TypeScript code example later so you can experience LangChain's cross-language charm.

Environment Setup

First, you need to install LangChain and the client for your chosen large model provider. Here, we will use OpenAI as an example.

# Python Environment
pip install langchain-openai

# TypeScript / JavaScript Environment
npm install @langchain/openai @langchain/core

Don't forget to set your OpenAI API Key. The safest way is to set it as an environment variable.

# Execute in your terminal (macOS/Linux)
export OPENAI_API_KEY="sk-YOUR_OPENAI_API_KEY"

# Execute in your terminal (Windows PowerShell)
$env:OPENAI_API_KEY="sk-YOUR_OPENAI_API_KEY"
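Whichever way you set the key, it pays to fail fast at startup if it is missing. Here is a small standard-library helper you might use (an optional convenience, not part of LangChain; if you use python-dotenv, it works the same way by populating `os.environ`):

```python
import os

def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise ValueError(
            f"The {name} environment variable is not set. "
            "Set it before starting the application."
        )
    return value

# Example usage at application startup:
# api_key = require_env("OPENAI_API_KEY")
```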

Python Code Practice

We will build a simple SimpleSupportAgent class that can receive user questions and generate responses using LangChain's ChatOpenAI and PromptTemplate.

import os
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate, HumanMessagePromptTemplate, SystemMessagePromptTemplate

class SimpleSupportAgent:
    """
    A basic smart customer service agent used to demonstrate LangChain's LLM and PromptTemplate components.
    It can receive user questions and generate responses based on a preset customer service role.
    """
    def __init__(self, model_name: str = "gpt-3.5-turbo-0125", temperature: float = 0.7):
        """
        Initialize the smart customer service agent.
        :param model_name: The name of the OpenAI model to use.
        :param temperature: The sampling temperature (0 to 2 for OpenAI models; higher means more random and creative).
        """
        # Check if the API Key is set
        if not os.getenv("OPENAI_API_KEY"):
            raise ValueError("The OPENAI_API_KEY environment variable is not set. Please set your OpenAI API Key first.")

        # Initialize the ChatOpenAI model instance
        # This is the "brain" of our customer service assistant, responsible for understanding and generating text
        self.llm = ChatOpenAI(model_name=model_name, temperature=temperature)

        # Define the system-level prompt template for the customer service assistant
        # This is like setting a "script" for the assistant, telling it what role to play
        self.system_prompt = SystemMessagePromptTemplate.from_template(
            "You are a professional smart customer service assistant. Your duty is to answer user questions about products and services and provide solutions. "
            "Your answers should be friendly, clear, accurate, and as concise as possible. If a question is beyond your knowledge scope, please politely inform the user."
        )

        # Define the prompt template for user messages
        # This is the user input part, which will be dynamically filled into the complete Prompt
        self.human_prompt = HumanMessagePromptTemplate.from_template("{user_question}")

        # Combine the system prompt and user prompt into a complete chat prompt template
        # This is the complete "script" we show to the large model, including the role setting and the specific user question
        self.chat_prompt = ChatPromptTemplate.from_messages([
            self.system_prompt,
            self.human_prompt
        ])

    def get_response(self, user_question: str) -> str:
        """
        Generate a customer service response based on the user's question.
        :param user_question: The question asked by the user.
        :return: The response from the customer service assistant.
        """
        print(f"\n--- User Question ---\n{user_question}")

        # Format the user question using the chat_prompt template to generate the final Prompt
        # This step fills the user question into our preset "script"
        formatted_prompt = self.chat_prompt.format_messages(user_question=user_question)
        print(f"\n--- Prompt Sent to Large Model (Formatted) ---\n{formatted_prompt}")

        # Call the LLM model to get the response
        # The large model generates an answer based on the formatted Prompt
        response = self.llm.invoke(formatted_prompt)
        
        # Extract the response content
        ai_response_content = response.content
        print(f"\n--- Smart Customer Service Reply ---\n{ai_response_content}")
        return ai_response_content

# --- Simulate running our smart customer service ---
if __name__ == "__main__":
    try:
        # Instantiate our smart customer service agent
        # You can try different models or temperatures
        agent = SimpleSupportAgent(model_name="gpt-3.5-turbo", temperature=0.5)

        # Simulate user questions
        agent.get_response("My order number is 123456789, when will it be shipped?")
        agent.get_response("What is your company's latest product? What are its features?")
        agent.get_response("How do I apply for a return? What information do I need to provide?")
        agent.get_response("Please tell me a joke.") # Test a scenario beyond the knowledge scope
    except ValueError as e:
        print(f"Error: {e}")
        print("Please ensure the OPENAI_API_KEY environment variable is set.")
    except Exception as e:
        print(f"An unknown error occurred: {e}")

Code Breakdown:

  1. ChatOpenAI: We instantiated a ChatOpenAI object, specifying the model name (gpt-3.5-turbo-0125 is a newer version and is recommended) and temperature. temperature determines the "randomness" or "creativity" of the model's responses. For customer service scenarios, we usually want it to be more stable and accurate, so we set it to 0.5 or lower.
  2. SystemMessagePromptTemplate: This is the key to defining the "persona" of the customer service assistant. Through a template, we tell the large model that it is a "professional smart customer service assistant" and define its duties and response style. This is like putting a uniform on the customer service assistant and clarifying its job responsibilities.
  3. HumanMessagePromptTemplate: This is the placeholder for user input. {user_question} will be replaced by the user's specific question during the actual invocation.
  4. ChatPromptTemplate.from_messages: Combines the system prompt and the user prompt to form a complete conversation flow. LangChain automatically handles the formatting of these messages to meet the requirements of the large model API.
  5. chat_prompt.format_messages: In the get_response method, we use this method to fill the user's question into the template, generating the final list of Message objects to be sent to the large model.
  6. self.llm.invoke(formatted_prompt): This is where the large model is actually called. The invoke method receives the formatted prompt, sends it to the large model behind the ChatOpenAI instance, and returns the large model's response.
  7. response.content: Extracts the text content we need from the large model's response object.
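For intuition, the message list produced in step 5 corresponds, once serialized for a chat API, to a list of role/content pairs. Here is a plain-Python approximation (illustrative only; the real `format_messages` returns `SystemMessage`/`HumanMessage` objects, not dicts):

```python
# Illustrative only: what the formatted prompt boils down to once it is
# serialized for a chat-style API -- a list of role/content pairs.

def format_messages(system_template: str, human_template: str, **values) -> list[dict]:
    return [
        {"role": "system", "content": system_template.format(**values)},
        {"role": "user", "content": human_template.format(**values)},
    ]

messages = format_messages(
    "You are a professional smart customer service assistant.",
    "{user_question}",
    user_question="How do I apply for a return?",
)
# messages[0] carries the role setting; messages[1] carries the user's question.
```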

Run this code, and you will see how the smart customer service gives corresponding responses to different questions based on the role you set. Even for questions beyond its knowledge scope, it will respond politely. This is exactly where SystemMessagePromptTemplate comes into play!

TypeScript / JavaScript Code Practice (Optional)

If you are a frontend developer or prefer TypeScript, here is the corresponding implementation:

import { ChatOpenAI } from "@langchain/openai";
import { ChatPromptTemplate, SystemMessagePromptTemplate, HumanMessagePromptTemplate } from "@langchain/core/prompts";
import { BaseMessage } from "@langchain/core/messages";

// Ensure you have set the OPENAI_API_KEY environment variable
// Example: process.env.OPENAI_API_KEY = "sk-YOUR_OPENAI_API_KEY";

class SimpleSupportAgent {
    private llm: ChatOpenAI;
    private chatPrompt: ChatPromptTemplate;

    constructor(modelName: string = "gpt-3.5-turbo-0125", temperature: number = 0.7) {
        if (!process.env.OPENAI_API_KEY) {
            throw new Error("OPENAI_API_KEY environment variable is not set. Please set your OpenAI API Key.");
        }

        this.llm = new ChatOpenAI({
            modelName: modelName,
            temperature: temperature,
        });

        // Define the system-level prompt template for the customer service assistant
        const systemPrompt = SystemMessagePromptTemplate.fromTemplate(
            "You are a professional smart customer service assistant. Your duty is to answer user questions about products and services and provide solutions. " +
            "Your answers should be friendly, clear, accurate, and as concise as possible. If a question is beyond your knowledge scope, please politely inform the user."
        );

        // Define the prompt template for user messages
        const humanPrompt = HumanMessagePromptTemplate.fromTemplate("{user_question}");

        // Combine the system prompt and user prompt into a complete chat prompt template
        this.chatPrompt = ChatPromptTemplate.fromMessages([
            systemPrompt,
            humanPrompt,
        ]);
    }

    async getResponse(userQuestion: string): Promise<string> {
        console.log(`\n--- User Question ---\n${userQuestion}`);

        // Format the user question using the chatPrompt template to generate the final Prompt
        const formattedPrompt: BaseMessage[] = await this.chatPrompt.formatMessages({
            user_question: userQuestion,
        });
        console.log(`\n--- Prompt Sent to Large Model (Formatted) ---\n`, formattedPrompt);

        // Call the LLM model to get the response
        const response = await this.llm.invoke(formattedPrompt);

        // Extract the response content
        const aiResponseContent = response.content;
        console.log(`\n--- Smart Customer Service Reply ---\n${aiResponseContent}`);
        return String(aiResponseContent); // Ensure a string type is returned
    }
}

// --- Simulate running our smart customer service ---
async function main() {
    try {
        // Instantiate our smart customer service agent
        const agent = new SimpleSupportAgent("gpt-3.5-turbo", 0.5);

        // Simulate user questions
        await agent.getResponse("My order number is 123456789, when will it be shipped?");
        await agent.getResponse("What is your company's latest product? What are its features?");
        await agent.getResponse("How do I apply for a return? What information do I need to provide?");
        await agent.getResponse("Please tell me a joke."); // Test a scenario beyond the knowledge scope
    } catch (e: any) {
        console.error(`Error: ${e.message}`);
        console.error("Please ensure the OPENAI_API_KEY environment variable is set.");
    }
}

main();

The code logic in TypeScript is almost identical to the Python version, with only syntactical differences. This is exactly the brilliance of LangChain's design: it provides a unified cross-language abstraction, allowing you to build AI applications with a similar mindset regardless of the language you use.

Pitfalls and Troubleshooting Guide

As a veteran developer, I've seen too many beginners stumble in these areas, so I'm giving you a heads-up in advance:

  1. API Key Configuration Issues: This is the most common and basic error. The OPENAI_API_KEY environment variable must be set correctly! If you are running in an IDE, ensure the IDE's runtime environment also loads this variable. Sometimes, after directly using export in the terminal, the process started by the IDE might not inherit it. The safest way is to load it via the dotenv library in your code (not recommended for production) or ensure your deployment environment is configured correctly.
  2. Model Selection and Cost: gpt-3.5-turbo is a highly cost-effective choice, but if you are pursuing ultimate performance and the latest capabilities, you can try gpt-4-turbo or other more powerful models. However, please note that the more powerful the model, the higher the cost usually is. In the early stages of development, starting with gpt-3.5-turbo and gradually upgrading is a wise move.
  3. Initial Thoughts on Prompt Engineering:
    • Clear Role Setting: What role do you want the AI to play? Customer service, technical expert, sales? The clearer it is, the more the AI's response will meet expectations.
    • Explicit Instructions: What do you want the AI to do? Answer questions, provide suggestions, summarize? Give explicit instructions.
    • Define Boundaries: Tell the AI what it cannot do, or how it should respond under certain circumstances (e.g., "If the question is beyond your knowledge scope, please politely inform the user"). This is crucial for preventing the AI from "hallucinating" or giving irresponsible answers.
    • Iterative Optimization: A Prompt is not achieved overnight; it requires continuous testing and adjustment to reach the best effect. Treat it as the art of communicating with AI, rather than a one-time task.
  4. Synchronous/Asynchronous Invocation: In Python, llm.invoke() is a synchronous call. If your application needs to handle a large number of concurrent requests or you don't want to block the main thread, you need to consider using the asynchronous call llm.ainvoke(). TypeScript/JavaScript naturally supports async, so await agent.getResponse() is the standard practice.
  5. Uncontrollability of Output Format: Although we set the response style via SystemMessagePromptTemplate, a large model is ultimately not a rigid program; it still has a certain degree of freedom. Do not expect it to follow your exact format 100% of the time. If strict structured output is required, we will learn more advanced techniques later (such as using Pydantic output parsers).
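To illustrate point 4 above, here is a small asyncio sketch of the concurrent pattern. The model call is stubbed out (`fake_ainvoke` is a made-up stand-in; in real code you would `await llm.ainvoke(...)` in the same position):

```python
import asyncio

async def fake_ainvoke(question: str) -> str:
    # Stand-in for llm.ainvoke(): pretend the API call takes 0.1 s.
    await asyncio.sleep(0.1)
    return f"Answer to: {question}"

async def main() -> list[str]:
    questions = ["Order status?", "Return policy?", "Shipping time?"]
    # gather() runs the three "API calls" concurrently, so total wall time
    # is roughly one call's latency, not three stacked sequentially.
    return await asyncio.gather(*(fake_ainvoke(q) for q in questions))

replies = asyncio.run(main())
print(replies)
```

The same pattern applies to the `SimpleSupportAgent` above once its model call is switched to the asynchronous `ainvoke`.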

📝 Session Summary

Congratulations on taking your first step in learning LangChain!

In this session, we:

  • Deeply understood the core value of LangChain as an LLM orchestration framework, and how it makes the development of our "Smart Customer Service Knowledge Base" project simpler and more efficient.
  • Mastered the two most basic yet most important components of LangChain: LLM (especially ChatOpenAI) and PromptTemplate.
  • Hands-on built a smart customer service prototype capable of receiving user questions and generating responses based on a preset customer service role.
  • Intuitively experienced how to translate abstract AI capabilities into specific application features through practical code.
  • Learned in advance about the "pitfalls" you might encounter during development and the corresponding "troubleshooting guide".

This is just the tip of the iceberg! The current customer service assistant is still quite "naive"; it has no memory, and every question feels like a first meeting. It also lacks external knowledge and can only reply based on the large model's own general knowledge.

In the upcoming lessons, we will gradually add "memory" to it so it can remember the user's context; connect it to a "knowledge base" so it can answer questions specific to our company's products and services; and even equip it with "tools" so it can query orders, send emails, and more.

Are you ready? In the next session, we will dive deep into LangChain's Chain mechanism and start weaving the complex logic of our customer service assistant! Stay curious, stay hungry!