Episode 20 | Agent Evaluation and Feedback: LLM-as-a-judge (EN)
Welcome back, future Full-Stack AI Masters! I'm your instructor—a 10-year AI veteran and passionate tech educator. I'm thrilled to have you back in the LangChain Masterclass: Zero to Production AI Applications.
Over the past two sessions, we laid the groundwork and got acquainted with LangChain, discovering why it's the "Swiss Army Knife" for building complex LLM applications. But having a great tool isn't enough—we need to know how to wield it and effectively communicate with the underlying AI models.
Today, we are going to demystify the most core, artistic, and often overlooked aspect of AI applications: Prompt Engineering. In our "Intelligent Support Copilot" project, Prompt Engineering is essentially the "soul painter" of our AI assistant. It determines whether our AI is just a parrot repeating words, or a truly intelligent partner capable of understanding user intent and delivering accurate, empathetic service.
🎯 Learning Objectives for This Episode
In this episode, you won't just learn the theory of Prompt Engineering; you will practically apply it to our intelligent support project, achieving a qualitative leap in your AI application skills:
- Understand the core principles of Prompt Engineering and its importance in AI apps: Learn how to communicate efficiently with large models through well-designed prompts to achieve specific tasks.
- Master the basics of PromptTemplate in LangChain: Learn how to leverage LangChain's powerful tools to build reusable, customizable instructions for our support copilot.
- Learn to design structured, clear Prompts: Enhance the copilot's ability to understand user intent, extract key information, and generate precise responses, saying goodbye to irrelevant answers.
- Explore Few-shot Prompting: Understand how providing a few examples helps the model better grasp task patterns and expected output formats, further elevating the quality and consistency of support responses.
📖 Core Concepts
Listen up, everyone! If you think Prompt Engineering is just about asking an AI a simple question, you are gravely mistaken. It's not a basic Q&A game; it's the art of communicating with AI. You need to learn how to speak clearly to this incredibly smart "toddler" and articulate your needs flawlessly.
In our "Intelligent Support Copilot" project, users might ask all sorts of questions: "Where is my order?", "How do I apply for a return?", "How do I use my membership points?", or even drop vague complaints. Our support copilot needs to do more than just find answers; it must understand the user's true intent, extract key information, and then respond in a professional, friendly manner that aligns with the brand's tone. Behind all of this is the magic of Prompt Engineering.
What is Prompt Engineering? Simply put, Prompt Engineering is the process of designing and optimizing text instructions (Prompts) fed into Large Language Models (LLMs) to guide them toward generating desired outputs. It encompasses a suite of techniques and strategies aimed at maximizing LLM performance, making it more accurate, relevant, and efficient for specific tasks.
Why is Prompt Engineering Crucial for Intelligent Support?
- Intent Understanding: Natural language inputs from users are often ambiguous. Prompts guide the LLM to identify the core intent behind a user's question, such as checking order status, processing returns, or seeking technical support.
- Information Extraction: Extracting key entities from user inputs—like order numbers, product names, or issue descriptions—provides the necessary parameters for subsequent knowledge base retrievals or API calls.
- Response Generation: Once relevant information is retrieved, prompts instruct the LLM to generate the final reply with a specific tone, format, and content, ensuring professionalism, accuracy, and user-friendliness.
- Role-playing & Constraints: Through prompts, we can instruct the LLM to act as a "senior customer service expert" and restrict its scope of answers, preventing it from generating irrelevant or inappropriate content.
A good prompt typically contains the following core elements:
- Instruction: Clearly tells the model what to do.
- Context: Provides necessary background information to help the model better understand the task.
- Input Data: The actual user query or data to be processed.
- Output Format: Explicitly defines the expected output structure (e.g., JSON, list, paragraph).
- Examples (Few-shot): Provides a few input-output pairs to help the model learn the task pattern.
- Constraints: Restricts the model's behavior, such as tone, length, or forbidden topics.
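To make these elements concrete, here is a minimal sketch in plain Python (no LangChain yet) that assembles a prompt from instruction, context, input data, and output format; all variable names are illustrative, not part of any API:

```python
# Assemble a prompt from the core elements listed above.
# All names here are illustrative, not a library API.
instruction = "Classify the user's support query into one intent category."
context = "You are a support bot for an e-commerce store."
input_data = "Where is my package?"
output_format = "Return ONLY the category name, e.g. ORDER_STATUS."

prompt = f'{context}\n\n{instruction}\n{output_format}\n\nUser query: "{input_data}"'
print(prompt)
```

In practice you rarely concatenate strings by hand; the PromptTemplate abstraction covered below exists precisely to manage this assembly.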
Mermaid Diagram: Prompt Engineering Workflow in Intelligent Support
With these concepts in mind, let's look at the central role Prompt Engineering plays in our support copilot project. It acts as a translator—translating human needs into a language AI understands, and then translating the AI's "thoughts" back into human-readable answers.
```mermaid
graph TD
    A[User Inputs Raw Question] --> B{Prompt Engineering - Intent Understanding}
    B --> C[Prompt Template for Intent Classification]
    C --> D["LLM (e.g., GPT-4)"]
    D --> E{"Classified Intent: ORDER_STATUS, REFUND, PRODUCT_INFO..."}
    E -- Match Intent --> F[Retrieve Info from Knowledge Base/DB]
    F --> G{Prompt Engineering - Response Generation}
    G --> H[Prompt Template for Response Generation]
    H --> I["LLM (e.g., GPT-4)"]
    I --> J[Generate Final Support Response]
    J --> K[Return to User]
    subgraph Core Role of Prompt Engineering
        C
        H
    end
    style B fill:#f9f,stroke:#333,stroke-width:2px
    style G fill:#f9f,stroke:#333,stroke-width:2px
```

As you can see from the diagram, whether it's understanding user intent or generating the final response based on retrieved information, Prompt Engineering plays the core role of the "soul painter." It ensures the AI "gets it" instead of going off the rails.
PromptTemplate in LangChain
LangChain provides a highly developer-friendly abstraction for Prompt Engineering: PromptTemplate. It allows us to define a string template with placeholders and dynamically fill them to generate the final prompt. This drastically improves the maintainability and reusability of prompts.
Imagine if, for every user question, you had to manually construct a massive string containing all instructions, context, and the user's query. That would be a nightmare! PromptTemplate is here to save the day.
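As a mental model (this is an analogy, not LangChain's actual implementation), a PromptTemplate behaves much like Python's built-in str.format: a template string with named placeholders that get filled in at call time. LangChain's real class adds input validation, composition, and chat-message support on top:

```python
# A tiny stand-in for PromptTemplate: a template string with {placeholders}
# filled in at call time. Illustrative only, not the LangChain API itself.
template = "You are a support agent for {brand}. Answer the question: {question}"
prompt = template.format(brand="Acme", question="How do I reset my password?")
print(prompt)
```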
💻 Hands-on Code Practice (Application in the Copilot Project)
Alright, enough theory—it's time to roll up our sleeves and get to work! We will arm our intelligent support copilot with LangChain's PromptTemplate.
Scenario 1: Intent Classification - Helping the Copilot Understand What the User Wants
In intelligent customer support, the first step is usually understanding the user's intent. For example, if a user asks, "Where is my package?", we should identify this as an "order status inquiry"; if they ask, "I bought the wrong item, can I return it?", it's a "return policy consultation."
We will build a Prompt Template to have the LLM help us classify intents.
```python
# Python Code Example
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI
from langchain_core.output_parsers import StrOutputParser
import os

# Set your OpenAI API Key.
# In a real project, load it from environment variables or a secrets manager.
# os.environ["OPENAI_API_KEY"] = "YOUR_OPENAI_API_KEY"  # Replace with your actual API Key

print("--- Scenario 1: Intent Classification - Helping the Copilot Understand What the User Wants ---")

# 1. Define the PromptTemplate for intent classification.
# We give the model clear instructions, list the possible intents,
# and ask it to return ONLY the intent name.
intent_classification_template = """
You are a professional e-commerce customer support intent classification bot. Your task is to categorize the user's query into the most appropriate intent category.
Please select one from the available categories below and return ONLY the category name. If no category matches, return "UNKNOWN".
Available categories:
- ORDER_STATUS: Check order status, logistics info
- REFUND_RETURN: Apply for return/refund, check return policy
- PRODUCT_INFO: Check product details, inventory, features, recommendations
- ACCOUNT_MANAGEMENT: Account issues, password reset, personal info modification
- TECHNICAL_SUPPORT: Technical glitches, website usage issues
- COMPLAINT_SUGGESTION: Complaints, suggestions, feedback
- OTHER: Other general questions not belonging to the above categories
User query: "{user_query}"
Please select the most matching intent category:
"""

# Create a formattable Prompt object from the template string
intent_prompt = PromptTemplate.from_template(intent_classification_template)

# 2. Initialize the LLM.
# We use OpenAI's gpt-3.5-turbo. temperature=0 asks for deterministic,
# consistent answers, which is crucial for classification tasks.
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# 3. Build the LangChain chain: PromptTemplate | LLM | OutputParser
intent_chain = intent_prompt | llm | StrOutputParser()

# 4. Simulate user queries and get intent classifications
user_queries = [
    "Where is my package?",
    "The clothes I bought are the wrong size, how do I return them?",
    "How is the battery life of this phone?",
    "My account seems to be hacked, I can't log in.",
    "Your website is too laggy, can you optimize it?",
    "What are you having for dinner today?",  # An irrelevant question
    "I have a question about the product material.",  # Another product-info question
]

print("\n--- Starting Intent Classification Test ---")
for query in user_queries:
    # chain.invoke() executes the chain with the given user query
    classified_intent = intent_chain.invoke({"user_query": query})
    print(f"User Query: '{query}' -> Classified Intent: '{classified_intent}'")
```
TypeScript Code Example (Conceptual demo, requires environment setup to run):

```typescript
import { PromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";
import { StringOutputParser } from "@langchain/core/output_parsers";

// Set your OpenAI API Key
// process.env.OPENAI_API_KEY = "YOUR_OPENAI_API_KEY";

console.log("--- Scenario 1: Intent Classification - Helping the Copilot Understand What the User Wants ---");

const intentClassificationTemplate = `
You are a professional e-commerce customer support intent classification bot. Your task is to categorize the user's query into the most appropriate intent category.
Please select one from the available categories below and return ONLY the category name. If no category matches, return "UNKNOWN".
Available categories:
- ORDER_STATUS: Check order status, logistics info
- REFUND_RETURN: Apply for return/refund, check return policy
- PRODUCT_INFO: Check product details, inventory, features, recommendations
- ACCOUNT_MANAGEMENT: Account issues, password reset, personal info modification
- TECHNICAL_SUPPORT: Technical glitches, website usage issues
- COMPLAINT_SUGGESTION: Complaints, suggestions, feedback
- OTHER: Other general questions not belonging to the above categories
User query: "{user_query}"
Please select the most matching intent category:
`;

const intentPrompt = PromptTemplate.fromTemplate(intentClassificationTemplate);
const llm = new ChatOpenAI({ modelName: "gpt-3.5-turbo", temperature: 0 });
const intentChain = intentPrompt.pipe(llm).pipe(new StringOutputParser());

const userQueries = [
  "Where is my package?",
  "The clothes I bought are the wrong size, how do I return them?",
  "How is the battery life of this phone?",
  "My account seems to be hacked, I can't log in.",
  "Your website is too laggy, can you optimize it?",
  "What are you having for dinner today?",
  "I have a question about the product material.",
];

async function runIntentClassification() {
  console.log("\n--- Starting Intent Classification Test ---");
  for (const query of userQueries) {
    const classifiedIntent = await intentChain.invoke({ user_query: query });
    console.log(`User Query: '${query}' -> Classified Intent: '${classifiedIntent}'`);
  }
}

runIntentClassification();
```
Example Output:

```
--- Scenario 1: Intent Classification - Helping the Copilot Understand What the User Wants ---

--- Starting Intent Classification Test ---
User Query: 'Where is my package?' -> Classified Intent: 'ORDER_STATUS'
User Query: 'The clothes I bought are the wrong size, how do I return them?' -> Classified Intent: 'REFUND_RETURN'
User Query: 'How is the battery life of this phone?' -> Classified Intent: 'PRODUCT_INFO'
User Query: 'My account seems to be hacked, I can't log in.' -> Classified Intent: 'ACCOUNT_MANAGEMENT'
User Query: 'Your website is too laggy, can you optimize it?' -> Classified Intent: 'COMPLAINT_SUGGESTION'
User Query: 'What are you having for dinner today?' -> Classified Intent: 'UNKNOWN'
User Query: 'I have a question about the product material.' -> Classified Intent: 'PRODUCT_INFO'
```
See that? With a simple PromptTemplate, we successfully guided the LLM to perfectly execute the intent classification task! That's the magic of Prompt Engineering.
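One practical hardening step worth adding on top of this chain (my suggestion, not something from the LangChain docs): LLMs occasionally return stray whitespace, quotes, or an off-list label, so normalize and validate the classification before routing on it. A minimal sketch, with an illustrative helper name:

```python
# Normalize and validate the classifier output before routing on it.
# ALLOWED_INTENTS mirrors the categories in the prompt above;
# normalize_intent is an illustrative helper, not a LangChain API.
ALLOWED_INTENTS = {
    "ORDER_STATUS", "REFUND_RETURN", "PRODUCT_INFO",
    "ACCOUNT_MANAGEMENT", "TECHNICAL_SUPPORT",
    "COMPLAINT_SUGGESTION", "OTHER", "UNKNOWN",
}

def normalize_intent(raw: str) -> str:
    """Strip whitespace/quotes, uppercase, and fall back to UNKNOWN for anything off-list."""
    candidate = raw.strip().strip('"\'').upper()
    return candidate if candidate in ALLOWED_INTENTS else "UNKNOWN"

print(normalize_intent("  ORDER_STATUS \n"))   # -> ORDER_STATUS
print(normalize_intent("I think it's REFUND")) # -> UNKNOWN
```

In the chain above, this could run as a final step after StrOutputParser, so downstream routing only ever sees a known label.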
Scenario 2: Response Generation - Helping the Copilot Give Professional, Friendly Replies
In intelligent customer support, merely classifying the intent isn't enough. We also need to generate a helpful, polite response based on the user's intent and the information retrieved from the knowledge base.
Suppose we have identified the user intent as ORDER_STATUS and have queried the order information from the database (e.g., Order ID 123456, Status Shipped, Estimated Delivery 2023-10-26). Now, let's build a Prompt Template to have the LLM generate the final support response.
```python
# Python Code Example (continuing the same script as Scenario 1:
# PromptTemplate, ChatOpenAI, and StrOutputParser are already imported)
print("\n--- Scenario 2: Response Generation - Helping the Copilot Give Professional, Friendly Replies ---")

# 1. Define the PromptTemplate for response generation.
# We give the model a customer service role, along with specific reply requirements and formats.
response_generation_template = """
You are a friendly and professional e-commerce customer support agent. Your task is to generate a polite and clear order status reply for the user based on the provided order information.
Please ensure the reply includes the following:
1. Greet the user first.
2. Clearly state the order ID.
3. Clearly explain the current order status.
4. If an estimated delivery date is provided, include it as well.
5. Provide next-step advice or reassurance, such as "Please wait patiently" or "Feel free to contact us if you have any other questions".
6. The tone of the reply should be positive and friendly.
Order Information:
Order ID: {order_id}
Order Status: {order_status}
Estimated Delivery Date: {delivery_date}
User Query: "{user_query}"
Please generate the support response:
"""

response_prompt = PromptTemplate.from_template(response_generation_template)

# 2. Same model, but a slightly higher temperature makes the reply sound more natural
llm_for_response = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.7)

# 3. Build the LangChain chain
response_chain = response_prompt | llm_for_response | StrOutputParser()

# 4. Simulate retrieved order info and generate responses
order_data_1 = {
    "order_id": "EC20231025001",
    "order_status": "Shipped",
    "delivery_date": "October 28, 2023",
    "user_query": "Where is my order EC20231025001?",
}
order_data_2 = {
    "order_id": "EC20231024005",
    "order_status": "Packing",
    "delivery_date": "No exact date yet, expected to ship within 3-5 business days",
    "user_query": "When will my order EC20231024005 be shipped?",
}

print("\n--- Starting Response Generation Test ---")
print(f"\nUser Query: '{order_data_1['user_query']}'")
generated_response_1 = response_chain.invoke(order_data_1)
print("Support Response:\n", generated_response_1)
print(f"\nUser Query: '{order_data_2['user_query']}'")
generated_response_2 = response_chain.invoke(order_data_2)
print("Support Response:\n", generated_response_2)
```
TypeScript Code Example (Conceptual demo, requires environment setup to run):

```typescript
import { PromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";
import { StringOutputParser } from "@langchain/core/output_parsers";

// process.env.OPENAI_API_KEY = "YOUR_OPENAI_API_KEY";

console.log("\n--- Scenario 2: Response Generation - Helping the Copilot Give Professional, Friendly Replies ---");

const responseGenerationTemplate = `
You are a friendly and professional e-commerce customer support agent. Your task is to generate a polite and clear order status reply for the user based on the provided order information.
Please ensure the reply includes the following:
1. Greet the user first.
2. Clearly state the order ID.
3. Clearly explain the current order status.
4. If an estimated delivery date is provided, include it as well.
5. Provide next-step advice or reassurance, such as "Please wait patiently" or "Feel free to contact us if you have any other questions".
6. The tone of the reply should be positive and friendly.
Order Information:
Order ID: {order_id}
Order Status: {order_status}
Estimated Delivery Date: {delivery_date}
User Query: "{user_query}"
Please generate the support response:
`;

const responsePrompt = PromptTemplate.fromTemplate(responseGenerationTemplate);
const llmForResponse = new ChatOpenAI({ modelName: "gpt-3.5-turbo", temperature: 0.7 });
const responseChain = responsePrompt.pipe(llmForResponse).pipe(new StringOutputParser());

const orderData1 = {
  order_id: "EC20231025001",
  order_status: "Shipped",
  delivery_date: "October 28, 2023",
  user_query: "Where is my order EC20231025001?",
};
const orderData2 = {
  order_id: "EC20231024005",
  order_status: "Packing",
  delivery_date: "No exact date yet, expected to ship within 3-5 business days",
  user_query: "When will my order EC20231024005 be shipped?",
};

async function runResponseGeneration() {
  console.log("\n--- Starting Response Generation Test ---");
  console.log(`\nUser Query: '${orderData1.user_query}'`);
  const generatedResponse1 = await responseChain.invoke(orderData1);
  console.log("Support Response:\n", generatedResponse1);
  console.log(`\nUser Query: '${orderData2.user_query}'`);
  const generatedResponse2 = await responseChain.invoke(orderData2);
  console.log("Support Response:\n", generatedResponse2);
}

runResponseGeneration();
```
Example Output:

```
--- Scenario 2: Response Generation - Helping the Copilot Give Professional, Friendly Replies ---

--- Starting Response Generation Test ---

User Query: 'Where is my order EC20231025001?'
Support Response:
Hello! The current status of your order EC20231025001 is [Shipped], and it is estimated to be delivered on October 28, 2023. Please wait patiently; we will notify you promptly once the logistics information is updated. If you have any other questions, feel free to contact us anytime!

User Query: 'When will my order EC20231024005 be shipped?'
Support Response:
Hello! The current status of your order EC20231024005 is [Packing]. We apologize that there is no exact delivery date at the moment, but it is expected to ship within 3-5 business days. Please rest assured that we will process and arrange the shipment as soon as possible. Thank you for your patience!
```
Isn't that cool? Our support copilot can now not only identify intents but also generate personalized, professional, and friendly replies based on specific information! This is the power of Prompt Engineering. Through carefully crafted prompts, it's as if we've injected a soul into the AI, making it truly "empathetic."
Scenario 3: Few-shot Prompting - Leading by Example for Faster, More Accurate Learning
Sometimes, merely describing a task through instructions isn't enough; the model might fail to fully capture the nuances or the exact output format we expect. This is where Few-shot Prompting comes in handy. By providing a small number of high-quality input-output examples, we can directly show the model "how you want it done."
For an intelligent support copilot, Few-shot Prompting can be used to:
- Ensure consistent reply formats for specific types of questions.
- Guide the model to use specific vocabulary or tone in particular situations.
- Help the model handle edge cases or complex user expressions.
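Under the hood, a few-shot prompt is just concatenation: prefix + formatted examples + suffix. This plain-Python sketch (no LangChain required; names are illustrative) shows the kind of final string a few-shot template produces:

```python
# Manually assemble a few-shot prompt: prefix + formatted examples + suffix.
# This mirrors what a few-shot template automates; names are illustrative.
examples = [
    {"input": "Thanks, issue resolved!", "output": "Glad I could help!"},
    {"input": "Great service!", "output": "Thank you for the kind words!"},
]
prefix = "Reply to the user in the same style as these examples:\n"
suffix = "\nUser: {user_input}\nSupport:"

formatted = "\n\n".join(
    f"User: {ex['input']}\nSupport: {ex['output']}" for ex in examples
)
prompt = prefix + formatted + suffix.format(user_input="Thanks a lot!")
print(prompt)
```

Seeing the assembled string makes it easier to debug few-shot prompts: if the model imitates the wrong thing, print the final prompt and check what it actually saw.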
```python
# Python Code Example (continuing the same script; ChatOpenAI and
# StrOutputParser are already imported in Scenario 1)
from langchain_core.prompts import FewShotPromptTemplate, PromptTemplate

print("\n--- Scenario 3: Few-shot Prompting - Leading by Example for Faster, More Accurate Learning ---")

# 1. Define Few-shot examples.
# Suppose we want the model to give a warmer, more humanized reply when the
# user expresses gratitude, rather than a flat "You're welcome".
examples = [
    {
        "input": "Thank you so much for your help, the issue is resolved!",
        "output": "I'm so glad I could help! If you have any further questions in the future, please feel free to contact us. We are always happy to assist!",
    },
    {
        "input": "Thanks, your service is awesome!",
        "output": "Thank you for your recognition, it's our greatest motivation! We look forward to providing you with even better service!",
    },
    {
        "input": "That's amazing, thanks!",
        "output": "You're very welcome, I'm glad I could help! Wishing you a wonderful day!",
    },
]

# 2. Define the Prompt Template for a single example.
# This template formats the input and output of each example.
example_formatter_template = """
User: {input}
Support: {output}
"""
example_prompt = PromptTemplate.from_template(example_formatter_template)

# 3. Define the FewShotPromptTemplate.
# It combines all examples with the user's actual query.
few_shot_prompt = FewShotPromptTemplate(
    examples=examples,  # List of examples
    example_prompt=example_prompt,  # Prompt Template used to format each example
    prefix="You are a friendly and helpful customer support bot. Please refer to the following dialogue examples and reply to the user in a similar style and tone:\n",  # Prefix instruction
    suffix="\nUser: {user_input}\nSupport:",  # Suffix containing the actual user input and the part the model completes
    input_variables=["user_input"],  # Variables supplied at invoke time
    example_separator="\n\n",  # Separator between examples
)

# 4. Initialize the LLM (a slightly higher temperature makes replies more varied)
llm_for_fewshot = ChatOpenAI(model="gpt-3.5-turbo", temperature=0.7)

# 5. Build the LangChain chain
fewshot_chain = few_shot_prompt | llm_for_fewshot | StrOutputParser()

# 6. Simulate user input and get responses
user_input_1 = "Thank you for your patient answers!"
user_input_2 = "Your customer service is so efficient, thanks!"

print("\n--- Starting Few-shot Response Generation Test ---")
print(f"\nUser Input: '{user_input_1}'")
generated_fewshot_response_1 = fewshot_chain.invoke({"user_input": user_input_1})
print("Support Response:\n", generated_fewshot_response_1)
print(f"\nUser Input: '{user_input_2}'")
generated_fewshot_response_2 = fewshot_chain.invoke({"user_input": user_input_2})
print("Support Response:\n", generated_fewshot_response_2)
```
TypeScript Code Example (Conceptual demo, requires environment setup to run; mirrors the Python version above):

```typescript
import { FewShotPromptTemplate, PromptTemplate } from "@langchain/core/prompts";
import { ChatOpenAI } from "@langchain/openai";
import { StringOutputParser } from "@langchain/core/output_parsers";

// process.env.OPENAI_API_KEY = "YOUR_OPENAI_API_KEY";

console.log("\n--- Scenario 3: Few-shot Prompting - Leading by Example for Faster, More Accurate Learning ---");

const examples = [
  {
    input: "Thank you so much for your help, the issue is resolved!",
    output: "I'm so glad I could help! If you have any further questions in the future, please feel free to contact us. We are always happy to assist!",
  },
  {
    input: "Thanks, your service is awesome!",
    output: "Thank you for your recognition, it's our greatest motivation! We look forward to providing you with even better service!",
  },
  {
    input: "That's amazing, thanks!",
    output: "You're very welcome, I'm glad I could help! Wishing you a wonderful day!",
  },
];

// Template used to format each individual example
const examplePrompt = PromptTemplate.fromTemplate(`
User: {input}
Support: {output}
`);

// Combine the examples with the user's actual input
const fewShotPrompt = new FewShotPromptTemplate({
  examples,
  examplePrompt,
  prefix:
    "You are a friendly and helpful customer support bot. Please refer to the following dialogue examples and reply to the user in a similar style and tone:\n",
  suffix: "\nUser: {user_input}\nSupport:",
  inputVariables: ["user_input"],
  exampleSeparator: "\n\n",
});

const llmForFewshot = new ChatOpenAI({ modelName: "gpt-3.5-turbo", temperature: 0.7 });
const fewshotChain = fewShotPrompt.pipe(llmForFewshot).pipe(new StringOutputParser());

async function runFewShot() {
  console.log("\n--- Starting Few-shot Response Generation Test ---");
  for (const userInput of [
    "Thank you for your patient answers!",
    "Your customer service is so efficient, thanks!",
  ]) {
    console.log(`\nUser Input: '${userInput}'`);
    const response = await fewshotChain.invoke({ user_input: userInput });
    console.log("Support Response:\n", response);
  }
}

runFewShot();
```