Ep 13: Retaining Memories — Window Buffer Memory & Conversation State Persistence

⏱ Est. reading time: 10 min · Updated on 4/9/2026

Why Agents Need Memory

LLMs are stateless: every call starts from scratch. Without memory:

sequenceDiagram
    participant User as 👤 User
    participant AI as 🤖 Memoryless Agent
    User->>AI: My name is Alice
    AI-->>User: Nice to meet you, Alice!
    User->>AI: What's my name?
    AI-->>User: Sorry, I don't know your name. 😅

Memory nodes inject conversation history into the prompt before each LLM call, making the model "appear" to remember.


1. Window Buffer Memory

Maintains a fixed-size sliding window of the most recent N messages.

graph TB
    subgraph "Window Buffer (size = 6)"
        subgraph "Turn 1"
            M1["👤 Hello"] --> M2["🤖 Hi!"]
        end
        subgraph "Turn 2"
            M3["👤 Weather?"] --> M4["🤖 Sunny 28°C"]
        end
        subgraph "Turn 3"
            M5["👤 Tomorrow?"] --> M6["🤖 Cloudy"]
        end
        subgraph "Turn 4 (window slides!)"
            M7["👤 Day after?"] --> M8["🤖 Rainy"]
        end
    end
    M1 -.->|"❌ Evicted"| Trash[🗑️]
    M2 -.->|"❌ Evicted"| Trash
    style Trash fill:#ef4444,stroke:#dc2626,color:#fff
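The sliding window above can be sketched in a few lines. This is a minimal illustration, not n8n's actual implementation — the class name and methods (`WindowBufferMemory`, `add`, `load`) are made up for this example:

```javascript
// A window buffer keeps only the most recent `size` messages,
// evicting the oldest ones first as new messages arrive.
class WindowBufferMemory {
  constructor(size = 6) {
    this.size = size;
    this.messages = [];
  }

  add(role, content) {
    this.messages.push({ role, content });
    // Slide the window: drop the oldest messages beyond the limit
    while (this.messages.length > this.size) {
      this.messages.shift();
    }
  }

  load() {
    return [...this.messages];
  }
}

// Replay the four turns from the diagram (8 messages, window of 6)
const memory = new WindowBufferMemory(6);
["Hello", "Hi!", "Weather?", "Sunny 28°C", "Tomorrow?", "Cloudy", "Day after?", "Rainy"]
  .forEach((text, i) => memory.add(i % 2 === 0 ? "user" : "assistant", text));

console.log(memory.load().length);     // 6 — window is full
console.log(memory.load()[0].content); // "Weather?" — "Hello"/"Hi!" were evicted
```

Exactly as in the diagram, Turn 1 ("Hello" / "Hi!") falls out of the window once Turn 4 arrives.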

Window Size Guide

| Scenario | Window Size | Reason |
| --- | --- | --- |
| Quick FAQ | 4-6 | 2-3 turns are usually enough |
| Tech Support | 10-20 | Needs the problem's context |
| Deep Consultation | 30-50 | Full conversation needed |

2. Session Isolation

graph TB
    CT[💬 Chat Trigger]
    CT -->|"sessionId: alice"| Agent[🤖 AI Agent]
    CT -->|"sessionId: bob"| Agent
    Agent --> Memory[💾 Memory]
    Memory --> S1["📂 alice: [hello, weather...]"]
    Memory --> S2["📂 bob: [help me code...]"]
    style Memory fill:#22c55e,stroke:#16a34a,color:#fff
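Session isolation boils down to keying the message store by `sessionId`. A hypothetical sketch (the helper names `getSessionMemory` and `remember` are invented for illustration):

```javascript
// One store, keyed by sessionId, so Alice's and Bob's histories never mix.
const sessions = new Map();

function getSessionMemory(sessionId) {
  if (!sessions.has(sessionId)) {
    sessions.set(sessionId, []);
  }
  return sessions.get(sessionId);
}

function remember(sessionId, role, content) {
  getSessionMemory(sessionId).push({ role, content });
}

// Two users talk to the same agent through the same memory node
remember("alice", "user", "hello");
remember("alice", "user", "weather?");
remember("bob", "user", "help me code");

console.log(getSessionMemory("alice").length); // 2
console.log(getSessionMemory("bob").length);   // 1
```

Each Chat Trigger supplies its own `sessionId`, so the memory node reads and writes only that user's folder — Bob never sees Alice's weather questions.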

3. How Memory Injects Into LLM Calls

// ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
// WITHOUT Memory:
// ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
const messages = [
  { role: "system", content: "You are an AI assistant" },
  { role: "user", content: "What about tomorrow?" }  // No context!
];

// ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
// WITH Memory (auto-injected):
// ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
const messages = [
  { role: "system", content: "You are an AI assistant" },
  { role: "user", content: "Weather in Beijing?" },     // History
  { role: "assistant", content: "Sunny, 28°C" },        // History
  { role: "user", content: "What about tomorrow?" }      // Current
];
// Model now knows "tomorrow" means Beijing weather
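The injection step itself is simple to sketch: splice the stored history between the system prompt and the current user message on every call. The function name `buildMessages` is an assumption for this example, not a real API:

```javascript
// Prepend stored history before the current turn on each LLM call
function buildMessages(systemPrompt, history, userInput) {
  return [
    { role: "system", content: systemPrompt },
    ...history,                            // injected memory
    { role: "user", content: userInput },  // current turn
  ];
}

const history = [
  { role: "user", content: "Weather in Beijing?" },
  { role: "assistant", content: "Sunny, 28°C" },
];

const messages = buildMessages(
  "You are an AI assistant",
  history,
  "What about tomorrow?"
);
console.log(messages.length); // 4 — system + 2 history + current
```

This is all "memory" is at the wire level: the model never remembers anything; the workflow re-sends the relevant history every time.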

Memory Type Comparison

| Type | Mechanism | Pros | Cons |
| --- | --- | --- | --- |
| Window Buffer | Keep last N messages | Simple, reliable | Loses everything outside the window |
| Token Buffer | Truncate by token count | Precise context control | Slightly more complex |
| Summary | LLM summarizes history | Retains key points longer | May lose details; extra API cost |
| Vector Store | Embed history into a vector DB | True "long-term memory" | Requires vector DB setup |
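To make the Token Buffer row concrete, here is a rough sketch of trimming by token budget rather than message count. The 4-characters-per-token estimate is a crude stand-in for a real tokenizer, and both function names are invented for this example:

```javascript
// Crude token estimate — real systems use the model's actual tokenizer
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Drop the oldest messages until the history fits the token budget
function trimToTokenBudget(history, maxTokens) {
  const trimmed = [...history];
  let total = trimmed.reduce((sum, m) => sum + estimateTokens(m.content), 0);
  while (total > maxTokens && trimmed.length > 0) {
    total -= estimateTokens(trimmed.shift().content); // oldest first
  }
  return trimmed;
}

const history = [
  { role: "user", content: "12345678" },      // ~2 tokens
  { role: "assistant", content: "12345678" }, // ~2 tokens
  { role: "user", content: "12345678" },      // ~2 tokens
];

console.log(trimToTokenBudget(history, 4).length); // 2 — oldest message dropped
```

Compared with the Window Buffer, this bounds the prompt by size rather than turn count, which matters when individual messages vary widely in length.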

Next Episode

In Ep 14, we equip the Agent with real external Tools, enabling it not just to "talk" but to "act": calculate, search, and call APIs.