# Ep 13: Retaining Memories — Window Buffer Memory & Conversation State Persistence
## Why Agents Need Memory
LLMs are stateless — every call starts from scratch. Without memory:
```mermaid
sequenceDiagram
    participant User as 👤 User
    participant AI as 🤖 Memoryless Agent
    User->>AI: My name is Alice
    AI-->>User: Nice to meet you, Alice!
    User->>AI: What's my name?
    AI-->>User: Sorry, I don't know your name. 😅
```

Memory nodes inject conversation history into the prompt before each LLM call, making the model "appear" to remember.
## 1. Window Buffer Memory
Maintains a fixed-size sliding window of the most recent N messages.
```mermaid
graph TB
    subgraph "Window Buffer (size = 6)"
        subgraph "Turn 1"
            M1["👤 Hello"] --> M2["🤖 Hi!"]
        end
        subgraph "Turn 2"
            M3["👤 Weather?"] --> M4["🤖 Sunny 28°C"]
        end
        subgraph "Turn 3"
            M5["👤 Tomorrow?"] --> M6["🤖 Cloudy"]
        end
        subgraph "Turn 4 (window slides!)"
            M7["👤 Day after?"] --> M8["🤖 Rainy"]
        end
    end
    M1 -.->|"❌ Evicted"| Trash[🗑️]
    M2 -.->|"❌ Evicted"| Trash
    style Trash fill:#ef4444,stroke:#dc2626,color:#fff
```

### Window Size Guide
| Scenario | Window Size | Reason |
|---|---|---|
| Quick FAQ | 4-6 | 2-3 turns usually enough |
| Tech Support | 10-20 | Need problem context |
| Deep Consultation | 30-50 | Full conversation needed |
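The sliding-window behavior from the diagram can be sketched in a few lines. This is an illustrative class, not n8n's actual implementation — the class name and methods are assumptions:

```javascript
// Minimal sketch of a window buffer memory: keep only the last N messages.
// (Hypothetical class for illustration; not a real library API.)
class WindowBufferMemory {
  constructor(windowSize) {
    this.windowSize = windowSize;
    this.messages = [];
  }
  add(role, content) {
    this.messages.push({ role, content });
    // Evict the oldest messages once the window overflows
    if (this.messages.length > this.windowSize) {
      this.messages = this.messages.slice(-this.windowSize);
    }
  }
  load() {
    return this.messages;
  }
}

const memory = new WindowBufferMemory(6);
memory.add("user", "Hello");
memory.add("assistant", "Hi!");
memory.add("user", "Weather?");
memory.add("assistant", "Sunny 28°C");
memory.add("user", "Tomorrow?");
memory.add("assistant", "Cloudy");
memory.add("user", "Day after?");  // Turn 4: the window slides
memory.add("assistant", "Rainy");

// "Hello" and "Hi!" have been evicted, exactly as in the diagram
console.log(memory.load().length);     // → 6
console.log(memory.load()[0].content); // → "Weather?"
```

Note that eviction is all-or-nothing per message: anything outside the window is gone, which is why the window size guide above matters.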
## 2. Session Isolation

Memory is keyed by sessionId, so each user's history is stored and retrieved independently:
```mermaid
graph TB
    CT[💬 Chat Trigger]
    CT -->|"sessionId: alice"| Agent[🤖 AI Agent]
    CT -->|"sessionId: bob"| Agent
    Agent --> Memory[💾 Memory]
    Memory --> S1["📂 alice: [hello, weather...]"]
    Memory --> S2["📂 bob: [help me code...]"]
    style Memory fill:#22c55e,stroke:#16a34a,color:#fff
```

## 3. How Memory Injects Into LLM Calls
```javascript
// ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
// WITHOUT memory:
// ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
const bareMessages = [
  { role: "system", content: "You are an AI assistant" },
  { role: "user", content: "What about tomorrow?" },  // No context!
];

// ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
// WITH memory (history auto-injected):
// ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
const messagesWithMemory = [
  { role: "system", content: "You are an AI assistant" },
  { role: "user", content: "Weather in Beijing?" },   // History
  { role: "assistant", content: "Sunny, 28°C" },      // History
  { role: "user", content: "What about tomorrow?" },  // Current
];
// The model now knows "tomorrow" means Beijing weather
```
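Session isolation and history injection fit together naturally: a map from sessionId to its own message buffer, merged into the prompt before each call. The helper names below are hypothetical, not n8n's API:

```javascript
// Sketch: per-session memory store keyed by sessionId (illustrative only).
const sessions = new Map();

function remember(sessionId, role, content) {
  if (!sessions.has(sessionId)) sessions.set(sessionId, []);
  sessions.get(sessionId).push({ role, content });
}

// Build the message array for one LLM call:
// system prompt + that session's history + the current user message.
function buildMessages(sessionId, userInput) {
  const history = sessions.get(sessionId) ?? [];
  return [
    { role: "system", content: "You are an AI assistant" },
    ...history,
    { role: "user", content: userInput },
  ];
}

remember("alice", "user", "Weather in Beijing?");
remember("alice", "assistant", "Sunny, 28°C");
remember("bob", "user", "Help me code");

// Alice's context includes her own history; Bob's never leaks in.
const aliceCall = buildMessages("alice", "What about tomorrow?");
const bobCall = buildMessages("bob", "In Python, please");
console.log(aliceCall.length); // → 4 (system + 2 history + current)
console.log(bobCall.length);   // → 3 (system + 1 history + current)
```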
## Memory Type Comparison
| Type | Mechanism | Pros | Cons |
|---|---|---|---|
| Window Buffer | Keep last N messages | Simple, reliable | Loses everything outside window |
| Token Buffer | Truncate by token count | Precise context control | Slightly complex |
| Summary | LLM summarizes history | Retains key points longer | May lose details, extra API cost |
| Vector Store | Embed history into vector DB | True "long-term memory" | Requires vector DB setup |
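To contrast with the fixed-message window, a token buffer trims history by token budget rather than message count. The sketch below uses a crude word-count stand-in for a real tokenizer — production systems would use a proper one:

```javascript
// Sketch of token-buffer trimming (illustrative; a real system would use
// an actual tokenizer, not a whitespace word count).
const countTokens = (text) => text.split(/\s+/).length; // crude estimate

function trimToTokenBudget(messages, maxTokens) {
  let total = 0;
  const kept = [];
  // Walk from newest to oldest, keeping messages until the budget is spent
  for (let i = messages.length - 1; i >= 0; i--) {
    const cost = countTokens(messages[i].content);
    if (total + cost > maxTokens) break;
    total += cost;
    kept.unshift(messages[i]); // preserve chronological order
  }
  return kept;
}

const history = [
  { role: "user", content: "Tell me a very long story about dragons" },
  { role: "assistant", content: "Once upon a time there was a dragon" },
  { role: "user", content: "Shorter please" },
];
// With a 10-token budget, only the two newest messages fit
console.log(trimToTokenBudget(history, 10).length); // → 2
```

Unlike the window buffer, this can cut a conversation mid-turn, which is the "slightly complex" trade-off the table mentions.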
## Next Episode

In Ep 14, we equip the Agent with real external Tools — enabling it not just to "talk" but to "act": calculate, search, and call APIs.