Ep 18: Knowledge as a Weapon — Vector Store Tool & RAG Agent in Action
From Indexing to Retrieval
Ep 17 covered "injecting knowledge" (indexing documents into the vector store). This episode covers "extracting knowledge" (retrieving them at question time).
1. RAG Agent Workflow
graph TB
CT[💬 Chat Trigger] --> Agent[🤖 AI Agent]
subgraph "Agent Sub-nodes"
Agent --> Model[🧠 GPT-4o]
Agent --> Mem[💾 Memory]
Agent --> VST[🔍 Vector Store Tool → Qdrant]
end
style Agent fill:#ff6d5b,stroke:#e55a4e,color:#fff
style VST fill:#22c55e,stroke:#16a34a,color:#fff
2. Vector Store Tool Config
// Tool Name: "search_knowledge_base"
// Description: "Search product docs for features, pricing, tutorials,
// troubleshooting. Input keywords or full question. Returns relevant chunks.
// Do NOT use for general chat."
// Vector Store: Qdrant, Collection: "knowledge-base"
// Top K: 4, Score Threshold: 0.7
// Embedding: text-embedding-3-small ← MUST match indexing model!
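To make the Top K and Score Threshold settings concrete, here is a minimal sketch of what a similarity search with those two knobs does. It is a pure-Python illustration under stated assumptions, not Qdrant's or n8n's actual implementation: the `search` and `cosine` functions, the 3-dimensional toy vectors, and the sample texts are all hypothetical. Real embeddings from text-embedding-3-small have far more dimensions.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def search(query_vec, points, top_k=4, score_threshold=0.7):
    """Score every point, drop those below the threshold, return the best top_k."""
    scored = [(cosine(query_vec, vec), text) for text, vec in points]
    kept = [(score, text) for score, text in scored if score >= score_threshold]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]

# Toy data (hypothetical): (chunk text, embedding vector)
points = [
    ("We accept credit cards and PayPal.", [0.9, 0.1, 0.0]),
    ("Refunds are processed within 5 days.", [0.7, 0.7, 0.0]),
    ("Our office dog is named Biscuit.", [0.0, 0.1, 0.9]),
]
query = [1.0, 0.0, 0.0]

hits = search(query, points)                         # Top K 4, threshold 0.7
strict = search(query, points, score_threshold=0.8)  # stricter threshold

for score, text in hits:
    print(f"{score:.2f}  {text}")
```

With the default threshold the refund chunk barely passes; at 0.8 only the payment chunk survives. The comparison is only meaningful when query and stored vectors come from the same embedding model, which is why the config insists on matching the indexing model.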
3. Full Conversation Sequence
sequenceDiagram
participant User as 👤 User
participant Agent as 🤖 AI Agent
participant LLM as 🧠 GPT-4o
participant VST as 🔍 Vector Store Tool
participant QD as 💾 Qdrant
User->>Agent: "What payment methods are supported?"
Agent->>LLM: Analyze intent + tools
LLM-->>Agent: Tool Call: search_knowledge_base("payment methods")
Agent->>VST: Execute search
VST->>QD: Vector similarity search (Top-4)
QD-->>VST: 4 relevant chunks (scores 0.92, 0.88, 0.81, 0.73)
VST-->>Agent: Return chunks
Agent->>LLM: User question + 4 document chunks
LLM-->>Agent: Grounded answer using real documentation
Agent-->>User: Accurate, cited answer ✅
4. RAG Quality Tips
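As a baseline before any tuning, the conversation sequence in section 3 can be sketched as a single agent turn: the model picks the tool, the tool returns scored chunks, and the chunks are folded into a grounded prompt for the final answer. Everything here is a hypothetical stand-in. `plan_tool_call` fakes the LLM's tool selection, `search_knowledge_base` returns placeholder chunks with the scores from the diagram, and the prompt wording is an assumption rather than what n8n sends.

```python
def plan_tool_call(question: str) -> dict:
    """Stand-in for the LLM step that decides which tool to call (hypothetical)."""
    return {"tool": "search_knowledge_base", "query": question}

def search_knowledge_base(query: str) -> list:
    """Stand-in for Vector Store Tool -> Qdrant; returns (score, chunk) pairs."""
    return [
        (0.92, "Placeholder chunk A about payment methods."),
        (0.88, "Placeholder chunk B about payment methods."),
        (0.81, "Placeholder chunk C about billing."),
        (0.73, "Placeholder chunk D about invoices."),
    ]

def build_grounded_prompt(question: str, chunks: list) -> str:
    """Number each chunk so the model can cite it, then append the question."""
    context = "\n".join(
        f"[{i}] (score {score:.2f}) {text}"
        for i, (score, text) in enumerate(chunks, start=1)
    )
    return (
        "Answer using only the context below. Cite chunks by their [number].\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )

def run_turn(question: str) -> str:
    """One agent turn: plan, run the tool if requested, build the final prompt."""
    plan = plan_tool_call(question)
    if plan["tool"] == "search_knowledge_base":
        chunks = search_knowledge_base(plan["query"])
        return build_grounded_prompt(question, chunks)
    return question  # no tool needed: pass the question through unchanged

prompt = run_turn("What payment methods are supported?")
print(prompt)
```

In a real agent the tool call comes back from the model rather than being hard-coded, and the prompt built here would be sent to the model for the final answer.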
| Optimization | Technique | Effect |
|---|---|---|
| Precision | Raise Score Threshold (0.7→0.8) | Filter low-quality matches |
| Recall | Increase Top-K (4→8) | More candidates for LLM |
| Chunking | Reduce Chunk Size (800→500) | Finer semantic units |
| Filtering | Metadata filters on category | Narrow search scope |
| Hybrid | Vector + keyword search | Dual matching |
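The Filtering and Hybrid rows can be illustrated with a small sketch: narrow the candidates by a metadata field first, then rank by a blend of vector score and keyword overlap. This is a toy under stated assumptions. The `hybrid_search` and `keyword_score` helpers, the `alpha` weight, the `category` values, and the precomputed `vector_score` numbers are all hypothetical. Production systems typically use a proper keyword ranker and a fusion method rather than this simple weighted sum.

```python
def keyword_score(query: str, text: str) -> float:
    """Fraction of query words that also appear in the text (case-insensitive)."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0

def hybrid_search(query, docs, category=None, alpha=0.7, top_k=4):
    """Filter by category, then rank by alpha * vector + (1 - alpha) * keyword."""
    candidates = [d for d in docs if category is None or d["category"] == category]

    def blended(doc):
        return alpha * doc["vector_score"] + (1 - alpha) * keyword_score(query, doc["text"])

    return sorted(candidates, key=blended, reverse=True)[:top_k]

# Toy documents; vector_score pretends to be the similarity from the vector store.
docs = [
    {"text": "Pricing plans start at the Basic tier",
     "category": "pricing", "vector_score": 0.80},
    {"text": "Reset your password from the login page",
     "category": "troubleshooting", "vector_score": 0.82},
    {"text": "Enterprise pricing includes SSO",
     "category": "pricing", "vector_score": 0.78},
]

results = hybrid_search("enterprise pricing", docs, category="pricing")
for doc in results:
    print(doc["text"])
```

The troubleshooting chunk has the highest vector score but never reaches the ranking step because the metadata filter removes it. Among the pricing chunks, the exact keyword match lifts the Enterprise chunk above the one with the slightly higher vector score.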
Next Episode
Ep 19 covers advanced RAG techniques: Hybrid Search, Re-Ranking, and Multi-Query retrieval.