Ep 19: Retrieval Refined — Hybrid Search, Re-Ranking & Multi-Query
Three Pain Points of Basic RAG
graph TB
P1["😤 Semantic Drift
User says 'return', docs say 'refund'"]
P2["😤 Noisy Results
Top-K has irrelevant chunks"]
P3["😤 Single Angle
Complex query needs multi-faceted search"]
P1 --> S1["✅ Hybrid Search"]
P2 --> S2["✅ Re-Ranking"]
P3 --> S3["✅ Multi-Query"]
style S1 fill:#22c55e,stroke:#16a34a,color:#fff
style S2 fill:#22c55e,stroke:#16a34a,color:#fff
style S3 fill:#22c55e,stroke:#16a34a,color:#fff

1. Hybrid Search
Combines semantic vector search with keyword search (BM25) and takes the union of both result sets.
graph TB
Query["❓ 'How to return items?'"]
Query --> Semantic["🧮 Vector Search
'return items' ≈ 'refund process'"]
Query --> Keyword["🔤 BM25 Search
Exact match: 'return items'"]
Semantic --> Merge["🔗 Merge + Dedup"]
Keyword --> Merge
Merge --> Result["📑 Comprehensive results"]
style Merge fill:#f59e0b,stroke:#d97706,color:#fff

// Qdrant native Hybrid Search with weight tuning:
// vector_weight: 0.7 // Semantic: 70%
// keyword_weight: 0.3 // BM25: 30%
// Tech docs → higher keyword weight (exact terms matter)
// General CS → higher semantic weight (varied user phrasing)
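The merge step can be sketched as weighted score fusion. This is an illustrative helper, not Qdrant's native API: each result list is normalized to [0, 1] first, because cosine similarities and BM25 scores live on different scales.

```javascript
// Weighted score fusion sketch for hybrid search.
// vectorHits / keywordHits: arrays of { id, score } from each retriever.
function hybridMerge(vectorHits, keywordHits, vectorWeight = 0.7, keywordWeight = 0.3) {
  // Normalize each list to [0, 1] so the two score scales are comparable.
  const normalize = (hits) => {
    const max = Math.max(...hits.map((h) => h.score), 1e-9);
    return hits.map((h) => ({ id: h.id, score: h.score / max }));
  };
  const fused = new Map();
  for (const h of normalize(vectorHits)) {
    fused.set(h.id, (fused.get(h.id) ?? 0) + vectorWeight * h.score);
  }
  for (const h of normalize(keywordHits)) {
    fused.set(h.id, (fused.get(h.id) ?? 0) + keywordWeight * h.score);
  }
  // Chunks found by BOTH retrievers accumulate both weights and rise to the top.
  return [...fused.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

A chunk that matches both semantically and by exact keyword gets credit from both lists, which is exactly the behavior you want for queries like "return items".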
2. Re-Ranking
Fetch a coarse Top-20 from the vector store, then use a dedicated re-ranking model to refine it down to Top-5.
// ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
// Cohere Re-Rank API in n8n Code Node
// ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
// Inputs from the previous node (adjust field names to your workflow)
const query = $json.query;    // the user's question
const chunks = $json.chunks;  // array of candidate chunk strings (the coarse Top-20)

const response = await fetch('https://api.cohere.ai/v1/rerank', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${$env.COHERE_API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    model: 'rerank-v3.5',
    query: query,
    documents: chunks,
    top_n: 5 // Keep only top 5
  })
});
const reranked = await response.json();
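Cohere's response references the submitted documents by position, so the final step is mapping results back to the chunk texts. A small sketch (field names follow the `v1/rerank` response shape: `results[].index` and `results[].relevance_score`):

```javascript
// Map a re-rank response back onto the originally submitted chunks.
function pickTopChunks(rerankResponse, chunks) {
  return rerankResponse.results.map((r) => ({
    text: chunks[r.index],     // index points into the submitted documents array
    score: r.relevance_score   // higher = more relevant to the query
  }));
}
```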
3. Multi-Query
An LLM rewrites the original question into 3-5 sub-queries, each approaching the problem from a different angle.
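The rewrite-search-merge loop can be sketched as follows. `callLLM` and `searchQdrant` are hypothetical stand-ins for your actual LLM and Qdrant nodes, and the prompt wording is only an example:

```javascript
// Multi-Query sketch: rewrite, fan out, then union + dedup by chunk id.
const REWRITE_PROMPT = `Rewrite the user question into 3 short search queries,
each targeting a different aspect. Return one query per line.
Question: {question}`;

async function multiQuerySearch(question, callLLM, searchQdrant) {
  const raw = await callLLM(REWRITE_PROMPT.replace('{question}', question));
  const subQueries = raw.split('\n').map((q) => q.trim()).filter(Boolean);
  // Run all sub-queries in parallel against the vector store.
  const resultLists = await Promise.all(subQueries.map((q) => searchQdrant(q)));
  // Dedup: keep the best score seen for each chunk across all sub-queries.
  const best = new Map();
  for (const hit of resultLists.flat()) {
    const prev = best.get(hit.id);
    if (!prev || hit.score > prev.score) best.set(hit.id, hit);
  }
  return [...best.values()].sort((a, b) => b.score - a.score);
}
```

Deduplication matters here: sub-queries about the same incident often surface the same chunks, and without it the context window fills with repeats.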
graph TB
Original["❓ 'n8n Webhook not working on AWS'"]
Original --> LLM["🧠 Rewrite into 3 sub-queries"]
LLM --> Q1["🔍 'n8n webhook configuration'"]
LLM --> Q2["🔍 'AWS security group port 5678'"]
LLM --> Q3["🔍 'WEBHOOK_URL environment variable'"]
Q1 & Q2 & Q3 --> VDB["💾 Qdrant"] --> Merge["🔗 Union + Dedup"]
style LLM fill:#8b5cf6,stroke:#7c3aed,color:#fff

Optimization Comparison
| Strategy | Complexity | Improvement | Extra Cost | Best For |
|---|---|---|---|---|
| Hybrid | ⭐⭐ | Recall +30% | Minimal | Technical docs |
| Re-Rank | ⭐⭐⭐ | Precision +40% | Cohere API | High-quality needs |
| Multi-Query | ⭐⭐ | Coverage +50% | Extra LLM call | Complex questions |
Next Episode
Ep 20 builds an enterprise RAG management system with auto-incremental updates, stale document cleanup, and retrieval quality monitoring.