Episode 26: Integrating the Langfuse / Opik Observability Platforms

Updated on 4/6/2026

## Why do you need external observability?

Dify's built-in dashboard is fine for basic monitoring, but enterprise deployments need more powerful tracing.

graph LR
    Dify[Dify] --> |Traces| LF[Langfuse]
    Dify --> |Traces| Opik[Opik by Comet]
    Dify --> |Traces| Arize[Arize Phoenix]
    
    LF --> Dashboard[Dashboards]
    LF --> Cost[Cost analysis]
    LF --> Eval[Quality evaluation]
    LF --> Dataset[Dataset management]

Integrating Langfuse

Step 1: Configure environment variables

# Add to .env
LANGFUSE_PUBLIC_KEY=pk-lf-xxxx
LANGFUSE_SECRET_KEY=sk-lf-xxxx
# Langfuse Cloud, or the URL of your self-hosted instance
LANGFUSE_HOST=https://cloud.langfuse.com
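Before starting Dify, it can help to sanity-check these values. The following is a minimal sketch, not part of Dify or Langfuse: `check_langfuse_env` is a hypothetical helper, and the `pk-lf-` / `sk-lf-` prefix check is an assumption based on the placeholder values above.

```python
import os
from urllib.parse import urlparse


def check_langfuse_env(env=None):
    """Hypothetical helper: return a list of problems found in the Langfuse
    settings. An empty list means the values look plausible."""
    env = os.environ if env is None else env
    problems = []

    # Key prefixes are assumed from the placeholder values in the .env snippet.
    expected_prefixes = {
        "LANGFUSE_PUBLIC_KEY": "pk-lf-",
        "LANGFUSE_SECRET_KEY": "sk-lf-",
    }
    for name, prefix in expected_prefixes.items():
        value = env.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif not value.startswith(prefix):
            problems.append(f"{name} should start with {prefix!r}")

    # The host must be a full URL including the scheme.
    parsed = urlparse(env.get("LANGFUSE_HOST", ""))
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        problems.append("LANGFUSE_HOST must be an http(s) URL")

    return problems


if __name__ == "__main__":
    for problem in check_langfuse_env():
        print("config problem:", problem)
```

A swapped public/secret key is the kind of mistake this catches early, before any trace silently fails to arrive.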

Step 2: Enable tracing in the Dify console

Go to Settings → Integrations → Tracing, select Langfuse, and enter the keys.

Step 3: View traces

Each user request produces a complete trace in Langfuse:

{
  "trace_id": "trace-xxx",
  "name": "chat-messages",
  "input": {"query": "How do I deploy to K8s?"},
  "output": {"answer": "You can use the Helm Chart..."},
  "metadata": {
    "dify_app_id": "app-xxx",
    "user_id": "user-001"
  },
  "spans": [
    {
      "name": "knowledge_retrieval",
      "duration_ms": 450,
      "input": {"query": "How do I deploy to K8s?"},
      "output": {"segments": 3}
    },
    {
      "name": "llm_generation",
      "model": "gpt-4o",
      "duration_ms": 2100,
      "usage": {
        "input_tokens": 1500,
        "output_tokens": 800,
        "total_cost_usd": 0.0345
      }
    }
  ]
}
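To make the structure concrete, here is a minimal sketch that aggregates latency, token usage, and cost from a trace shaped like the example above. The field names (`spans`, `duration_ms`, `usage`, `total_cost_usd`) are taken from that example and are not guaranteed to match Langfuse's actual export schema; `summarize_trace` is a hypothetical helper.

```python
import json

# A trimmed copy of the example trace; only the fields used below are kept.
TRACE_JSON = """
{
  "trace_id": "trace-xxx",
  "name": "chat-messages",
  "spans": [
    {"name": "knowledge_retrieval", "duration_ms": 450},
    {"name": "llm_generation", "model": "gpt-4o", "duration_ms": 2100,
     "usage": {"input_tokens": 1500, "output_tokens": 800,
               "total_cost_usd": 0.0345}}
  ]
}
"""


def summarize_trace(trace):
    """Aggregate duration, token usage, and cost across the spans of a trace.

    Durations are simply added up, which assumes the spans ran one after
    another rather than in parallel.
    """
    spans = trace.get("spans", [])
    summary = {
        "duration_ms": 0,
        "input_tokens": 0,
        "output_tokens": 0,
        "cost_usd": 0.0,
        "slowest_span": None,
    }
    for span in spans:
        summary["duration_ms"] += span.get("duration_ms", 0)
        usage = span.get("usage", {})  # only LLM spans carry usage data
        summary["input_tokens"] += usage.get("input_tokens", 0)
        summary["output_tokens"] += usage.get("output_tokens", 0)
        summary["cost_usd"] += usage.get("total_cost_usd", 0.0)
    if spans:
        slowest = max(spans, key=lambda s: s.get("duration_ms", 0))
        summary["slowest_span"] = slowest["name"]
    return summary


summary = summarize_trace(json.loads(TRACE_JSON))
print(summary)
```

In this example the LLM call dominates both latency and cost, which is typical and suggests where optimization effort should go first.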

Cost tracking

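# Analyze costs with the Langfuse API
# (Python SDK v3; on v2, use lf.fetch_traces / lf.fetch_observations instead)
from collections import defaultdict
from datetime import datetime, timezone

from langfuse import Langfuse

# Reads LANGFUSE_PUBLIC_KEY, LANGFUSE_SECRET_KEY, and LANGFUSE_HOST from the environment
lf = Langfuse()

# Start of the reporting window (about the past 7 days); the API expects a datetime, not a string
since = datetime(2026, 3, 29, tzinfo=timezone.utc)

# Fetch one page of traces; pass page=2, 3, ... to page through larger result sets
traces = lf.api.trace.list(limit=100, from_timestamp=since).data

total_cost = sum(t.total_cost or 0 for t in traces)
# Guard against an empty result to avoid a ZeroDivisionError
avg_cost = total_cost / len(traces) if traces else 0.0

print(f"Total cost: ${total_cost:.4f}")
print(f"Average per request: ${avg_cost:.6f}")
print(f"Total requests: {len(traces)}")

# Group by model. Traces in the list response only carry observation IDs,
# so fetch the generation observations themselves.
observations = lf.api.observations.get_many(
    type="GENERATION", from_start_time=since, limit=100
).data

model_costs = defaultdict(float)
for obs in observations:
    if obs.model:
        model_costs[obs.model] += obs.calculated_total_cost or 0

for model, cost in sorted(model_costs.items(), key=lambda x: -x[1]):
    print(f"  {model}: ${cost:.4f}")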

Integrating Opik (Comet)

# .env configuration
OPIK_API_KEY=your-opik-key
OPIK_WORKSPACE=your-workspace
OPIK_PROJECT_NAME=dify-production

Observability best practices

graph TB
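    Practice[Best practices] --> P1[A separate project per app]
    Practice --> P2[Set cost alert thresholds]
    Practice --> P3[Review low-scoring traces regularly]
    Practice --> P4[Build evaluation sets from trace data]

    P3 --> Fix[Find issues → refine prompts]
    P4 --> Eval[Regression tests → prevent regressions]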
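Three of these practices (cost alerts, low-score review, and evaluation sets) can be sketched in a few lines. This is an illustration only: `TraceRecord` is a hypothetical, simplified record rather than a Langfuse or Opik type, the `score` scale of 0 to 1 is an assumption, and the sample data is made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TraceRecord:
    """Hypothetical, simplified view of one trace."""
    trace_id: str
    query: str
    answer: str
    cost_usd: float
    score: Optional[float] = None  # assumed quality score in [0, 1], if evaluated


def cost_alert(traces, threshold_usd):
    """Return (total_cost, exceeded) for a batch of traces."""
    total = sum(t.cost_usd for t in traces)
    return total, total > threshold_usd


def low_score_traces(traces, min_score=0.5):
    """Select scored traces below min_score, worst first, for manual review."""
    flagged = [t for t in traces if t.score is not None and t.score < min_score]
    return sorted(flagged, key=lambda t: t.score)


def to_eval_set(traces):
    """Turn traces into regression test cases.

    The recorded answer is stored as the observed output; for a low-scoring
    trace a reviewer still has to supply the expected answer.
    """
    return [
        {"input": t.query, "observed_output": t.answer, "source_trace": t.trace_id}
        for t in traces
    ]


# Made-up sample data for illustration.
traces = [
    TraceRecord("trace-1", "How do I deploy to K8s?", "You can use the Helm Chart...", 0.0345, 0.9),
    TraceRecord("trace-2", "How do I rotate API keys?", "I don't know.", 0.0120, 0.2),
    TraceRecord("trace-3", "What is Dify?", "Dify is an LLM app platform.", 0.0080),
]

total, exceeded = cost_alert(traces, threshold_usd=0.05)
flagged = low_score_traces(traces)
eval_set = to_eval_set(flagged)
print(f"total=${total:.4f} exceeded={exceeded} flagged={[t.trace_id for t in flagged]}")
```

Unscored traces are deliberately left out of the review queue, so a missing evaluation is never mistaken for a bad one.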