Enhance OpenAI App Resilience: Add LLM Fallback in 10 Minutes with an API Gateway

Running a production application on OpenAI presents a critical vulnerability: a single point of failure. If OpenAI's services experience an outage (e.g., returning 500s or 429s), your application becomes unavailable to users, often without immediate recourse or visibility into the failure.

The standard approach involves direct calls to the OpenAI API:

from openai import OpenAI

client = OpenAI(api_key="sk-...")  # single provider, single point of failure

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarise this text..."}],
)

With this setup, any OpenAI issue directly impacts your service: there is no fallback mechanism, and no way to dynamically route to an alternative, potentially cheaper provider when GPT-4-level quality isn't strictly necessary.
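
For comparison, here is a minimal sketch of what hand-rolled client-side fallback looks like; the second provider, its base_url, and its model name are placeholders, not real endpoints:

from openai import OpenAI, APIConnectionError, APIStatusError

primary = OpenAI(api_key="sk-...")

# Hypothetical second, OpenAI-compatible provider (placeholder URL and key).
fallback = OpenAI(api_key="other-key", base_url="https://api.other-provider.example/v1")

def chat_with_fallback(messages):
    try:
        return primary.chat.completions.create(model="gpt-4o-mini", messages=messages)
    except (APIConnectionError, APIStatusError):
        # Covers network failures and non-2xx responses such as 500s and 429s.
        return fallback.chat.completions.create(model="placeholder-model", messages=messages)

Every error class, retry budget, and provider quirk becomes code you own and maintain; a gateway moves that logic out of the application.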

The Solution: An API Gateway

InferBridge offers an OpenAI-compatible API gateway designed to address these challenges. By pointing your OpenAI SDK to InferBridge instead of OpenAI directly, the gateway manages intelligent routing, automatic fallback, and per-request observability—all without requiring modifications to your core application logic.

Implementation Steps

Step 1: Obtain an InferBridge Key (One-time setup)

Registering an account with InferBridge returns a unique API key, which you use to authenticate with the gateway.

curl -X POST https://api.inferbridge.dev/v1/users \
  -H 'Content-Type: application/json' \
  -d '{"email":"[email protected]"}'

# Example response: {"api_key": "ib_xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx", ...}
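
If you prefer to script this step, the equivalent call in Python with the requests library (the api_key field matches the example response above) is:

import requests

# One-time registration; returns the InferBridge key shown above.
resp = requests.post(
    "https://api.inferbridge.dev/v1/users",
    json={"email": "[email protected]"},
    timeout=30,
)
ib_key = resp.json()["api_key"]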

Save this key securely.

Step 2: Register Your Existing OpenAI Key

Next, register your existing OpenAI API key with InferBridge so the gateway can make authenticated requests to OpenAI on your behalf.

curl -X POST https://api.inferbridge.dev/v1/keys \
  -H 'Authorization: Bearer ib_xxx...' \
  -H 'Content-Type: application/json' \
  -d '{"provider":"openai","api_key":"sk-..."}'

InferBridge encrypts your key at rest using Fernet encryption, and it states that it does not log request content or mark up inference: requests go directly to the provider.
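
For context, Fernet (from Python's cryptography library) provides symmetric, authenticated encryption. The sketch below illustrates at-rest key encryption in general; it is not InferBridge's actual storage code:

from cryptography.fernet import Fernet

master_key = Fernet.generate_key()   # in practice, loaded from a secret manager
f = Fernet(master_key)

token = f.encrypt(b"sk-...")         # ciphertext is what gets stored
plaintext = f.decrypt(token)         # decrypted only when proxying a request
assert plaintext == b"sk-..."

Because Fernet tokens are authenticated, tampering with the stored ciphertext causes decryption to fail rather than yield a corrupted key.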

Step 3: Modify Your Application Code (Two Lines)

Integrate InferBridge into your application by making two minor adjustments to your OpenAI client initialization:

from openai import OpenAI

client = OpenAI(
    api_key="ib_xxx...",                          # Use InferBridge API key
    base_url="https://api.inferbridge.dev/v1",    # Point to InferBridge gateway
)

resp = client.chat.completions.create(
    model="ib/balanced",                          # Use an InferBridge routing tier
    messages=[{"role": "user", "content": "Summarise this text..."}],
)

By updating the api_key to your InferBridge key and setting the base_url, your application now routes requests through InferBridge. Passing an InferBridge routing tier (e.g., "ib/balanced") as the model then enables the gateway's fallback and provider-selection logic.
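
Because the gateway is OpenAI-compatible, the rest of the SDK surface should work unchanged. For example, streaming (assuming InferBridge proxies OpenAI's streaming protocol, which its compatibility claim implies):

stream = client.chat.completions.create(
    model="ib/balanced",
    messages=[{"role": "user", "content": "Summarise this text..."}],
    stream=True,
)
for chunk in stream:
    # Each chunk carries an incremental delta; content can be None on some chunks.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)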

InferBridge Routing Tiers Explained

InferBridge employs explicit routing tiers to manage LLM requests, offering different chains of providers based on specific requirements (a short per-request usage sketch follows the list):

  • ib/cheap: Routes through Groq → DeepSeek → Together → Sarvam → OpenAI. Ideal for high-volume, cost-sensitive applications where some flexibility on quality is acceptable.
  • ib/balanced: Routes through OpenAI → Sarvam → Anthropic. This is the default recommendation for most production applications, balancing cost and performance.
  • ib/premium: Routes through Anthropic → OpenAI. Designed for complex tasks requiring the highest quality and robustness.
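
Because a tier is just a model string, you can choose one per request. A minimal sketch, with a task-classification heuristic that is our own illustration rather than anything InferBridge prescribes:

def pick_tier(task: str) -> str:
    # Illustrative heuristic: bulk work goes cheap, hard work goes premium.
    if task == "bulk_summarise":
        return "ib/cheap"
    if task == "complex_reasoning":
        return "ib/premium"
    return "ib/balanced"

resp = client.chat.completions.create(
    model=pick_tier("bulk_summarise"),
    messages=[{"role": "user", "content": "Summarise this text..."}],
)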

By abstracting provider management behind a smart API gateway, this setup makes OpenAI-dependent applications markedly more resilient and flexible.
