Every LLM call governed. Zero code changes.
A 5-step safety pipeline between your agents and their LLM providers. Rate limiting, cost controls, PII scanning, model allowlists, and human-in-the-loop approvals — applied to every request, automatically.

Swap one base URL. That’s it.
Point your existing OpenAI, Anthropic, or Google SDK at our gateway. Every call flows through the safety pipeline before reaching the provider.
# Before
OPENAI_BASE_URL=https://api.openai.com/v1

# After
OPENAI_BASE_URL=https://api.curate-me.ai/v1/openai
X-CM-API-Key: cm_sk_xxx

Works with any SDK that supports a custom base URL — OpenAI Python, LangChain, LlamaIndex, Vercel AI SDK, and more.
The 5-step safety pipeline.
Every request passes through five safety checks before reaching the LLM provider. Each step can short-circuit and deny the request immediately.
If a request fails rate limiting, it never reaches cost estimation. This keeps latency low and avoids unnecessary work.
Rate Limit
Per-org, per-key request throttling.
Cost Estimate
Checks estimated cost against per-request and daily budgets.
PII Scan
Regex scan for secrets and PII in request content.
Model Allowlist
Enforce allowed models per organization.
HITL Gate
Flag high-cost or sensitive operations for human approval.
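Conceptually, the five steps above form a short-circuiting chain. Here is a sketch (hypothetical names, fields, and thresholds — not the gateway's actual implementation):

```python
import re

# Hypothetical sketch of the short-circuiting safety pipeline.
# Each check returns None to pass, or a denial reason that stops the chain.

def rate_limit(req):
    return "rate_limited" if req["rpm_used"] >= req["rpm_limit"] else None

def cost_estimate(req):
    return "over_budget" if req["est_cost"] > req["daily_budget"] else None

def pii_scan(req):
    # One illustrative pattern; a real scanner uses many.
    return "pii_detected" if re.search(r"\bsk-[A-Za-z0-9]{20,}\b", req["content"]) else None

def model_allowlist(req):
    return None if req["model"] in req["allowed_models"] else "model_not_allowed"

def hitl_gate(req):
    return "needs_approval" if req["est_cost"] > req["hitl_threshold"] else None

PIPELINE = [rate_limit, cost_estimate, pii_scan, model_allowlist, hitl_gate]

def run_pipeline(req):
    for step in PIPELINE:
        denial = step(req)
        if denial:  # short-circuit: later steps never run
            return {"allowed": False, "reason": denial}
    return {"allowed": True}
```

A request that fails rate limiting returns immediately with `rate_limited`; cost estimation and every later step are never evaluated.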
One gateway, any provider.
Route to OpenAI, Anthropic, Google, Groq, Mistral, and more — all through the same safety pipeline. Bring your own provider keys.
OpenAI
GPT-4o, GPT-4 Turbo, o1, o3
Anthropic
Claude Opus 4, Sonnet 4, Haiku
Google
Gemini 2.5 Pro, Flash, Ultra
DeepSeek
DeepSeek R1, V3, Coder
from openai import OpenAI
# Just change the base URL — nothing else
client = OpenAI(
    base_url="https://api.curate-me.ai/v1/openai",
    default_headers={"X-CM-API-Key": "cm_sk_..."},
)
# Every call now goes through the safety pipeline
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

Everything you need to govern LLM calls.
Cost tracking, rate limiting, PII scanning, and more. All from one dashboard.
Real-Time Cost Dashboard
Per-model, per-org cost breakdowns updated in real time. Redis accumulator for speed, MongoDB audit log for compliance.
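The accumulator pattern can be sketched like this (an in-memory stand-in for Redis so the example is self-contained; real code would call redis-py's `incrbyfloat` with the same shape, and the key layout here is hypothetical):

```python
class InMemoryCounter:
    """Stand-in for a Redis client so the sketch runs without a server."""
    def __init__(self):
        self.store = {}

    def incrbyfloat(self, key, amount):
        # Mirrors Redis INCRBYFLOAT: atomic add, returns the new total.
        self.store[key] = self.store.get(key, 0.0) + amount
        return self.store[key]

def record_cost(client, org_id, model, cost_usd):
    # Fast running totals per org and per model (key names hypothetical);
    # a separate durable audit log would record the full request for compliance.
    client.incrbyfloat(f"cost:{org_id}:total", cost_usd)
    client.incrbyfloat(f"cost:{org_id}:model:{model}", cost_usd)
```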
Rate Limiting
Per-org and per-key throttling with configurable RPM limits. Automatic 429 responses with retry-after headers.
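On the client side, a throttled request is an ordinary HTTP 429. A minimal retry loop that honors `Retry-After` might look like this (sketch; responses are modeled as plain dicts rather than a specific HTTP client's objects):

```python
import time

def call_with_retry(send_request, max_attempts=3):
    """Retry on HTTP 429, honoring the Retry-After header (in seconds)."""
    for attempt in range(max_attempts):
        resp = send_request()
        if resp["status"] != 429:
            return resp
        # Fall back to exponential backoff if no Retry-After header is present.
        delay = float(resp.get("headers", {}).get("Retry-After", 2 ** attempt))
        time.sleep(delay)
    return resp
```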
PII Scanning
Regex-based detection for API keys, credit card numbers, SSNs, and other sensitive data. Deny or redact before it hits the provider.
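A stripped-down version of this kind of regex scanning and redaction (illustrative patterns only; production coverage needs many more rules):

```python
import re

# Illustrative patterns; a production scanner would use far more.
PII_PATTERNS = {
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def scan_pii(text):
    """Return the names of all PII patterns found in the text."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]

def redact(text):
    """Replace each match with a [REDACTED:<name>] placeholder."""
    for name, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text
```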
Model Allowlists
Control which models each organization can access. Wildcard support. Prevent accidental usage of expensive models.
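Shell-style wildcards map directly onto Python's `fnmatch`; a minimal sketch of the matching logic:

```python
from fnmatch import fnmatch

def is_model_allowed(model, allowlist):
    """True if the model name matches any shell-style wildcard pattern."""
    return any(fnmatch(model, pattern) for pattern in allowlist)
```

An allowlist like `["gpt-4o*", "claude-*-haiku*"]` would admit `gpt-4o-mini` but reject `o1`.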
HITL Approvals
High-cost or sensitive requests are held for human review. Approve or deny from the dashboard with full request context.
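The hold-and-resolve flow can be sketched as a small pending queue (hypothetical structure; a real gateway would persist held requests durably):

```python
import uuid

PENDING = {}  # approval queue; in production this would be a durable store

def hold_for_approval(request, reason):
    """Park a request for human review and return its approval id."""
    approval_id = str(uuid.uuid4())
    PENDING[approval_id] = {"request": request, "reason": reason, "status": "pending"}
    return approval_id

def resolve(approval_id, approved):
    """A reviewer approves or denies the held request from the dashboard."""
    PENDING[approval_id]["status"] = "approved" if approved else "denied"
    return PENDING[approval_id]["status"]
```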
Full Observability
Every request logged with latency, token count, cost, and governance decisions. Time-travel debugging for any call.
Govern every LLM call.
Start in five minutes.
Swap one base URL. Get cost control, PII scanning, rate limiting, and human-in-the-loop approvals — instantly.