Every LLM call governed. Zero code changes.
A 5-step safety pipeline between your agents and their LLM providers. Rate limiting, cost controls, PII scanning, model allowlists, and human-in-the-loop approvals — applied to every request, automatically.

Swap one base URL. That’s it.
Point your existing OpenAI, Anthropic, or Google SDK at our gateway. Every call flows through the safety pipeline before reaching the provider.
# Before
OPENAI_BASE_URL=https://api.openai.com/v1

# After
OPENAI_BASE_URL=https://api.curate-me.ai/v1/openai
X-CM-API-Key: cm_sk_xxx

Works with any SDK that supports a custom base URL — OpenAI Python, LangChain, LlamaIndex, Vercel AI SDK, and more.
The 5-step safety pipeline.
Every request passes through five safety checks before reaching the LLM provider. Each step can short-circuit and deny the request immediately.
If a request fails rate limiting, it never reaches cost estimation. This keeps latency low and avoids unnecessary work.
Rate Limit
Per-org, per-key request throttling.
Cost Estimate
Checks estimated cost against per-request and daily budgets.
PII Scan
Regex scan for secrets and PII in request content.
Model Allowlist
Enforce allowed models per organization.
HITL Gate
Flag high-cost or sensitive operations for human approval.
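Conceptually, the five steps above form a short-circuiting chain. Here is a sketch (hypothetical names, fields, and thresholds — not the gateway's actual implementation):

```python
import re

# Hypothetical sketch of the short-circuiting safety pipeline.
# Each check returns None to pass, or a denial reason that stops the chain.

def rate_limit(req):
    return "rate_limited" if req["rpm_used"] >= req["rpm_limit"] else None

def cost_estimate(req):
    return "over_budget" if req["est_cost"] > req["daily_budget"] else None

def pii_scan(req):
    # One illustrative pattern; a real scanner uses many.
    return "pii_detected" if re.search(r"\bsk-[A-Za-z0-9]{20,}\b", req["content"]) else None

def model_allowlist(req):
    return None if req["model"] in req["allowed_models"] else "model_not_allowed"

def hitl_gate(req):
    return "needs_approval" if req["est_cost"] > req["hitl_threshold"] else None

PIPELINE = [rate_limit, cost_estimate, pii_scan, model_allowlist, hitl_gate]

def run_pipeline(req):
    for step in PIPELINE:
        denial = step(req)
        if denial:  # short-circuit: later steps never run
            return {"allowed": False, "reason": denial}
    return {"allowed": True}
```

A request that fails rate limiting returns immediately with `rate_limited`; cost estimation and every later step are never evaluated.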
One gateway, any provider.
Route to OpenAI, Anthropic, Google, Groq, Mistral, and more — all through the same safety pipeline. Bring your own provider keys.
OpenAI
GPT-4o, GPT-4 Turbo, o1, o3
Anthropic
Claude Opus 4, Sonnet 4, Haiku
Google
Gemini 2.5 Pro, Flash, Ultra
DeepSeek
DeepSeek R1, V3, Coder
from openai import OpenAI
# Just change the base URL — nothing else
client = OpenAI(
    base_url="https://api.curate-me.ai/v1/openai",
    default_headers={"X-CM-API-Key": "cm_sk_..."},
)
# Every call now goes through the safety pipeline
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

Everything you need to govern LLM calls.
Cost tracking, rate limiting, PII scanning, and more. All from one dashboard.
Real-Time Cost Dashboard
Per-model, per-org cost breakdowns updated in real time. Redis accumulator for speed, MongoDB audit log for compliance.
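The accumulator pattern can be sketched like this (an in-memory stand-in for Redis so the example is self-contained; real code would call redis-py's `incrbyfloat` with the same shape, and the key layout here is hypothetical):

```python
class InMemoryCounter:
    """Stand-in for a Redis client so the sketch runs without a server."""
    def __init__(self):
        self.store = {}

    def incrbyfloat(self, key, amount):
        # Mirrors Redis INCRBYFLOAT: atomic add, returns the new total.
        self.store[key] = self.store.get(key, 0.0) + amount
        return self.store[key]

def record_cost(client, org_id, model, cost_usd):
    # Fast running totals per org and per model (key names hypothetical);
    # a separate durable audit log would record the full request for compliance.
    client.incrbyfloat(f"cost:{org_id}:total", cost_usd)
    client.incrbyfloat(f"cost:{org_id}:model:{model}", cost_usd)
```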
Rate Limiting
Per-org and per-key throttling with configurable RPM limits. Automatic 429 responses with retry-after headers.
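On the client side, a throttled request is an ordinary HTTP 429. A minimal retry loop that honors `Retry-After` might look like this (sketch; responses are modeled as plain dicts rather than a specific HTTP client's objects):

```python
import time

def call_with_retry(send_request, max_attempts=3):
    """Retry on HTTP 429, honoring the Retry-After header (in seconds)."""
    for attempt in range(max_attempts):
        resp = send_request()
        if resp["status"] != 429:
            return resp
        # Fall back to exponential backoff if no Retry-After header is present.
        delay = float(resp.get("headers", {}).get("Retry-After", 2 ** attempt))
        time.sleep(delay)
    return resp
```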
PII Scanning
Regex-based detection for API keys, credit card numbers, SSNs, and other sensitive data. Deny or redact before it hits the provider.
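A stripped-down version of this kind of regex scanning and redaction (illustrative patterns only; production coverage needs many more rules):

```python
import re

# Illustrative patterns; a production scanner would use far more.
PII_PATTERNS = {
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{20,}\b"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def scan_pii(text):
    """Return the names of all PII patterns found in the text."""
    return [name for name, pattern in PII_PATTERNS.items() if pattern.search(text)]

def redact(text):
    """Replace each match with a [REDACTED:<name>] placeholder."""
    for name, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{name}]", text)
    return text
```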
Model Allowlists
Control which models each organization can access. Wildcard support. Prevent accidental usage of expensive models.
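Shell-style wildcards map directly onto Python's `fnmatch`; a minimal sketch of the matching logic:

```python
from fnmatch import fnmatch

def is_model_allowed(model, allowlist):
    """True if the model name matches any shell-style wildcard pattern."""
    return any(fnmatch(model, pattern) for pattern in allowlist)
```

An allowlist like `["gpt-4o*", "claude-*-haiku*"]` would admit `gpt-4o-mini` but reject `o1`.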
HITL Approvals
High-cost or sensitive requests are held for human review. Approve or deny from the dashboard with full request context.
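The hold-and-resolve flow can be sketched as a small pending queue (hypothetical structure; a real gateway would persist held requests durably):

```python
import uuid

PENDING = {}  # approval queue; in production this would be a durable store

def hold_for_approval(request, reason):
    """Park a request for human review and return its approval id."""
    approval_id = str(uuid.uuid4())
    PENDING[approval_id] = {"request": request, "reason": reason, "status": "pending"}
    return approval_id

def resolve(approval_id, approved):
    """A reviewer approves or denies the held request from the dashboard."""
    PENDING[approval_id]["status"] = "approved" if approved else "denied"
    return PENDING[approval_id]["status"]
```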
Full Observability
Every request logged with latency, token count, cost, and governance decisions. Time-travel debugging for any call.
Govern every LLM call.
Start in five minutes.
Swap one base URL. Get cost control, PII scanning, rate limiting, and human-in-the-loop approvals — instantly.