```yaml
virtual_keys:
  - id: vk_acme_prod
    customer_id: acme_corp
    budget:
      max_per_month_usd: 12000
      reset_duration: monthly
    rate_limit:
      requests_per_minute: 600
    allowed_providers:
      - openai
      - anthropic
      - bedrock
    fallbacks:
      - provider: openai
        model: gpt-4o
      - provider: anthropic
        model: claude-sonnet-4-6
      - provider: bedrock
        model: anthropic.claude-sonnet-4-6
```

- Per-customer spend caps that don't require a deploy to update.
- Provider failover that survives Anthropic going down for 23 minutes (it did, last March).
- Cost data we don't have to reconstruct from CloudWatch logs.

- 11,247 LOC in gateway_middleware/
- p95 added latency from middleware: 47ms
- Mean time to add a new model: 2 days (testing, rollout, monitoring)

- 4,108 LOC remaining (mostly business logic we still need)
- p95 added latency from Bifrost in front: 8ms
- Mean time to add a new model: under an hour

- Bifrost virtual keys docs
- Budget management hierarchy
- Bifrost GitHub repo
- LiteLLM proxy docs (worth comparing)
- Drop-in replacement notes
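
To make the `fallbacks` list in the virtual key config concrete, here is a minimal Python sketch of what walking such a chain could look like, with a budget check up front. This is not Bifrost's implementation: `Target`, `ProviderError`, `call_with_fallbacks`, and the `send` callback are hypothetical names made up for illustration, and the only values taken from the config are the provider/model pairs and the 12000 USD monthly cap.

```python
from dataclasses import dataclass
from typing import Callable, Sequence, Tuple


@dataclass(frozen=True)
class Target:
    """One entry of the `fallbacks` list (hypothetical type)."""
    provider: str
    model: str


class ProviderError(Exception):
    """Raised by the hypothetical `send` callback when a provider call fails."""


# Mirrors the `fallbacks` list from the config, in priority order.
FALLBACKS = [
    Target("openai", "gpt-4o"),
    Target("anthropic", "claude-sonnet-4-6"),
    Target("bedrock", "anthropic.claude-sonnet-4-6"),
]


def call_with_fallbacks(
    prompt: str,
    targets: Sequence[Target],
    send: Callable[[Target, str], str],
    spent_usd: float,
    max_per_month_usd: float = 12000.0,
) -> Tuple[Target, str]:
    """Try each target in order and return the first successful response."""
    # Refuse the request outright once the monthly cap is reached.
    if spent_usd >= max_per_month_usd:
        raise RuntimeError("monthly budget exhausted")

    errors = []
    for target in targets:
        try:
            return target, send(target, prompt)
        except ProviderError as exc:
            # Record the failure and move on to the next provider.
            errors.append(f"{target.provider}: {exc}")

    raise RuntimeError("all providers failed: " + "; ".join(errors))
```

In use, `send` would wrap whatever client actually talks to each provider. A gateway does this (plus retries, rate limiting, and cost accounting) on the server side, which is the part we no longer maintain ourselves.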