Secure, zero-retention API proxy for Gemini with enterprise-grade protection against abuse and key compromise.
[Architecture diagram. Labels shown: Mailroom Agent Desktop; Validates, limits, forwards; AI Model API; Keys, limits, usage; Key management.]
Every API request goes through multiple validation layers before reaching Gemini.
Request arrives with API key in header. Key is hashed and looked up in D1. Invalid keys are rejected immediately.
Check the requests-per-minute limit for this key. If it is exceeded, return 429 with a Retry-After header. This protects against runaway scripts.
Verify monthly token/request quota hasn't been exceeded. If over limit, return 402 Payment Required.
Request is forwarded to Gemini using our master API key. Response is streamed back to customer. No content is logged.
After response completes, usage counters are incremented asynchronously (non-blocking). Only metadata stored: key_id, timestamp, token_count.
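The validation layers above can be sketched as a single function. This is a minimal illustration, not the actual proxy code: the in-memory `keys` dict stands in for D1, the `forward` callable stands in for the Gemini call, SHA-256 is an assumed hash, 401 is an assumed status for invalid keys, and usage is recorded synchronously here for simplicity, whereas the real counters are incremented asynchronously.

```python
import hashlib
import time
from dataclasses import dataclass, field


@dataclass
class KeyRecord:
    key_id: str
    rpm_limit: int = 60
    monthly_token_quota: int = 10_000_000
    tokens_used: int = 0
    # Timestamps of requests accepted within the last minute.
    recent_requests: list = field(default_factory=list)


def hash_key(raw_key: str) -> str:
    # Assumed hash function; the source only says the key is hashed.
    return hashlib.sha256(raw_key.encode()).hexdigest()


def handle_request(raw_key, prompt, keys, forward, usage_log, now=None):
    """Return (status, headers, body) for one proxied request."""
    now = time.time() if now is None else now

    # 1. Key validation: look up the hashed key, reject unknown keys.
    record = keys.get(hash_key(raw_key))
    if record is None:
        return 401, {}, None  # assumed status code

    # 2. Rate limit: sliding one-minute window of accepted requests.
    record.recent_requests = [t for t in record.recent_requests if now - t < 60]
    if len(record.recent_requests) >= record.rpm_limit:
        retry_after = int(60 - (now - record.recent_requests[0])) + 1
        return 429, {"Retry-After": str(retry_after)}, None

    # 3. Monthly quota check.
    if record.tokens_used >= record.monthly_token_quota:
        return 402, {}, None

    # 4. Forward upstream; the prompt and response are never logged.
    record.recent_requests.append(now)
    body, token_count = forward(prompt)

    # 5. Record metadata only.
    record.tokens_used += token_count
    usage_log.append(
        {"key_id": record.key_id, "timestamp": now, "token_count": token_count}
    )
    return 200, {}, body
```

The order matters: cheap rejections (unknown key, rate limit, quota) happen before any upstream call is made.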
Multiple safeguards ensure a compromised key can't cause significant damage.
Each key has a maximum requests-per-minute. Prevents scripts from hammering the API. Automatically throttles without blocking legitimate use.
Hard limit on tokens/requests per billing period. Once reached, key stops working until reset. Customer sees clear usage in dashboard.
Sudden usage spikes trigger alerts. If a key goes from 10 req/day to 1000 req/hour, we're notified and can investigate or pause.
Any key can be revoked instantly from the admin dashboard. Revocation takes effect immediately, with no propagation delay, so a compromised key is dead within seconds.
Generate a new key while the old key remains valid for a grace period. This allows seamless rotation without downtime. The old key expires automatically when the grace period ends.
If total spend across all keys exceeds daily threshold, all non-essential keys pause. Protects us from coordinated attacks or billing surprises.
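Of the safeguards above, rotation with a grace period and instant revocation are the easiest to misread, so here is a minimal sketch of how the two could interact. The `ApiKey` type, its field names, and the one-hour grace period in the usage note are hypothetical; the source does not specify how keys are represented or how long the grace period lasts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApiKey:
    key_id: str
    revoked: bool = False
    expires_at: Optional[float] = None  # None means no scheduled expiry


def is_valid(key: ApiKey, now: float) -> bool:
    # Revocation always wins, regardless of any grace period.
    if key.revoked:
        return False
    return key.expires_at is None or now < key.expires_at


def rotate(old: ApiKey, new_key_id: str, now: float, grace_seconds: float) -> ApiKey:
    # The old key keeps working until the grace period ends, then auto-expires.
    old.expires_at = now + grace_seconds
    return ApiKey(key_id=new_key_id)


def revoke(key: ApiKey) -> None:
    # Takes effect on the very next validity check.
    key.revoked = True
```

For example, `rotate(old, "k-new", now=1000.0, grace_seconds=3600.0)` leaves both keys valid for the next hour, after which only the new key passes `is_valid`.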
Instead of hard blocking, we progressively slow down requests as usage increases. This provides a better user experience while still protecting resources.
| Usage Level | Request Delay | Effect |
|---|---|---|
| 0-50% of limit | 0 ms (instant) | Normal operation, no throttling |
| 50-75% of limit | 500 ms | Gentle slowdown, still usable |
| 75-90% of limit | 2 s | Noticeable slowdown, discourages heavy use |
| 90-100% of limit | 5 s | Heavy throttling, warning territory |
| 100%+ of limit | Blocked (429) | Hard stop until reset |
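The table maps directly onto a small lookup function. The sketch below is one way to express it; the function name is hypothetical, and because the table's ranges share their boundary values, it assumes a boundary (exactly 50%, 75%, or 90%) falls into the higher tier.

```python
from typing import Optional


def throttle_delay_seconds(used: int, limit: int) -> Optional[float]:
    """Return the delay to apply before serving a request.

    Returns None when the request should be blocked with a 429.
    Integer arithmetic avoids floating-point surprises at tier boundaries.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    if used >= limit:
        return None  # 100%+ of limit: hard stop until reset
    if used * 100 >= limit * 90:
        return 5.0   # 90-100%: heavy throttling
    if used * 100 >= limit * 75:
        return 2.0   # 75-90%: noticeable slowdown
    if used * 100 >= limit * 50:
        return 0.5   # 50-75%: gentle slowdown
    return 0.0       # 0-50%: normal operation
```

With a 60 requests-per-minute limit, `throttle_delay_seconds(30, 60)` returns `0.5` and `throttle_delay_seconds(60, 60)` returns `None`.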
Blocking disrupts legitimate workflows. Throttling lets work continue while naturally limiting abuse. A script running on a stolen key will be painfully slow.
A script abusing a stolen key faces delays of up to 5 seconds per request as it nears the limit, followed by a hard block. Work that would otherwise take seconds can take hours, which makes the abuse not worth the effort.
These limits apply per API key. Throttling kicks in before the hard limits are reached. Limits can be customized per subscription tier.
| Limit Type | Default Value | Purpose |
|---|---|---|
| Requests per minute | 60 RPM | Throttling starts at 30 RPM, blocked at 60 |
| Requests per day | 5,000 RPD | Throttling starts at 2,500 RPD |
| Tokens per month | 10M tokens | Monthly quota tied to subscription tier |
| Max tokens per request | 32,000 | Prevents single massive requests |
| Concurrent requests | 10 | Queue additional requests with delay |
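The defaults in the table could be held in a small immutable config object, with per-tier overrides derived from it. This is only a sketch: the class and field names are hypothetical, and the `EXAMPLE_TIER` values are invented purely to show the override mechanism, not taken from any real subscription tier.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class KeyLimits:
    requests_per_minute: int = 60
    requests_per_day: int = 5_000
    tokens_per_month: int = 10_000_000
    max_tokens_per_request: int = 32_000
    concurrent_requests: int = 10

    @property
    def rpm_throttle_start(self) -> int:
        # Throttling starts at half the hard limit (30 of 60 RPM).
        return self.requests_per_minute // 2

    @property
    def rpd_throttle_start(self) -> int:
        # Throttling starts at half the hard limit (2,500 of 5,000 RPD).
        return self.requests_per_day // 2


DEFAULT_LIMITS = KeyLimits()

# Hypothetical tier: overrides two fields and inherits the rest.
EXAMPLE_TIER = replace(
    DEFAULT_LIMITS, requests_per_minute=120, tokens_per_month=50_000_000
)
```

Deriving tiers with `replace` keeps every tier complete: any limit not explicitly overridden falls back to the default.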
Customers use their Mailroom Agent API key to access Gemini through our proxy.
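A client call might look roughly like the sketch below. Everything specific here is an assumption made for illustration: the URL, the `Authorization: Bearer` header, and the JSON body shape are hypothetical, since the source says only that the API key travels in a header.

```python
import json
import urllib.request

# Hypothetical endpoint; substitute the real proxy URL.
PROXY_URL = "https://proxy.example.com/v1/generate"


def build_request(api_key: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the API key in a header."""
    body = json.dumps({"prompt": prompt}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        PROXY_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed header name
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it. A client should also be prepared for the 429 and 402 responses described earlier.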
We never store request or response content. Only metadata required for billing and security.
Never stored: prompt content, document text, AI responses, file contents, and user data in requests.
Stored: key ID, timestamp, token count, HTTP status, and response time (for billing and debugging).
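One way to make zero retention hard to violate by accident is to give the usage record a fixed shape with no field that could hold content. The sketch below assumes hypothetical names (`UsageRecord`, `record_usage`) and a response time expressed in milliseconds.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class UsageRecord:
    # Only the metadata fields listed above; no prompt or response fields exist.
    key_id: str
    timestamp: float
    token_count: int
    http_status: int
    response_time_ms: float


def record_usage(
    key_id: str,
    started_at: float,
    finished_at: float,
    token_count: int,
    http_status: int,
) -> dict:
    """Build the metadata row for one request. Content is never passed in."""
    record = UsageRecord(
        key_id=key_id,
        timestamp=started_at,
        token_count=token_count,
        http_status=http_status,
        response_time_ms=(finished_at - started_at) * 1000.0,
    )
    return asdict(record)
```

Because `record_usage` never receives the prompt or the response, there is nothing for a logging mistake to leak.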
Even in the worst case, damage is limited by design.
| Scenario | Maximum Damage | Mitigation |
|---|---|---|
| Key stolen, used immediately | 60 requests per minute (throttled from 30) | Per-minute rate limit applies within the first minute |
| Key stolen, used over time | 5,000 requests/day max | Daily cap prevents extended abuse |
| Key stolen, billing impact | Monthly quota only | Can't exceed subscription tier limits |
| Key stolen, we're notified | Revoked in <1 minute | Anomaly alerts + instant revocation |
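The anomaly alerts in the last row depend on some rule for what counts as a spike. A minimal sketch of such a rule follows; the function name, the 10x factor, and the 100 requests-per-hour floor are assumptions for illustration, not the thresholds actually in use.

```python
def is_usage_spike(
    baseline_per_hour: float,
    current_per_hour: float,
    factor: float = 10.0,       # assumed multiplier over baseline
    min_requests: float = 100.0,  # assumed floor to ignore tiny keys
) -> bool:
    """Flag a key whose current hourly rate far exceeds its usual rate."""
    # Ignore low absolute volumes so that going from 1 to 20 requests
    # per hour does not page anyone.
    if current_per_hour < min_requests:
        return False
    return current_per_hour > baseline_per_hour * factor
```

Using the example from earlier, a key that normally makes 10 requests per day (about 0.42 per hour) and jumps to 1,000 per hour is flagged: `is_usage_spike(10 / 24, 1000)` returns `True`.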