Quota Enforcement

STOA enforces per-consumer API quotas at the gateway level: rate limits (per second and per minute), daily caps, and monthly limits. Quotas are defined in subscription plans and enforced in real time.

Enforcement Pipeline

Every API request passes through the quota check after authentication.

Quota Types

| Type | Window | Example | Use Case |
|------|--------|---------|----------|
| Per-second | Rolling 1s | 100 req/s | Burst protection |
| Per-minute | Rolling 1min | 1,000 req/min | Steady-state rate limiting |
| Daily | Midnight UTC reset | 100,000 req/day | Daily usage caps |
| Monthly | 1st of month reset | 1,000,000 req/month | Billing-period limits |
| Burst | Instant | 50 concurrent | Concurrent request cap |
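
To make the "rolling" windows concrete, here is a minimal Python sketch of a sliding-window counter. It is an illustration under assumptions, not the gateway's actual implementation; the class name `SlidingWindowLimiter` and its `allow` method are hypothetical.

```python
from collections import deque


class SlidingWindowLimiter:
    """Illustrative rolling-window limiter (hypothetical, not STOA's code)."""

    def __init__(self, limit: int, window_secs: float) -> None:
        self.limit = limit
        self.window_secs = window_secs
        self._timestamps = deque()  # arrival times of accepted requests

    def allow(self, now: float) -> bool:
        """Record the request and return True if it fits in the window."""
        # Drop accepted requests that have aged out of the rolling window.
        while self._timestamps and now - self._timestamps[0] >= self.window_secs:
            self._timestamps.popleft()
        if len(self._timestamps) >= self.limit:
            return False  # a real gateway would answer 429 here
        self._timestamps.append(now)
        return True


# Three requests per rolling second: the fourth is rejected, and a later
# request is accepted again once the oldest one has left the window.
limiter = SlidingWindowLimiter(limit=3, window_secs=1.0)
results = [limiter.allow(t) for t in (0.0, 0.1, 0.2, 0.3, 1.05)]
print(results)  # [True, True, True, False, True]
```

Unlike the daily and monthly quotas, which reset at a fixed boundary, a rolling window frees capacity gradually as old requests age out.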

Plan Configuration

Quotas are defined per plan in the Control Plane:

Configure your environment

The examples below use environment variables. Set them for your STOA instance:

export STOA_API_URL="https://api.gostoa.dev"      # Replace with your domain
export STOA_AUTH_URL="https://auth.gostoa.dev"    # Keycloak OIDC provider
export STOA_GATEWAY_URL="https://mcp.gostoa.dev"  # MCP Gateway endpoint

Self-hosted? Replace gostoa.dev with your domain.

curl -X POST "${STOA_API_URL}/v1/plans" \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "slug": "gold",
    "name": "Gold Plan",
    "rate_limit_per_second": 100,
    "rate_limit_per_minute": 5000,
    "daily_request_limit": 500000,
    "monthly_request_limit": 10000000,
    "burst_limit": 50
  }'
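
For scripted setups, the same request can be sketched in Python with only the standard library. The endpoint path and field names are taken from the curl example above; the helper name `build_plan_request` and the fallback values for the environment variables are assumptions for illustration. The sketch builds the request without sending it.

```python
import json
import os
import urllib.request


def build_plan_request(api_url: str, token: str, plan: dict) -> urllib.request.Request:
    """Build (but do not send) the POST /v1/plans request from the curl example."""
    return urllib.request.Request(
        url=f"{api_url.rstrip('/')}/v1/plans",
        data=json.dumps(plan).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


gold_plan = {
    "slug": "gold",
    "name": "Gold Plan",
    "rate_limit_per_second": 100,
    "rate_limit_per_minute": 5000,
    "daily_request_limit": 500000,
    "monthly_request_limit": 10000000,
    "burst_limit": 50,
}

request = build_plan_request(
    os.environ.get("STOA_API_URL", "https://api.gostoa.dev"),
    os.environ.get("TOKEN", "example-token"),  # placeholder, not a real token
    gold_plan,
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```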

Example Plans

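| Plan | Rate/sec | Rate/min | Daily | Monthly |
|------|----------|----------|-------|---------|
| Community | 5 | 60 | 10,000 | 100,000 |
| Silver | 20 | 300 | 50,000 | 1,000,000 |
| Gold | 100 | 5,000 | 500,000 | 10,000,000 |
| Enterprise | Custom | Custom | Custom | Custom |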

Response Headers

Every successful response includes rate limit headers:

X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4523
X-RateLimit-Reset: 1708000123

| Header | Description |
|--------|-------------|
| X-RateLimit-Limit | Maximum requests allowed in the current window |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp when the window resets |
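
A client can use these headers to pace itself before it hits a limit. The following Python sketch parses the three headers and derives the time left in the window; the names `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical helpers, not part of any STOA SDK.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateLimitStatus:
    limit: int
    remaining: int
    reset_epoch: int  # Unix timestamp, as sent in X-RateLimit-Reset

    def seconds_until_reset(self, now_epoch: float) -> float:
        """Seconds left in the current window (never negative)."""
        return max(0.0, self.reset_epoch - now_epoch)

    @property
    def used_ratio(self) -> float:
        """Fraction of the window's allowance already consumed."""
        return (self.limit - self.remaining) / self.limit if self.limit else 0.0


def parse_rate_limit_headers(headers: dict) -> RateLimitStatus:
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_epoch=int(lowered["x-ratelimit-reset"]),
    )


status = parse_rate_limit_headers({
    "X-RateLimit-Limit": "5000",
    "X-RateLimit-Remaining": "4523",
    "X-RateLimit-Reset": "1708000123",
})
print(status.remaining, status.seconds_until_reset(now_epoch=1708000100))
# 4523 23.0
```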

Error Response (429)

When a quota is exceeded, the gateway returns 429 Too Many Requests:

Rate limit exceeded:

{
  "error": "quota_exceeded",
  "message": "Rate limit exceeded: per_minute limit of 100 requests reached",
  "retry_after_secs": 45
}

Daily quota exceeded:

{
  "error": "quota_exceeded",
  "message": "Daily quota exceeded: 10000/10000 requests used. Resets at midnight UTC.",
  "retry_after_secs": null
}

The Retry-After header is also set:

HTTP/1.1 429 Too Many Requests
Retry-After: 45
Content-Type: application/json
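
On the client side, a 429 can be turned into a wait time by reading the Retry-After header first and falling back to `retry_after_secs` in the body. The sketch below is one way to do that; the function name `retry_delay_from_429` is hypothetical, and treating a `null` value as "no short retry window, wait for the quota reset" is an assumption based on the daily-quota example above.

```python
import json
from typing import Optional


def retry_delay_from_429(headers: dict, body: str) -> Optional[float]:
    """Return seconds to wait before retrying, or None if no hint is given."""
    lowered = {name.lower(): value for name, value in headers.items()}

    # Prefer the Retry-After header when it carries a number of seconds.
    header_value = lowered.get("retry-after")
    if header_value is not None and header_value.strip().isdigit():
        return float(header_value)

    # Otherwise fall back to the JSON body's retry_after_secs field.
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return None
    secs = payload.get("retry_after_secs") if isinstance(payload, dict) else None
    return float(secs) if isinstance(secs, (int, float)) else None


rate_limited = retry_delay_from_429(
    {"Retry-After": "45"},
    '{"error": "quota_exceeded", "retry_after_secs": 45}',
)
daily_exceeded = retry_delay_from_429(
    {},
    '{"error": "quota_exceeded", "retry_after_secs": null}',
)
print(rate_limited, daily_exceeded)  # 45.0 None
```

A caller would typically sleep for the returned delay and retry, and stop retrying (or surface the error) when the result is `None`.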

Reset Behavior

| Quota Type | Resets When |
|------------|-------------|
| Per-second | After 1 second |
| Per-minute | After 60 seconds |
| Daily | Midnight UTC (00:00 UTC) |
| Monthly | 1st of the month (00:00 UTC) |

Resets are automatic and need no manual intervention. The gateway checks the current date/time on every request and resets counters when the window changes.
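
Because the daily and monthly boundaries are defined in UTC, a client in another time zone may see resets at an unexpected local hour. This Python sketch computes the next boundaries for a given instant; the function names are hypothetical, and `now` is assumed to be a timezone-aware `datetime`.

```python
from datetime import datetime, timedelta, timezone


def next_daily_reset(now: datetime) -> datetime:
    """Next midnight UTC strictly after `now` (which must be timezone-aware)."""
    now_utc = now.astimezone(timezone.utc)
    midnight = now_utc.replace(hour=0, minute=0, second=0, microsecond=0)
    return midnight + timedelta(days=1)


def next_monthly_reset(now: datetime) -> datetime:
    """00:00 UTC on the 1st of the month following `now`."""
    now_utc = now.astimezone(timezone.utc)
    if now_utc.month == 12:
        return datetime(now_utc.year + 1, 1, 1, tzinfo=timezone.utc)
    return datetime(now_utc.year, now_utc.month + 1, 1, tzinfo=timezone.utc)


# 23:30 on 31 December at UTC-5 is already 04:30 on 1 January in UTC,
# so the daily and monthly counters have both rolled over.
now = datetime(2024, 12, 31, 23, 30, tzinfo=timezone(timedelta(hours=-5)))
print(next_daily_reset(now).isoformat())    # 2025-01-02T00:00:00+00:00
print(next_monthly_reset(now).isoformat())  # 2025-02-01T00:00:00+00:00
```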

Monitoring Quotas

Via Admin API

# List all consumer quota statistics
curl "${STOA_GATEWAY_URL}/admin/quotas" \
  -H "Authorization: Bearer ${ADMIN_TOKEN}"

Example response:

[
  {
    "consumer_id": "user-123",
    "daily_count": 450,
    "daily_limit": 1000,
    "monthly_count": 12500,
    "monthly_limit": 50000,
    "daily_remaining": 550,
    "monthly_remaining": 37500
  }
]
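
The statistics returned by this endpoint are enough to flag consumers that are close to a cap. Here is a small Python sketch that does so, using the field names from the response above; the function name `consumers_near_limit` and the threshold parameter are illustrative assumptions.

```python
import json

# Sample payload copied from the admin API response shown above.
SAMPLE_RESPONSE = """
[
  {
    "consumer_id": "user-123",
    "daily_count": 450,
    "daily_limit": 1000,
    "monthly_count": 12500,
    "monthly_limit": 50000,
    "daily_remaining": 550,
    "monthly_remaining": 37500
  }
]
"""


def consumers_near_limit(stats: list, threshold: float = 0.8) -> list:
    """Return (consumer_id, window, ratio) for every quota above `threshold`."""
    flagged = []
    for entry in stats:
        for window in ("daily", "monthly"):
            limit = entry[f"{window}_limit"]
            if limit <= 0:
                continue  # skip missing or unlimited quotas
            ratio = entry[f"{window}_count"] / limit
            if ratio > threshold:
                flagged.append((entry["consumer_id"], window, round(ratio, 3)))
    return flagged


stats = json.loads(SAMPLE_RESPONSE)
print(consumers_near_limit(stats))                 # []
print(consumers_near_limit(stats, threshold=0.4))  # [('user-123', 'daily', 0.45)]
```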

Per-Consumer Stats

curl "${STOA_GATEWAY_URL}/admin/quotas/${CONSUMER_ID}" \
  -H "Authorization: Bearer ${ADMIN_TOKEN}"

Reset Quotas (Admin)

For troubleshooting or customer support, admins can reset a consumer's quota counters:

curl -X POST "${STOA_GATEWAY_URL}/admin/quotas/${CONSUMER_ID}/reset" \
  -H "Authorization: Bearer ${ADMIN_TOKEN}"

This resets both daily and monthly counters to zero immediately.

Prometheus Metrics

The gateway exposes quota-related Prometheus metrics:

| Metric | Type | Description |
|--------|------|-------------|
| stoa_requests_total | Counter | Total requests (by consumer, status) |
| stoa_rate_limited_total | Counter | Requests rejected by rate limiter |
| stoa_quota_exceeded_total | Counter | Requests rejected by daily/monthly quota |
| stoa_quota_usage_ratio | Gauge | Current usage as ratio (0.0-1.0) |

Grafana Alert Example

# Alert when a consumer stays above 80% of its daily quota for 5 minutes
- alert: QuotaNearLimit
  expr: stoa_quota_usage_ratio{type="daily"} > 0.8
  for: 5m
  labels:
    severity: warning
  annotations:
    summary: "Consumer {{ $labels.consumer_id }} at {{ $value | humanizePercentage }} of daily quota"

Default Quotas

When no plan is specified, the gateway applies default limits:

| Setting | Default Value |
|---------|---------------|
| Rate per minute | 5 requests |
| Daily limit | 10,000 requests |

These defaults protect against abuse for consumers without an explicit plan. Configure higher limits by assigning a plan to the subscription.
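
The fallback can be pictured with a short Python sketch. It is only an illustration of the rule stated here, not gateway code: the function name `effective_limits` is hypothetical, and the key names reuse the plan fields from the plan-creation example as an assumption.

```python
from typing import Optional

# Documented defaults; key names borrowed from the plan fields (assumption).
DEFAULT_LIMITS = {"rate_limit_per_minute": 5, "daily_request_limit": 10_000}


def effective_limits(plan: Optional[dict]) -> dict:
    """Return the plan's limits, or the defaults when no plan is assigned."""
    if plan is None:
        return dict(DEFAULT_LIMITS)
    return dict(plan)


no_plan = effective_limits(None)
with_plan = effective_limits(
    {"rate_limit_per_minute": 5000, "daily_request_limit": 500000}
)
print(no_plan["rate_limit_per_minute"], with_plan["rate_limit_per_minute"])
# 5 5000
```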

Troubleshooting

| Problem | Cause | Fix |
|---------|-------|-----|
| 429 immediately on first request | Default quota too low (5/min) | Assign a plan to the subscription |
| Quota not resetting at midnight | Time zone mismatch | Quotas reset at midnight UTC |
| Consumer shows 0 remaining but requests work | Cache delay | Wait up to 5 minutes or clear cache |
| Admin quota reset doesn't take effect | Gateway cache | Clear cache after reset: POST /admin/cache/clear |