Rate limits
The endpoints that carry a rate limit, their exact per-IP limits, and what a 429 looks like.
Rate limiting is applied per client IP on a small set of sensitive endpoints: the auth
flows and the workflow start endpoint, which incurs LLM cost. Every other endpoint is
currently unlimited (there is no global default limit). When a limit is exceeded, the API
returns 429 Too Many Requests.
The limits
| Endpoint | Method | Limit (per IP) | Why |
|---|---|---|---|
| /auth/register | POST | 60 / minute | Sign-up abuse protection. |
| /auth/login | POST | 5 / minute | Credential-stuffing protection. |
| /auth/otp/request | POST | 5 / minute | Throttle one-time-code emails. |
| /auth/otp/verify | POST | 10 / minute | Limit code-guessing attempts. |
| /workflows/start | POST | 10 / minute | Cap LLM cost: each workflow ≈ $0.07+. |
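A client that wants to stay under these limits can throttle itself before sending. The sketch below is one way to do that, not part of the API: `LIMITS_PER_MINUTE` is copied from the table, while `SlidingWindowThrottle` and its parameters are hypothetical names. The server's windowing algorithm is not documented here, so the sketch assumes the conservative case and never allows more than the limit in any 60-second span.

```python
import time
from collections import deque

# Per-IP limits copied from the table above (requests per minute).
LIMITS_PER_MINUTE = {
    "/auth/register": 60,
    "/auth/login": 5,
    "/auth/otp/request": 5,
    "/auth/otp/verify": 10,
    "/workflows/start": 10,
}


class SlidingWindowThrottle:
    """Client-side guard: at most `max_calls` sends in any `window_seconds` span."""

    def __init__(self, max_calls, window_seconds=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock          # injectable so the throttle can be tested
        self.sent_at = deque()      # timestamps of sends still inside the window

    def try_acquire(self):
        """Return True and record the send if a request may go out now."""
        now = self.clock()
        # Forget sends that have aged out of the window.
        while self.sent_at and now - self.sent_at[0] >= self.window_seconds:
            self.sent_at.popleft()
        if len(self.sent_at) < self.max_calls:
            self.sent_at.append(now)
            return True
        return False
```

Create one throttle per endpoint, for example `SlidingWindowThrottle(LIMITS_PER_MINUTE["/auth/login"])`, and skip or delay the request whenever `try_acquire()` returns `False`. Clients that share an IP address also share the server-side budget, so a per-process throttle like this is only an upper bound.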
/auth/otp/request additionally enforces a per-email throttle: it won't issue a second
code while one issued in the last 60 seconds is still pending. /auth/otp/verify also caps
attempts per code (5) independent of the IP limit.
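The two OTP rules above can be illustrated with a small in-memory model. This is a sketch of the described behaviour, not the service's actual implementation: `OtpStore`, `PendingCode`, and both constant names are hypothetical. It also assumes that a code which has used up its attempts still counts as pending until a new one is issued.

```python
import hmac
import time
from dataclasses import dataclass

REISSUE_WINDOW_SECONDS = 60   # no second code while a recent one is pending
MAX_ATTEMPTS_PER_CODE = 5     # per-code cap, independent of the per-IP limit


@dataclass
class PendingCode:
    code: str
    issued_at: float
    attempts: int = 0


class OtpStore:
    """Hypothetical in-memory model of the per-email and per-code rules."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self.pending = {}  # email -> PendingCode

    def request(self, email, new_code):
        """Issue `new_code` unless a code from the last 60 seconds is pending."""
        current = self.pending.get(email)
        if current is not None:
            if self.clock() - current.issued_at < REISSUE_WINDOW_SECONDS:
                return False
        self.pending[email] = PendingCode(new_code, self.clock())
        return True

    def verify(self, email, code):
        """Check a code, counting every attempt against the per-code cap."""
        current = self.pending.get(email)
        if current is None or current.attempts >= MAX_ATTEMPTS_PER_CODE:
            return False
        current.attempts += 1
        # Constant-time comparison to avoid leaking how many digits matched.
        if hmac.compare_digest(code.encode(), current.code.encode()):
            del self.pending[email]  # a code is single-use
            return True
        return False
```

In this model a sixth guess against the same code fails even when it is correct, so the caller has to request a fresh code once the 60-second window has passed.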
The 429 response
A rate-limited request returns 429 with a detail describing the exceeded limit, for
example:
{ "detail": "Rate limit exceeded: 5 per 1 minute" }

Limits are keyed by remote IP address. Behind a proxy or load balancer, make sure the real client IP is forwarded so the limiter counts per caller rather than per gateway.
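How the real client IP is forwarded depends on the deployment, which this page does not specify. As an illustration only, the sketch below assumes the proxy sets the conventional `X-Forwarded-For` header and that the proxy address ranges are known. The function name `client_ip` and its parameters are hypothetical.

```python
import ipaddress


def client_ip(remote_addr, forwarded_for, trusted_proxies):
    """Pick the address to key rate limits on.

    remote_addr:     address of the direct TCP peer
    forwarded_for:   raw X-Forwarded-For header value, or None
    trusted_proxies: CIDR strings for proxies allowed to set the header
    """
    networks = [ipaddress.ip_network(cidr) for cidr in trusted_proxies]

    def parse(addr):
        try:
            return ipaddress.ip_address(addr.strip())
        except ValueError:
            return None

    def is_trusted(ip):
        return ip is not None and any(ip in net for net in networks)

    # Only honour the header when the direct peer is one of our own proxies;
    # otherwise any caller could spoof it to dodge the limiter.
    if not forwarded_for or not is_trusted(parse(remote_addr)):
        return remote_addr

    # Walk right to left: the first hop that is not a trusted proxy is the caller.
    for hop in reversed(forwarded_for.split(",")):
        ip = parse(hop)
        if ip is None:
            return remote_addr  # malformed header: fall back to the peer
        if not is_trusted(ip):
            return str(ip)
    return remote_addr
```

For example, with `trusted_proxies=["10.0.0.0/8"]`, a request arriving from `10.0.0.5` with the header `198.51.100.7, 10.0.0.9` is keyed on `198.51.100.7`, while the same header sent directly from an untrusted address is ignored.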
These are the only endpoints with an enforced limit today. Other endpoints are not throttled, but avoid sending them bursts of requests: the set above can change as more endpoints are hardened.
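On the client side, a 429 is usually handled by waiting and retrying. The sketch below is one possible approach under stated assumptions: `call_with_retry`, `rate_limit_detail`, and the `send` callable (which returns a `(status, body)` pair) are hypothetical, and because this page does not document a `Retry-After` header, the wait is a simple exponential backoff capped at the one-minute window the limits use.

```python
import json
import time


def rate_limit_detail(body):
    """Extract the `detail` message from a 429 body, or None if absent."""
    try:
        payload = json.loads(body)
    except ValueError:
        return None
    return payload.get("detail") if isinstance(payload, dict) else None


def call_with_retry(send, max_attempts=4, base_delay=1.0, max_delay=60.0,
                    sleep=time.sleep):
    """Call `send()` until it returns a non-429 status or attempts run out.

    `send` is any zero-argument callable returning (status_code, body_text).
    """
    status, body = send()
    for attempt in range(max_attempts - 1):
        if status != 429:
            break
        # Back off 1s, 2s, 4s, ... but never longer than the limit window.
        sleep(min(max_delay, base_delay * 2 ** attempt))
        status, body = send()
    return status, body
```

If the final attempt is still rate-limited, the caller receives the 429 and can surface `rate_limit_detail(body)` to the user instead of retrying indefinitely.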