Rate Limiting & Throttling
Overview
To ensure the stability and reliability of the Supervised AI platform, the Testing API implements rate limiting and throttling. These limits prevent resource exhaustion and ensure fair usage across all developers and automated testing suites.
When your requests exceed the allowed threshold, the API will return a 429 Too Many Requests status code.
Rate Limit Tiers
Limits are applied based on your account tier and the specific endpoint being called. The following table outlines the default limits for the Testing API:
| Tier | Requests Per Minute (RPM) | Concurrent Requests |
| :--- | :--- | :--- |
| Sandbox / Free | 20 | 2 |
| Developer | 100 | 5 |
| Enterprise | Custom | Custom |
> [!NOTE]
> Rates are calculated on a sliding window basis. If you require higher limits for large-scale load testing, please contact the Supervised AI platform team.
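To make the sliding-window behavior concrete, here is a minimal client-side sketch that tracks your own request timestamps. It is an illustration only: the server's exact algorithm is not documented here, the `SlidingWindowCounter` class is a hypothetical helper, and the 60-second window is an assumption derived from the per-minute limits in the table.

```javascript
// Hypothetical client-side tracker; not part of the Testing API.
class SlidingWindowCounter {
  constructor(limit, windowMs = 60_000) {
    this.limit = limit;       // e.g. 100 for the Developer tier
    this.windowMs = windowMs; // assumed window length (one minute)
    this.timestamps = [];     // send times in ms, oldest first
  }

  // Drop timestamps that have slid out of the window ending at nowMs.
  prune(nowMs) {
    while (this.timestamps.length > 0 && this.timestamps[0] <= nowMs - this.windowMs) {
      this.timestamps.shift();
    }
  }

  // Records a request and returns true if it fits in the window, else returns false.
  tryAcquire(nowMs = Date.now()) {
    this.prune(nowMs);
    if (this.timestamps.length >= this.limit) return false;
    this.timestamps.push(nowMs);
    return true;
  }
}
```

Calling `tryAcquire()` before each request gives a local estimate of whether the request is likely to be accepted. The response headers remain the authoritative source.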
Rate Limit Headers
Every response from the Testing API includes headers that allow you to track your current usage programmatically:
| Header | Description |
| :--- | :--- |
| X-RateLimit-Limit | The maximum number of requests allowed in the current window. |
| X-RateLimit-Remaining | The number of requests remaining in the current window. |
| X-RateLimit-Reset | The time (in UTC Epoch seconds) when the current rate limit window resets. |
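As a rough sketch of reading these headers, the function below converts them into numbers and computes the time left until the window resets. The `readRateLimit` name and the returned object shape are assumptions made for illustration. The `headers` argument can be any object with a `get(name)` method, such as the `headers` property of a `fetch` response.

```javascript
// Hypothetical helper: turns the three rate limit headers into numbers.
function readRateLimit(headers, nowMs = Date.now()) {
  // Returns null when a header is absent instead of silently coercing it to 0.
  const num = (name) => {
    const value = headers.get(name);
    return value == null ? null : Number(value);
  };

  const limit = num('X-RateLimit-Limit');
  const remaining = num('X-RateLimit-Remaining');
  const resetEpochSeconds = num('X-RateLimit-Reset');

  // X-RateLimit-Reset is in UTC epoch seconds; Date.now() is in milliseconds.
  const msUntilReset =
    resetEpochSeconds === null ? null : Math.max(0, resetEpochSeconds * 1000 - nowMs);

  return { limit, remaining, msUntilReset };
}
```

A client might call `readRateLimit(response.headers)` after each request and pause for `msUntilReset` milliseconds once `remaining` reaches zero.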
Handling 429 Responses
When you exceed your quota, the API rejects further requests until you are back under the limit and returns a JSON error response with the 429 status code.
Example Error Response
{
  "status": 429,
  "error": "Too Many Requests",
  "message": "Rate limit exceeded. Please try again in 15 seconds.",
  "retry_after": 15
}
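One way to turn this error body into a wait time is sketched below. The `retryDelayMs` helper and its 5-second fallback are assumptions, not part of the API. It prefers the `retry_after` field, falls back to the `X-RateLimit-Reset` header, and otherwise uses the fallback.

```javascript
// Hypothetical helper: how long to wait (in ms) after a 429 response.
function retryDelayMs(body, headers, nowMs = Date.now(), fallbackMs = 5000) {
  // 1. Prefer the retry_after field (seconds) from the JSON error body.
  const retryAfter = body?.retry_after;
  if (typeof retryAfter === 'number' && Number.isFinite(retryAfter) && retryAfter >= 0) {
    return retryAfter * 1000;
  }

  // 2. Otherwise wait until the X-RateLimit-Reset time (UTC epoch seconds).
  const resetRaw = headers?.get('X-RateLimit-Reset');
  const resetSeconds = resetRaw == null ? NaN : Number(resetRaw);
  if (Number.isFinite(resetSeconds)) {
    return Math.max(0, resetSeconds * 1000 - nowMs);
  }

  // 3. Neither hint is available: use the assumed default.
  return fallbackMs;
}
```

For the example response above, `retryDelayMs(body, response.headers)` yields 15000 milliseconds.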
Best Practices for Throttling
To handle rate limits effectively, we recommend implementing the following strategies in your testing client:
- Check Headers: Monitor the X-RateLimit-Remaining header to proactively slow down requests before hitting the limit.
- Exponential Backoff: If you receive a 429 error, wait for the duration specified in the retry_after field (or until the X-RateLimit-Reset time) before attempting the request again. If repeated attempts still fail, increase the wait exponentially between retries.
- Request Pooling: Use a queue system to manage high volumes of testing requests, ensuring the dispatch rate remains within your tier's limits.
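The request pooling strategy above could look roughly like the following sketch. The `RequestQueue` class is hypothetical. It assumes that spacing dispatches evenly (60 seconds divided by your RPM limit) and capping in-flight tasks at your tier's concurrency limit is enough to stay within quota.

```javascript
// Hypothetical queue: caps concurrency and spaces out request dispatch.
class RequestQueue {
  constructor({ maxConcurrent, requestsPerMinute }) {
    this.maxConcurrent = maxConcurrent;
    this.minIntervalMs = 60_000 / requestsPerMinute; // gap between dispatches
    this.pending = [];     // tasks waiting for a slot
    this.active = 0;       // tasks scheduled or currently running
    this.nextStartAt = 0;  // earliest time (ms) the next task may start
  }

  // task is a function returning a promise; resolves with the task's result.
  enqueue(task) {
    return new Promise((resolve, reject) => {
      this.pending.push({ task, resolve, reject });
      this.drain();
    });
  }

  // Starts queued tasks while a concurrency slot is free.
  drain() {
    while (this.active < this.maxConcurrent && this.pending.length > 0) {
      const { task, resolve, reject } = this.pending.shift();
      const now = Date.now();
      const startAt = Math.max(now, this.nextStartAt);
      this.nextStartAt = startAt + this.minIntervalMs;
      this.active += 1;
      setTimeout(() => {
        Promise.resolve()
          .then(task)
          .then(resolve, reject)
          .finally(() => {
            this.active -= 1;
            this.drain(); // a slot opened up, so start the next task
          });
      }, startAt - now);
    }
  }
}
```

On the Developer tier this might be used as `const queue = new RequestQueue({ maxConcurrent: 5, requestsPerMinute: 100 })`, with each call wrapped as `queue.enqueue(() => fetch(url, options))`.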
Implementation Example (Node.js)
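async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let attempt = 0; ; attempt++) {
    const response = await fetch(url, options);
    // Any response other than 429 is returned to the caller as parsed JSON.
    if (response.status !== 429) return response.json();
    // Give up after maxRetries so a persistent limit cannot cause an endless loop.
    if (attempt >= maxRetries) {
      throw new Error(`Rate limit still exceeded after ${maxRetries} retries`);
    }
    // Prefer retry_after from the error body, then a Retry-After header if present, then 5 seconds.
    const body = await response.json().catch(() => ({}));
    const retryAfter =
      Number(body?.retry_after) || Number(response.headers.get('Retry-After')) || 5;
    console.warn(`Rate limit hit. Retrying after ${retryAfter} seconds...`);
    await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
  }
}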
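The retry function above relies on the delay suggested by the server. If 429 responses persist, an exponentially growing delay with random jitter is a common complement. The sketch below shows one way to compute it. The `backoffDelayMs` name, the 1-second base, and the 60-second cap are illustrative assumptions, not values defined by the Testing API.

```javascript
// Hypothetical helper: "full jitter" exponential backoff.
// attempt is 0 for the first retry, 1 for the second, and so on.
function backoffDelayMs(attempt, { baseMs = 1000, maxMs = 60_000, random = Math.random } = {}) {
  // The ceiling doubles with every attempt until it reaches maxMs.
  const ceiling = Math.min(maxMs, baseMs * 2 ** attempt);
  // Pick a random delay below the ceiling so clients do not retry in lockstep.
  return Math.floor(random() * ceiling);
}
```

A client could wait for the larger of the server-provided delay and `backoffDelayMs(attempt)` before each retry.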