Testing Endpoints Catalog
Overview
The Testing API provides a standardized interface for executing evaluations, running regression tests, and benchmarking models within the Supervised AI platform. It allows users to programmatically trigger test suites and retrieve performance metrics.
Authentication
All requests to the Testing API must include a Bearer token in the Authorization header:
```
Authorization: Bearer <YOUR_ACCESS_TOKEN>
```
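In Python, the header can be built once and reused across requests. This is a minimal sketch; the SUPERVISED_API_TOKEN environment variable name is an illustrative choice, not one the platform mandates:

```python
import os

# Read the access token from the environment rather than hard-coding it.
# "SUPERVISED_API_TOKEN" is a hypothetical variable name for this example.
token = os.environ.get("SUPERVISED_API_TOKEN", "<YOUR_ACCESS_TOKEN>")

# Every Testing API call carries this header.
headers = {"Authorization": f"Bearer {token}"}
```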
Endpoints Catalog
1. Execute Test Suite
Triggers a predefined test suite against a specific model or deployment.
- URL: /v1/test/run
- Method: POST
- Content-Type: application/json
Request Body
| Field | Type | Required | Description |
| :--- | :--- | :--- | :--- |
| model_id | string | Yes | The unique identifier of the model to be tested. |
| test_suite_id | string | Yes | The ID of the test suite (e.g., "regression-v1", "latency-bench"). |
| environment | string | No | The target environment (staging, production). Defaults to staging. |
| parameters | object | No | Key-value pairs for dynamic test configuration. |
Example Usage
```bash
curl -X POST https://api.supervised.ai/v1/test/run \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "gpt-4-custom-01",
    "test_suite_id": "accuracy-validation-set",
    "parameters": {
      "threshold": 0.85,
      "sample_size": 100
    }
  }'
```
Response
- 202 Accepted
```json
{
  "job_id": "test_98765abc",
  "status": "queued",
  "estimated_completion": "2023-10-27T15:00:00Z"
}
```
2. Get Job Status
Returns the current status of a test execution job. Clients typically poll this endpoint until the job reaches a terminal state.
- URL: /v1/test/status/{job_id}
- Method: GET
Parameters
| Parameter | Type | Description |
| :--- | :--- | :--- |
| job_id | string | The unique ID returned by the /run endpoint. |
Response
- 200 OK
```json
{
  "job_id": "test_98765abc",
  "status": "processing",
  "progress": "45%",
  "started_at": "2023-10-27T14:55:00Z"
}
```
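A minimal polling loop might look like the following sketch. The fetch_status callable is a stand-in for an HTTP GET against /v1/test/status/{job_id} (injected so the example works without network access), and the retry interval and timeout are arbitrary choices:

```python
import time

def poll_job(job_id, fetch_status, interval_s=5.0, timeout_s=600.0):
    """Poll until the job leaves the queued/processing states.

    fetch_status is any callable returning the parsed JSON body of
    GET /v1/test/status/{job_id}.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        body = fetch_status(job_id)
        if body["status"] not in ("queued", "processing"):
            return body  # terminal state: completed, failed, or cancelled
        time.sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish within {timeout_s}s")
```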
3. Retrieve Test Results
Fetches the detailed metrics and logs once a test job has reached the completed state.
- URL: /v1/test/results/{job_id}
- Method: GET
Response
- 200 OK
```json
{
  "job_id": "test_98765abc",
  "summary": {
    "total_cases": 100,
    "passed": 92,
    "failed": 8
  },
  "metrics": {
    "accuracy": 0.92,
    "p99_latency_ms": 450,
    "f1_score": 0.89
  },
  "artifacts_url": "https://storage.supervised.ai/results/test_98765abc.json"
}
```
4. Evaluate Custom Input
Runs a single ad-hoc test case against a model endpoint for real-time validation.
- URL: /v1/eval/single
- Method: POST
Request Body
| Field | Type | Description |
| :--- | :--- | :--- |
| model_id | string | The model to query. |
| input_data | object | The payload to send to the model. |
| expected_output | object | (Optional) The ground truth for comparison. |
Example Usage
```python
import requests

payload = {
    "model_id": "sentiment-analyzer-v2",
    "input_data": {"text": "This platform is incredibly intuitive."},
    "expected_output": {"label": "positive"}
}

response = requests.post(
    "https://api.supervised.ai/v1/eval/single",
    json=payload,
    headers={"Authorization": "Bearer YOUR_TOKEN"}
)
print(response.json())
```
Data Objects
Test Status Types
The following strings represent the possible states of a testing job:
- queued: Job is waiting for an available runner.
- processing: Job is currently executing.
- completed: Job finished successfully.
- failed: Job encountered a system error (distinct from a "failed test case").
- cancelled: Job was manually terminated by a user.
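When polling, it is useful to distinguish in-flight from terminal states. The sets below simply mirror the list above; the helper name is an illustrative choice:

```python
# States in which the job may still change.
IN_FLIGHT = {"queued", "processing"}

# States from which the job will not transition further.
TERMINAL = {"completed", "failed", "cancelled"}

def is_terminal(status: str) -> bool:
    """True once a job can no longer change state."""
    return status in TERMINAL
```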
Metric Schema
Standard metrics returned in the results object:
- accuracy: (float) Ratio of correct predictions to total.
- latency_ms: (integer) Time taken for the model to respond, in milliseconds.
- cost_estimate: (float) Estimated API cost for the run.
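The standard metrics can be mapped onto a small dataclass for typed access. This is a sketch assuming the schema above; extra keys in the payload (such as suite-specific metrics) are simply ignored:

```python
from dataclasses import dataclass

@dataclass
class Metrics:
    accuracy: float        # ratio of correct predictions to total
    latency_ms: int        # model response time in milliseconds
    cost_estimate: float   # estimated API cost for the run

def parse_metrics(raw: dict) -> Metrics:
    """Build a Metrics record from a raw metrics object, ignoring extras."""
    return Metrics(
        accuracy=float(raw["accuracy"]),
        latency_ms=int(raw["latency_ms"]),
        cost_estimate=float(raw["cost_estimate"]),
    )
```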