REST Endpoints Inventory
The Supervised AI testing API provides a programmatic interface to manage datasets, initiate evaluation runs, and retrieve performance metrics for AI models. All requests must be sent over HTTPS and authenticated using your platform API key.
Base URL
All API requests are relative to the following base URL:
https://api.supervised.ai/v1/testing
Authentication
Requests require a Bearer Token in the Authorization header.
Authorization: Bearer <YOUR_API_KEY>
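As a small sketch, the header can be built once and reused across requests. The helper name is our own and the key value is a placeholder, not a real credential:

```python
def auth_headers(api_key):
    """Build the Authorization header required on every request."""
    return {"Authorization": f"Bearer {api_key}"}

headers = auth_headers("YOUR_API_KEY")  # placeholder key for illustration
```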
Evaluation Management
Endpoints used to trigger and monitor AI model evaluations against specified test suites.
Start Evaluation Run
Initiates a new testing session for a specific model against a dataset.
- Endpoint: POST /evaluations/run
- Request Body:

| Field | Type | Required | Description |
| :--- | :--- | :--- | :--- |
| `model_id` | string | Yes | The unique identifier of the AI model to test. |
| `dataset_id` | string | Yes | The ID of the test dataset to use. |
| `config` | object | No | Optional parameters like `temperature` or `max_tokens`. |

- Success Response (202 Accepted):

```json
{
  "run_id": "eval_88234-x9",
  "status": "queued",
  "estimated_completion": "2023-11-01T14:30:00Z"
}
```
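As an illustration, a client might assemble this request as follows. The helper name, the example IDs, and the plain-dict shape describing the request are assumptions of the sketch, not part of the platform API; no network call is made here:

```python
import json

BASE_URL = "https://api.supervised.ai/v1/testing"
API_KEY = "YOUR_API_KEY"  # placeholder; substitute your platform key

def build_start_run_request(model_id, dataset_id, config=None):
    """Assemble the pieces of a POST /evaluations/run call.

    `config` is omitted from the body when not supplied, matching the
    Required column in the table above.
    """
    body = {"model_id": model_id, "dataset_id": dataset_id}
    if config is not None:
        body["config"] = config
    return {
        "method": "POST",
        "url": f"{BASE_URL}/evaluations/run",
        "headers": {
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        "body": json.dumps(body),
    }

req = build_start_run_request("model_abc", "ds_qa_01",
                              config={"temperature": 0.2})
```

Any HTTP client can then send these pieces; on success, keep the returned `run_id` for the status and results endpoints below.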
Get Evaluation Status
Retrieves the real-time progress of a specific evaluation run.
- Endpoint: GET /evaluations/{run_id}/status
- Path Parameters:
  - `run_id` (string): The ID returned when the run was initiated.
- Success Response (200 OK):

```json
{
  "run_id": "eval_88234-x9",
  "status": "processing",
  "progress": 65,
  "completed_samples": 130,
  "total_samples": 200
}
```
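Since a run moves from `queued` through `processing`, a client typically polls this endpoint until a terminal state. A minimal polling sketch follows; the fetch function is injected so the loop is testable without a live API, and any terminal status name beyond the two shown above (e.g. `completed`) is an assumption, not documented behavior:

```python
import time

def wait_for_run(fetch_status, run_id, poll_seconds=5, max_polls=120):
    """Poll GET /evaluations/{run_id}/status until a terminal state.

    `fetch_status(run_id)` is any callable returning the decoded status
    JSON. Statuses other than "queued"/"processing" are treated as
    terminal (an assumption about the status vocabulary).
    """
    for _ in range(max_polls):
        status = fetch_status(run_id)
        if status["status"] not in ("queued", "processing"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"run {run_id} did not finish after {max_polls} polls")
```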
Dataset Management
Endpoints for managing the ground-truth data used during the testing process.
List Datasets
Returns a paginated list of all test datasets available in the workspace.
- Endpoint: GET /datasets
- Query Parameters:
  - `limit` (integer): Number of items to return (default: 20).
  - `offset` (integer): Pagination offset.
- Success Response (200 OK):

```json
{
  "datasets": [
    {
      "id": "ds_qa_01",
      "name": "General QA Benchmark",
      "sample_count": 500,
      "created_at": "2023-10-15T09:00:00Z"
    }
  ]
}
```
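The `limit`/`offset` parameters suggest a simple pagination loop. A sketch under one assumption: since the response shown does not include a total count, the loop stops when a page comes back shorter than `limit`. The fetch function is injected to keep the example self-contained:

```python
def iter_datasets(fetch_page, limit=20):
    """Walk GET /datasets pages until the listing is exhausted.

    `fetch_page(limit, offset)` returns the decoded response body.
    Stopping on a short page is a heuristic, since no total count
    field is documented.
    """
    offset = 0
    while True:
        page = fetch_page(limit=limit, offset=offset)
        datasets = page["datasets"]
        yield from datasets
        if len(datasets) < limit:
            return
        offset += limit
```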
Upload Test Dataset
Uploads a new set of test cases (JSON or CSV format).
- Endpoint: POST /datasets/upload
- Content-Type: multipart/form-data
- Request Params:
  - `file`: The dataset file.
  - `name` (string): The display name for the dataset.
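A sketch of describing this upload for an HTTP client; the `file`/`name` field names come from the docs above, while the dict shape and the extension check (enforcing the stated JSON/CSV formats) are illustrative assumptions. The multipart boundary is set by the client library, so no Content-Type header is fixed here:

```python
import os

BASE_URL = "https://api.supervised.ai/v1/testing"
API_KEY = "YOUR_API_KEY"  # placeholder

def build_upload_request(file_path, display_name):
    """Describe a multipart POST /datasets/upload request."""
    ext = os.path.splitext(file_path)[1].lower()
    if ext not in (".json", ".csv"):
        raise ValueError("dataset must be in JSON or CSV format")
    return {
        "method": "POST",
        "url": f"{BASE_URL}/datasets/upload",
        "headers": {"Authorization": f"Bearer {API_KEY}"},
        "file_field": {"file": file_path},      # sent as the multipart file part
        "form_fields": {"name": display_name},  # sent as a plain form field
    }
```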
Results and Analytics
Endpoints to retrieve the output of completed tests and performance benchmarks.
Get Evaluation Results
Fetches the detailed performance metrics (Accuracy, Precision, Recall, Latency) for a completed run.
- Endpoint: GET /results/{run_id}
- Success Response (200 OK):

```json
{
  "run_id": "eval_88234-x9",
  "metrics": {
    "accuracy": 0.94,
    "f1_score": 0.92,
    "avg_latency_ms": 450
  },
  "summary": "Model performed above threshold for 90% of samples."
}
```
Export Results
Generates a downloadable report of the test results.
- Endpoint: GET /results/{run_id}/export
- Query Parameters:
  - `format`: `csv` or `pdf`.
- Success Response (200 OK): Returns a binary file stream.
Internal Utilities
Note: These endpoints are intended for internal platform synchronization and may be subject to stricter rate limits.
System Health
Checks the connectivity of the testing engine.
- Endpoint: GET /health
- Role: Monitoring and load balancing.
Internal Cache Purge
Clears temporary evaluation artifacts.
- Endpoint: POST /internal/cache/clear
- Role: Maintenance of storage resources between large test batches.