Introduction to Supervised AI API
Overview
The Supervised AI Testing API serves as the bridge between model development and production readiness. It provides a standardized framework for validating model performance, ensuring data integrity, and automating regression testing within the Supervised AI ecosystem.
This API allows developers to programmatically trigger evaluation pipelines, compare model versions, and retrieve detailed diagnostic metrics. By integrating this API into your CI/CD workflows, you can enforce quality gates that prevent suboptimal models from reaching deployment.
Key Capabilities
- Automated Evaluation: Trigger comprehensive test suites against specific datasets or model endpoints.
- Performance Benchmarking: Programmatically compare current model outputs against "Gold Standard" datasets.
- Regression Testing: Ensure that new iterations of a model maintain accuracy on critical edge cases.
- Metric Retrieval: Fetch granular performance data including precision, recall, F1 scores, and custom business logic metrics.
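As a concrete illustration of the quality-gate idea mentioned above, the sketch below checks a Metric Report summary against a minimum accuracy threshold. The report shape mirrors the example response later in this document; the threshold value and function name are illustrative assumptions, not part of the API.

```python
def enforce_quality_gate(report: dict, min_accuracy: float = 0.9) -> bool:
    """Return True if the Metric Report summary meets the accuracy bar.

    `report` follows the summary shape shown in this document; the
    0.9 default threshold is an illustrative assumption.
    """
    summary = report.get("summary", {})
    return summary.get("accuracy", 0.0) >= min_accuracy


# A CI step could fail the build when the gate does not pass.
report = {"summary": {"accuracy": 0.942, "status": "PASS"}}
gate_passed = enforce_quality_gate(report, min_accuracy=0.9)
```

In a CI/CD pipeline, a falsy return value here would typically translate into a non-zero exit code so the deployment stage is blocked.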
Core Concepts
To effectively use the Testing API, it is important to understand the following primary entities:
| Entity | Description |
| :--- | :--- |
| Test Suite | A collection of test cases and configurations used to evaluate a model's performance. |
| Test Run | A single execution of a Test Suite. Each run generates a unique set of results and logs. |
| Evaluation Dataset | The specific subset of data used as the ground truth during the testing process. |
| Metric Report | The structured output of a Test Run, containing raw data and calculated performance scores. |
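The relationships between these entities can be sketched with simple data classes: a Test Run references exactly one Test Suite, and a Metric Report belongs to exactly one Test Run. The class and field names below are illustrative, not part of the API schema.

```python
from dataclasses import dataclass, field


@dataclass
class TestSuite:
    """A collection of test cases and configuration for evaluating a model."""
    suite_id: str
    test_case_ids: list


@dataclass
class TestRun:
    """A single execution of a Test Suite; each run has its own results."""
    run_id: str
    suite_id: str  # the one Test Suite this run executes
    status: str = "PENDING"


@dataclass
class MetricReport:
    """Structured output of a Test Run: raw data plus calculated scores."""
    run_id: str  # the one Test Run this report describes
    summary: dict = field(default_factory=dict)
```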
Basic Usage
The Testing API follows RESTful principles. Most interactions involve defining a test configuration and submitting it to the execution engine.
Triggering a Test Run
To initiate a new evaluation, send a POST request to the test execution endpoint. You must specify the model identifier and the dataset to be used.
Endpoint: POST /v1/tests/run
Input Parameters:
| Parameter | Type | Required | Description |
| :--- | :--- | :--- | :--- |
| model_id | string | Yes | The unique identifier of the model version to test. |
| suite_id | string | Yes | The ID of the predefined test suite to execute. |
| callback_url | string | No | A URL to receive a webhook notification once the test completes. |
Example Request:
```json
{
  "model_id": "ner-model-v2.4",
  "suite_id": "production-validation-set",
  "callback_url": "https://api.yourdomain.com/hooks/testing"
}
```
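A minimal client sketch for submitting this request is shown below, using only the Python standard library. The base URL is a placeholder, and the validation mirrors the required/optional parameters in the table above; sending the request is left as a comment since it requires a live endpoint.

```python
import json
import urllib.request
from typing import Optional

API_BASE = "https://api.example.com/v1"  # placeholder; substitute your API host


def build_test_run_request(model_id: str, suite_id: str,
                           callback_url: Optional[str] = None) -> urllib.request.Request:
    """Build the POST /v1/tests/run request, enforcing required parameters."""
    if not model_id or not suite_id:
        raise ValueError("model_id and suite_id are required")
    payload = {"model_id": model_id, "suite_id": suite_id}
    if callback_url:  # optional webhook notification target
        payload["callback_url"] = callback_url
    return urllib.request.Request(
        f"{API_BASE}/tests/run",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_test_run_request("ner-model-v2.4", "production-validation-set")
# urllib.request.urlopen(req) would submit the run (network call omitted here).
```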
Retrieving Results
Once a Test Run has moved to a COMPLETED state, you can retrieve the Metric Report to analyze performance.
Endpoint: GET /v1/results/{run_id}
Example Response:
```json
{
  "run_id": "tr_987654321",
  "status": "COMPLETED",
  "summary": {
    "accuracy": 0.942,
    "latency_ms": 120,
    "status": "PASS"
  },
  "detailed_metrics": [
    {
      "category": "edge-cases",
      "score": 0.88,
      "passed": true
    }
  ]
}
```
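A client typically verifies the COMPLETED state before reading scores, then scans `detailed_metrics` for failing categories. The helper below is a sketch of that pattern against the response shape above; field names beyond that example are not guaranteed.

```python
import json


def summarize_report(raw: str) -> dict:
    """Parse a Metric Report JSON string and list failing categories.

    Raises if the run has not reached the COMPLETED state, since scores
    are only meaningful once the run is finished.
    """
    report = json.loads(raw)
    if report.get("status") != "COMPLETED":
        raise RuntimeError(f"run {report.get('run_id')} is not finished")
    failing = [m["category"]
               for m in report.get("detailed_metrics", [])
               if not m.get("passed", False)]
    return {
        "run_id": report["run_id"],
        "accuracy": report["summary"]["accuracy"],
        "failing_categories": failing,
    }


# Applied to the example response shown above:
example = json.dumps({
    "run_id": "tr_987654321",
    "status": "COMPLETED",
    "summary": {"accuracy": 0.942, "latency_ms": 120, "status": "PASS"},
    "detailed_metrics": [{"category": "edge-cases", "score": 0.88, "passed": True}],
})
result = summarize_report(example)
```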
Internal Components
While the majority of the Testing API is exposed via the public endpoints described above, the repository contains internal modules responsible for environment isolation and raw log processing.
- Test Orchestrator (Internal): Handles the lifecycle of a test execution, from provisioning resources to cleanup.
- Result Parser (Internal): Normalizes output from various model types into the standardized JSON format used by the API.
Users do not interact with these components directly; however, the components underpin the consistency and reliability of the data returned by the public interface.