Core Data Types
This section defines the fundamental data structures used to interact with the Supervised AI testing API. These types ensure consistency across model evaluations, dataset management, and reporting.
Primitive Types and Enums
TestStatus
Represents the current lifecycle state of a testing job.
| Value | Description |
| :--- | :--- |
| PENDING | The test has been queued but not yet picked up by a worker. |
| RUNNING | The test is currently being evaluated against the model. |
| COMPLETED | The test finished successfully and results are available. |
| FAILED | An error occurred during execution. |
MetricType
Standard identifiers for automated evaluation metrics.
| Value | Description |
| :--- | :--- |
| ACCURACY | Percentage of correct predictions. |
| LATENCY | Response time of the model in milliseconds. |
| F1_SCORE | Weighted average of precision and recall. |
| TOKEN_USAGE | Count of tokens consumed (specific to LLMs). |
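Client code often mirrors these enums locally so that serialized values can be validated before use. A minimal sketch in Python (the class names follow the type names above; no SDK is assumed):

```python
from enum import Enum

class TestStatus(str, Enum):
    """Lifecycle state of a testing job (mirrors the table above)."""
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"

class MetricType(str, Enum):
    """Identifiers for automated evaluation metrics."""
    ACCURACY = "ACCURACY"
    LATENCY = "LATENCY"
    F1_SCORE = "F1_SCORE"
    TOKEN_USAGE = "TOKEN_USAGE"

# Subclassing str lets the values round-trip through JSON unchanged.
status = TestStatus("COMPLETED")
print(status is TestStatus.COMPLETED)  # True
```

Because the members are also strings, they compare equal to the raw values returned by the API, so no explicit conversion layer is needed.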
Complex Objects
TestInput
The TestInput object represents the payload sent to your model for evaluation.
| Field | Type | Description |
| :--- | :--- | :--- |
| id | string | A unique identifier for the specific test case. |
| payload | object | The raw data (JSON) to be processed by the model. |
| metadata | object | (Optional) Key-value pairs for filtering and categorization. |
Example Usage:
```json
{
  "id": "case_001",
  "payload": {
    "prompt": "Summarize the following text...",
    "temperature": 0.7
  },
  "metadata": {
    "tier": "enterprise",
    "region": "us-east-1"
  }
}
```
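A TestInput can be modeled client-side as a simple dataclass and serialized into the JSON shape shown above. This is a local convenience sketch, not a type shipped by the API:

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

@dataclass
class TestInput:
    id: str                          # unique identifier for the test case
    payload: dict                    # raw JSON data processed by the model
    metadata: Optional[dict] = None  # optional filtering/categorization tags

case = TestInput(
    id="case_001",
    payload={"prompt": "Summarize the following text...", "temperature": 0.7},
    metadata={"tier": "enterprise", "region": "us-east-1"},
)

# asdict() produces the wire format from the example above.
print(json.dumps(asdict(case), indent=2))
```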
EvaluationFrame
The bridge between input data and the expected output (Ground Truth). Use this when defining datasets for supervised testing.
| Field | Type | Description |
| :--- | :--- | :--- |
| input | TestInput | The data to be sent to the model. |
| expected_output | string \| object | The reference answer or "ground truth". |
| weight | number | The importance of this specific frame (default: 1.0). |
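The default weight of 1.0 means an unweighted frame counts once toward aggregate scores; heavier frames count proportionally more. A sketch of a small dataset, using illustrative dataclasses rather than the SDK's own types:

```python
from dataclasses import dataclass
from typing import Optional, Union

@dataclass
class TestInput:
    id: str
    payload: dict
    metadata: Optional[dict] = None

@dataclass
class EvaluationFrame:
    input: TestInput
    expected_output: Union[str, dict]  # the ground-truth reference
    weight: float = 1.0                # default per the table above

frames = [
    EvaluationFrame(TestInput("case_001", {"prompt": "2 + 2 = ?"}), "4"),
    EvaluationFrame(TestInput("case_002", {"prompt": "Capital of France?"}),
                    "Paris", weight=2.0),  # counts double in aggregates
]

# Total weight governs the denominator in weighted averages.
print(sum(f.weight for f in frames))  # 3.0
```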
MetricResult
The output generated by the evaluation engine for a specific metric.
| Field | Type | Description |
| :--- | :--- | :--- |
| name | MetricType | The metric being reported. |
| value | float | The numerical score calculated. |
| threshold_passed | boolean | Whether the score met the pre-defined success criteria. |
Example Usage:
```json
{
  "name": "LATENCY",
  "value": 145.2,
  "threshold_passed": true
}
```
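The example above reports a latency of 145.2 ms as passing, which implies the threshold comparison direction depends on the metric. A hedged sketch of that logic (the helper and the pass/fail direction are assumptions, not documented API behavior):

```python
from dataclasses import dataclass

@dataclass
class MetricResult:
    name: str               # a MetricType value
    value: float            # the calculated numerical score
    threshold_passed: bool  # whether the success criteria were met

# Hypothetical helper: assumes lower is better for LATENCY and
# higher is better for score-style metrics such as ACCURACY.
def score(name: str, value: float, threshold: float) -> MetricResult:
    passed = value <= threshold if name == "LATENCY" else value >= threshold
    return MetricResult(name, value, passed)

result = score("LATENCY", 145.2, threshold=200.0)
print(result.threshold_passed)  # True
```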
API Response Objects
TestReport
The top-level object returned when querying the results of a testing suite.
| Field | Type | Description |
| :--- | :--- | :--- |
| test_id | UUID | Unique identifier for the test run. |
| status | TestStatus | Current state of the run. |
| summary | object | Aggregated scores (e.g., mean accuracy). |
| results | Array<MetricResult> | Detailed breakdown of individual metrics. |
| created_at | ISO8601 | Timestamp of initiation. |
Example Response:
```json
{
  "test_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "COMPLETED",
  "summary": {
    "total_cases": 100,
    "passed": 98
  },
  "results": [
    {
      "name": "ACCURACY",
      "value": 0.98,
      "threshold_passed": true
    }
  ],
  "created_at": "2023-10-27T10:00:00Z"
}
```
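A consumer of this response typically checks the run's status before reading results. A minimal parsing sketch over the example payload (the traversal logic is illustrative; the field names come from the table above):

```python
import json

raw = """
{
  "test_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "COMPLETED",
  "summary": {"total_cases": 100, "passed": 98},
  "results": [
    {"name": "ACCURACY", "value": 0.98, "threshold_passed": true}
  ],
  "created_at": "2023-10-27T10:00:00Z"
}
"""

report = json.loads(raw)
if report["status"] == "COMPLETED":
    # Collect any metrics that missed their success criteria.
    failing = [r["name"] for r in report["results"]
               if not r["threshold_passed"]]
    print(f"{report['summary']['passed']}/{report['summary']['total_cases']}"
          f" cases passed; failing metrics: {failing}")
```

A report in the PENDING or RUNNING state should be polled again rather than read, since `results` may be empty or partial until the run completes.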
Internal Components
Note: These are used by the system internally to manage state but may appear in advanced configuration logs.
- WorkerNode: Represents the compute instance executing the test logic.
- DataBuffer: A transient internal stream used to pipe large datasets from storage to the evaluation engine. Avoid manual instantiation of this type.