Object Data Models

The testing-api uses a structured data hierarchy to manage the lifecycle of AI model evaluation. Understanding its three core entities (Test Cases, Sessions, and Predictions) is essential for integrating with the Supervised AI testing platform.

Test Case

A Test Case is the basic unit of testing. It defines the input provided to an AI model and the ground truth (expected output) used to validate the model's response.

Schema

Example

{
  "id": "tc_001",
  "input_data": {
    "prompt": "Translate 'Hello' to French"
  },
  "expected_output": "Bonjour",
  "tags": ["translation", "v1-baseline"]
}
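
For client code, it can help to mirror this record in a typed structure. The following is a minimal Python sketch, not an official client: the class name `EvalCase` and the helper `parse_test_case` are hypothetical, the field types are inferred only from the example above, and treating `tags` as optional is an assumption.

```python
import json
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class EvalCase:
    """Hypothetical client-side mirror of a Test Case record."""
    id: str
    input_data: dict[str, Any]   # free-form model input, e.g. {"prompt": ...}
    expected_output: str         # ground truth used for validation
    tags: list[str] = field(default_factory=list)  # assumed optional


def parse_test_case(raw: str) -> EvalCase:
    """Build an EvalCase from a JSON string shaped like the example above."""
    data = json.loads(raw)
    return EvalCase(
        id=data["id"],
        input_data=data["input_data"],
        expected_output=data["expected_output"],
        tags=data.get("tags", []),
    )


example = """
{
  "id": "tc_001",
  "input_data": {"prompt": "Translate 'Hello' to French"},
  "expected_output": "Bonjour",
  "tags": ["translation", "v1-baseline"]
}
"""
case = parse_test_case(example)
print(case.id, case.tags)
```

A missing required key raises `KeyError` here, which surfaces malformed records early instead of letting them reach an evaluation run.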

Session

A Session is a logical grouping of test executions. It tracks the context of a specific test run, such as the model version being evaluated and the environment in which the tests run.

Schema

Example

{
  "session_id": "sess_88291",
  "model_id": "gpt-4-turbo",
  "environment": "staging",
  "status": "completed",
  "created_at": "2023-10-27T10:00:00Z"
}
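
A Session record like the one above could be loaded as follows. This is a sketch under assumptions: the `Session` class and both helper functions are hypothetical names, and the field types are inferred from the example alone. `status` and `environment` are kept as plain strings because the full set of allowed values is not listed here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Session:
    """Hypothetical client-side mirror of a Session record."""
    session_id: str
    model_id: str
    environment: str
    status: str
    created_at: datetime


def parse_timestamp(value: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as 2023-10-27T10:00:00Z."""
    # Python versions before 3.11 reject a trailing "Z" in fromisoformat,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def session_from_dict(data: dict) -> Session:
    """Build a Session from a dict shaped like the example above."""
    return Session(
        session_id=data["session_id"],
        model_id=data["model_id"],
        environment=data["environment"],
        status=data["status"],
        created_at=parse_timestamp(data["created_at"]),
    )


session = session_from_dict({
    "session_id": "sess_88291",
    "model_id": "gpt-4-turbo",
    "environment": "staging",
    "status": "completed",
    "created_at": "2023-10-27T10:00:00Z",
})
print(session.created_at.isoformat())  # 2023-10-27T10:00:00+00:00
```

Parsing `created_at` into a timezone-aware `datetime` makes it safe to compare or sort sessions without guessing the offset.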

Prediction

A Prediction is the output generated by the model for a specific Test Case within a Session. It links the raw model output to the evaluation metrics computed for that output.

Schema

Example

{
  "prediction_id": "pred_4452",
  "test_case_id": "tc_001",
  "session_id": "sess_88291",
  "actual_output": "Bonjour",
  "latency_ms": 120,
  "metrics": {
    "exact_match": true,
    "confidence": 0.98
  }
}
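
To illustrate how a Prediction relates model output to a metric, the sketch below rebuilds the record above and recomputes `exact_match` against the expected output of test case `tc_001`. The `Prediction` class and the `exact_match` function are hypothetical, and strict case-sensitive string equality is an assumed scoring rule; the platform's actual rule is not specified here.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass(frozen=True)
class Prediction:
    """Hypothetical client-side mirror of a Prediction record."""
    prediction_id: str
    test_case_id: str   # references a Test Case id
    session_id: str     # references a Session session_id
    actual_output: str
    latency_ms: int
    metrics: dict[str, Any] = field(default_factory=dict)


def exact_match(actual: str, expected: str) -> bool:
    """Assumed scoring rule: strict, case-sensitive string equality."""
    return actual == expected


pred = Prediction(
    prediction_id="pred_4452",
    test_case_id="tc_001",
    session_id="sess_88291",
    actual_output="Bonjour",
    latency_ms=120,
    metrics={"exact_match": True, "confidence": 0.98},
)

# Expected output of test case tc_001, taken from the Test Case example.
expected_output = "Bonjour"

recomputed = exact_match(pred.actual_output, expected_output)
consistent = recomputed == pred.metrics["exact_match"]
print(recomputed, consistent)
```

A check like this can flag records whose stored metric disagrees with a local recomputation, provided the local rule matches the one the platform applies.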

Relationships

Session (1:N) Prediction: A single session contains multiple predictions.
Test Case (1:N) Prediction: A single test case can be reused across multiple sessions, resulting in multiple prediction records over time.
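
Both one-to-many relationships can be followed through the `session_id` and `test_case_id` fields on each Prediction. The sketch below groups prediction IDs by either key. The helper `group_by` is a hypothetical name, and every record except `pred_4452` is made-up illustration data.

```python
from collections import defaultdict

predictions = [
    {"prediction_id": "pred_4452", "test_case_id": "tc_001", "session_id": "sess_88291"},
    # Hypothetical records added for illustration only.
    {"prediction_id": "pred_4453", "test_case_id": "tc_002", "session_id": "sess_88291"},
    {"prediction_id": "pred_5001", "test_case_id": "tc_001", "session_id": "sess_88300"},
]


def group_by(records: list[dict], key: str) -> dict[str, list[str]]:
    """Map each distinct value of `key` to the prediction IDs that carry it."""
    groups: dict[str, list[str]] = defaultdict(list)
    for record in records:
        groups[record[key]].append(record["prediction_id"])
    return dict(groups)


# Session (1:N) Prediction: one session, many predictions.
by_session = group_by(predictions, "session_id")

# Test Case (1:N) Prediction: one test case, reused across sessions.
by_test_case = group_by(predictions, "test_case_id")

print(by_session["sess_88291"])   # ['pred_4452', 'pred_4453']
print(by_test_case["tc_001"])     # ['pred_4452', 'pred_5001']
```

Grouping by `test_case_id` is what allows the same test case to be compared across sessions, for example to track a model version against an earlier baseline.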