Data Modeling
Overview of Data Entities
In the Supervised AI platform, data modeling is centered around three primary entities: Test Cases, Supervision Results, and Outcomes. The API follows a structured schema to ensure that AI model evaluations are consistent, measurable, and traceable across different versions and environments.
Modeling Supervision Results
A SupervisionResult represents the granular data captured when an AI model's output is evaluated—either by an automated system or a human supervisor. This entity bridges the gap between raw model predictions and the final quality score.
Schema Definition
| Field | Type | Description |
| :--- | :--- | :--- |
| result_id | string | Unique identifier for the supervision event. |
| model_version | string | The specific version of the AI model being tested. |
| input_payload | object | The original data sent to the model. |
| model_output | object | The raw response generated by the model. |
| score | float | A normalized value (0.0 to 1.0) indicating accuracy or quality. |
| feedback | string | Optional qualitative notes from the supervisor. |
| metadata | object | Key-value pairs for custom attributes (e.g., latency, region). |
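The schema above can be sketched as a Python dataclass. Field names mirror the table; the range check in `__post_init__` follows the stated 0.0–1.0 normalization of `score` (the exact validation behavior is an illustrative assumption, not a platform guarantee):

```python
from dataclasses import dataclass, field

@dataclass
class SupervisionResult:
    """One supervision event for a single model output."""
    result_id: str
    model_version: str
    input_payload: dict
    model_output: dict
    score: float
    feedback: str = ""                            # optional qualitative notes
    metadata: dict = field(default_factory=dict)  # custom attributes (latency, region, ...)

    def __post_init__(self):
        # The schema defines score as a normalized value in [0.0, 1.0].
        if not 0.0 <= self.score <= 1.0:
            raise ValueError(f"score must be in [0.0, 1.0], got {self.score}")
```

Rejecting out-of-range scores at construction time keeps downstream aggregation (averages, failure rates) trustworthy.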
Example Payload
```json
{
  "result_id": "res_88291_ax",
  "model_version": "gpt-4-turbo-v2",
  "input_payload": {
    "prompt": "Summarize the quarterly earnings report."
  },
  "model_output": {
    "summary": "The company saw a 15% increase in revenue..."
  },
  "score": 0.95,
  "feedback": "Highly accurate summary, captured all key KPIs.",
  "metadata": {
    "processing_time_ms": 450,
    "environment": "staging"
  }
}
```
Modeling Testing Outcomes
Testing Outcomes aggregate individual supervision results into a high-level report. Use this structure to define whether a specific test suite or deployment gate has met the required threshold for production.
Outcome Status Types
Outcomes are categorized using the following states:
- PASSED: All assertions met the defined thresholds.
- FAILED: One or more metrics fell below the acceptable limit.
- INCONCLUSIVE: Insufficient data or supervision results to determine quality.
- PENDING: Supervision is currently in progress (common in human-in-the-loop workflows).
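The four states map naturally onto an enum; a minimal Python sketch (the class name `OutcomeStatus` is ours, not a platform identifier):

```python
from enum import Enum

class OutcomeStatus(str, Enum):
    """Terminal and in-progress states for a testing outcome."""
    PASSED = "PASSED"              # all assertions met thresholds
    FAILED = "FAILED"              # one or more metrics below the limit
    INCONCLUSIVE = "INCONCLUSIVE"  # not enough supervision results to judge
    PENDING = "PENDING"            # supervision still in progress
```

Mixing in `str` keeps the values JSON-serializable as plain strings, matching the payloads shown in this section.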
Schema Definition
| Field | Type | Description |
| :--- | :--- | :--- |
| test_suite_id | string | Identifier for the group of tests. |
| status | enum | One of: PASSED, FAILED, INCONCLUSIVE, PENDING. |
| metrics | object | Aggregated data such as mean_accuracy, f1_score, etc. |
| timestamp | string (ISO 8601) | When the testing outcome was finalized. |
Example Usage
```json
{
  "test_suite_id": "suite_regression_v4",
  "status": "PASSED",
  "metrics": {
    "total_samples": 1000,
    "average_score": 0.88,
    "failure_rate": 0.02
  },
  "timestamp": "2023-11-01T14:30:00Z"
}
```
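Rolling individual supervision scores up into an outcome might look like the sketch below. The pass threshold and the "score below 0.5 counts as a failure" cutoff are illustrative assumptions, not platform defaults:

```python
from datetime import datetime, timezone

def aggregate_outcome(test_suite_id, scores, pass_threshold=0.8, failure_cutoff=0.5):
    """Aggregate individual supervision scores into a testing-outcome dict."""
    if not scores:
        # Insufficient data: no scores means no verdict.
        status = "INCONCLUSIVE"
        metrics = {"total_samples": 0}
    else:
        avg = sum(scores) / len(scores)
        failures = sum(1 for s in scores if s < failure_cutoff)
        metrics = {
            "total_samples": len(scores),
            "average_score": round(avg, 4),
            "failure_rate": round(failures / len(scores), 4),
        }
        status = "PASSED" if avg >= pass_threshold else "FAILED"
    return {
        "test_suite_id": test_suite_id,
        "status": status,
        "metrics": metrics,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```

For example, scores of 0.9, 0.95, and 0.4 average to 0.75, which falls short of the 0.8 threshold and yields a FAILED outcome.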
Relationships and Hierarchy
The data model follows a hierarchical structure to maintain data integrity:
- Project: The highest level container.
- Test Suite: A collection of related test cases.
- Supervision Result: The individual evaluation of a model's response to a single test case.
- Outcome: The final report generated by aggregating results within a Test Suite.
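In code, the hierarchy is a straightforward containment relationship. A minimal sketch with hypothetical container names (`Project`, `TestSuite` here are ours; only `SupervisionResult` and `Outcome` shapes come from the schemas above):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class TestSuite:
    test_suite_id: str
    results: list = field(default_factory=list)  # SupervisionResult entries
    outcome: Optional[dict] = None               # finalized Outcome, if any

@dataclass
class Project:
    name: str                                    # highest-level container
    suites: list = field(default_factory=list)   # TestSuite entries

# Build the hierarchy bottom-up: result -> suite -> project.
proj = Project("quarterly-eval")
suite = TestSuite("suite_regression_v4")
suite.results.append({"result_id": "res_88291_ax", "score": 0.95})
proj.suites.append(suite)
```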
Best Practices for Data Modeling
- Immutable Results: Once a SupervisionResult is submitted, it should be treated as immutable. If a model is re-tested, a new result ID should be generated to preserve the audit trail.
- Extensible Metadata: Use the metadata object to store platform-specific information that doesn't fit into the standard schema. This keeps the API flexible across AI domains (NLP, computer vision, etc.).
- Versioning: Always include the model_version in your data models to allow side-by-side performance comparisons over time.
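The immutability guideline can be enforced with a frozen dataclass: re-testing then produces a fresh record instead of mutating the old one. The `uuid4`-based ID scheme below is an illustrative assumption, and this trimmed class omits the full schema for brevity:

```python
import uuid
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class SupervisionResult:
    """Trimmed-down, immutable supervision record."""
    result_id: str
    model_version: str
    score: float

def retest(previous: SupervisionResult, new_score: float) -> SupervisionResult:
    """Re-testing never mutates: it yields a new record with a new ID."""
    return replace(previous, result_id=f"res_{uuid.uuid4().hex}", score=new_score)

original = SupervisionResult("res_88291_ax", "gpt-4-turbo-v2", 0.95)
rerun = retest(original, 0.91)
# 'original' is untouched; assigning to its fields raises FrozenInstanceError,
# so the audit trail of prior evaluations is preserved.
```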