Object Definitions & Types
Core Object Definitions
The testing-api utilizes a set of standardized objects to ensure consistency across the Supervised AI platform. These objects represent the core entities involved in the lifecycle of an AI test—from dataset definition to performance evaluation.
TestSet
A TestSet is a curated collection of TestCase objects. It serves as the primary container for benchmarking a specific version of a model or agent.
| Field | Type | Description |
| :--- | :--- | :--- |
| id | string (UUID) | Unique identifier for the TestSet. |
| name | string | A human-readable label for the test collection. |
| version | string | Semantic versioning string (e.g., "1.0.4"). |
| metadata | object | Key-value pairs for custom categorization (e.g., environment: "staging"). |
| cases | array[TestCase] | List of individual test instances. |
Example Usage:
```json
{
  "id": "ts_f47ac10b-58cc",
  "name": "Production Sentiment Benchmarks",
  "version": "2.1.0",
  "metadata": {
    "model_type": "LLM",
    "priority": "high"
  }
}
```
TestCase
A TestCase represents a single unit of evaluation. It defines the input provided to the AI and the expected ground truth or constraints for the output.
| Field | Type | Description |
| :--- | :--- | :--- |
| case_id | string | Unique identifier within the TestSet. |
| input_data | object | The payload sent to the model (e.g., a prompt or feature vector). |
| expected_output | any | The gold-standard answer or target result. |
| weight | float | Relative importance of this test case (default: 1.0). |
Example Usage:
```json
{
  "case_id": "case_001",
  "input_data": {
    "prompt": "Summarize the following text: [Text Content]"
  },
  "expected_output": "This text discusses the impact of AI on writing.",
  "weight": 1.5
}
```
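Client code often assembles these payloads programmatically. The sketch below builds a TestSet containing one TestCase using Python dataclasses; the class and field names mirror the tables above, but the helper itself (and the `id` format, modeled on the example) is an illustrative assumption rather than an official SDK:

```python
from dataclasses import dataclass, field, asdict
from typing import Any
import uuid

@dataclass
class TestCase:
    case_id: str
    input_data: dict
    expected_output: Any
    weight: float = 1.0  # relative importance; defaults to 1.0 per the field table

@dataclass
class TestSet:
    name: str
    version: str
    metadata: dict = field(default_factory=dict)
    cases: list = field(default_factory=list)
    # illustrative id format, loosely modeled on the "ts_..." example above
    id: str = field(default_factory=lambda: "ts_" + uuid.uuid4().hex[:12])

suite = TestSet(
    name="Production Sentiment Benchmarks",
    version="2.1.0",
    metadata={"model_type": "LLM", "priority": "high"},
)
suite.cases.append(TestCase(
    case_id="case_001",
    input_data={"prompt": "Summarize the following text: [Text Content]"},
    expected_output="This text discusses the impact of AI on writing.",
    weight=1.5,
))

# asdict() recurses into nested dataclasses, yielding a JSON-serializable dict
payload = asdict(suite)
```

`payload` can then be serialized with `json.dumps` and sent to whatever endpoint accepts TestSet submissions.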
EvaluationResult
This object is returned after a model processes a TestCase. It contains the actual model response and the platform's calculated performance metrics.
| Field | Type | Description |
| :--- | :--- | :--- |
| case_id | string | Reference to the original TestCase. |
| actual_output | any | The raw response generated by the model. |
| status | enum | The result status: passed, failed, or error. |
| score | float | Normalized similarity or accuracy score (0.0 to 1.0). |
| latency_ms | integer | Time taken for the model to respond in milliseconds. |
Example Usage:
```json
{
  "case_id": "case_001",
  "actual_output": "The text explores how AI affects the writing process.",
  "status": "passed",
  "score": 0.92,
  "latency_ms": 450
}
```
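Since each TestCase carries a `weight`, a natural way to roll EvaluationResult records up into a suite-level number is a weighted mean of the scores. The aggregation below is a minimal sketch of that idea, not a documented platform formula:

```python
def aggregate_results(results, weights):
    """Summarize EvaluationResult records.

    `results` is a list of EvaluationResult dicts; `weights` maps
    case_id -> TestCase.weight (1.0 when a case_id is absent).
    """
    total_w = sum(weights.get(r["case_id"], 1.0) for r in results)
    weighted_score = sum(
        r["score"] * weights.get(r["case_id"], 1.0) for r in results
    ) / total_w
    pass_rate = sum(r["status"] == "passed" for r in results) / len(results)
    return {"weighted_score": weighted_score, "pass_rate": pass_rate}

results = [
    {"case_id": "case_001", "status": "passed", "score": 0.92, "latency_ms": 450},
    {"case_id": "case_002", "status": "failed", "score": 0.40, "latency_ms": 610},
]
summary = aggregate_results(results, {"case_001": 1.5})
# weighted_score = (0.92 * 1.5 + 0.40 * 1.0) / 2.5 = 0.712
```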
Data Types & Constraints
The following standard types are used across all API endpoints to ensure data integrity.
MetricTypes
When configuring evaluations, you must specify the metric type used for scoring the actual_output against the expected_output.
- EXACT_MATCH: Binary comparison (1.0 for identical, 0.0 otherwise).
- FUZZY_MATCH: String similarity scoring (Levenshtein distance).
- SEMANTIC_SIMILARITY: Vector-based similarity using embeddings.
- REGEX_VALIDATION: Validates output against a provided pattern.
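The three string-based metrics can be sketched locally as follows. This is an illustration of the scoring semantics, not the platform's actual implementation; in particular, normalizing Levenshtein distance by the longer string's length is an assumption, and SEMANTIC_SIMILARITY is omitted because it requires an embedding model:

```python
import re

def exact_match(expected: str, actual: str) -> float:
    # EXACT_MATCH: binary comparison
    return 1.0 if expected == actual else 0.0

def levenshtein(a: str, b: str) -> int:
    # classic dynamic-programming edit distance
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def fuzzy_match(expected: str, actual: str) -> float:
    # FUZZY_MATCH: edit distance normalized to [0.0, 1.0] (assumed normalization)
    if not expected and not actual:
        return 1.0
    return 1.0 - levenshtein(expected, actual) / max(len(expected), len(actual))

def regex_validation(pattern: str, actual: str) -> float:
    # REGEX_VALIDATION: pass/fail against a provided pattern
    return 1.0 if re.fullmatch(pattern, actual) else 0.0
```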
ModelConfiguration
Used to define the parameters of the model being tested.
| Field | Type | Default | Description |
| :--- | :--- | :--- | :--- |
| model_name | string | N/A | The specific model identifier (e.g., "gpt-4"). |
| temperature | float | 0.7 | Controls randomness in the output. |
| max_tokens | integer | 512 | The maximum length of the generated response. |
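An illustrative ModelConfiguration payload (the values here are examples; only model_name has no default, so it is the only field you must supply):

```json
{
  "model_name": "gpt-4",
  "temperature": 0.2,
  "max_tokens": 1024
}
```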
Internal Utility Objects
While the following objects are primarily used internally for platform orchestration, they may appear in debug logs or advanced configuration headers.
TraceContext (Internal)
Used to track a single request across the Supervised AI microservices architecture.
- Role: Facilitates distributed tracing.
- Usage: Should not be modified by the user; however, it can be passed in headers for correlation in enterprise support tickets.
ValidationSchema (Internal)
An internal JSON Schema used to validate incoming TestSet payloads before they are persisted to the database. If a request returns a 422 Unprocessable Entity error, the response will reference the ValidationSchema violation.
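A 422 response might resemble the sketch below. The field names (`error`, `violations`, `path`, `message`) are illustrative assumptions for orientation only, not a documented response contract:

```json
{
  "error": "Unprocessable Entity",
  "status": 422,
  "violations": [
    {
      "path": "cases[0].weight",
      "message": "expected a number, got a string"
    }
  ]
}
```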