Core Data Objects
Overview
The testing-api uses a set of standardized data objects to ensure consistency across the Supervised AI platform. These objects define how test data is structured, how models are evaluated, and how results are reported.
TestSet
The TestSet is the primary container for evaluation data. It bundles individual test cases with metadata the platform uses to identify the version and purpose of the evaluation.
| Property | Type | Description |
| :--- | :--- | :--- |
| id | string | A unique identifier for the test set. |
| name | string | A human-readable name for the dataset. |
| version | string | Semantic versioning string (e.g., 1.0.2). |
| cases | Array<TestCase> | A list of individual test instances. |
Usage Example:
```json
{
  "id": "ts_98765",
  "name": "Customer Intent Classification - Production",
  "version": "2.1.0",
  "cases": [...]
}
```
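The TestSet shape above can be modeled as TypeScript interfaces. This is a sketch based on the property table; the official platform typings may differ, and the TestCase shape is detailed in the next section.

```typescript
// Sketch of the TestCase shape (detailed in the next section).
interface TestCase {
  caseId: string;
  input: object;
  expectedOutput: unknown;
  metadata?: Record<string, unknown>;
}

// Sketch of the TestSet shape from the table above.
interface TestSet {
  id: string;
  name: string;   // human-readable dataset name
  version: string; // semantic version string, e.g. "2.1.0"
  cases: TestCase[];
}

const testSet: TestSet = {
  id: "ts_98765",
  name: "Customer Intent Classification - Production",
  version: "2.1.0",
  cases: [],
};
```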
TestCase
The TestCase represents a single unit of evaluation. It contains the input payload sent to the AI model and the ground truth used for validation.
| Property | Type | Description |
| :--- | :--- | :--- |
| caseId | string | Unique ID for the specific test case. |
| input | Object | The data object sent to the model (e.g., prompt, image URL, or features). |
| expectedOutput | any | The target ground truth used to calculate accuracy/performance. |
| metadata | Record<string, any> | (Optional) Key-value pairs for filtering results (e.g., difficulty: "high"). |
Usage Example:
```json
{
  "caseId": "case_001",
  "input": {
    "text": "How do I reset my password?"
  },
  "expectedOutput": "account_security_intent",
  "metadata": {
    "category": "security",
    "priority": 1
  }
}
```
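Because metadata is a free-form record, it can drive result filtering as the table suggests. The helper below is a hypothetical illustration of that pattern, not a platform API:

```typescript
interface TestCase {
  caseId: string;
  input: object;
  expectedOutput: unknown;
  metadata?: Record<string, unknown>;
}

// Hypothetical helper: keep only cases whose metadata matches every
// key/value pair in the given criteria.
function filterCases(
  cases: TestCase[],
  criteria: Record<string, unknown>
): TestCase[] {
  return cases.filter((c) =>
    Object.entries(criteria).every(([key, value]) => c.metadata?.[key] === value)
  );
}

const cases: TestCase[] = [
  {
    caseId: "case_001",
    input: { text: "How do I reset my password?" },
    expectedOutput: "account_security_intent",
    metadata: { category: "security", priority: 1 },
  },
  {
    caseId: "case_002",
    input: { text: "Where is my order?" },
    expectedOutput: "order_status_intent",
    metadata: { category: "orders", priority: 2 },
  },
];

const securityCases = filterCases(cases, { category: "security" });
// securityCases contains only case_001
```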
ModelResponse
When a model is invoked via the testing API, it must return a ModelResponse. This object normalizes the output of various AI models into a format the evaluation engine can parse.
| Property | Type | Description |
| :--- | :--- | :--- |
| rawOutput | any | The direct response from the model provider. |
| parsedOutput | string \| number | The extracted value used for comparison against expectedOutput. |
| latencyMs | number | Time taken for the model to respond in milliseconds. |
| tokensUsed | number | (Optional) Total tokens consumed for LLM-based evaluations. |
Usage Example:
```typescript
const response: ModelResponse = {
  rawOutput: { choices: [{ message: "The user wants to reset password" }] },
  parsedOutput: "account_security_intent",
  latencyMs: 145,
  tokensUsed: 42
};
```
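An adapter that times a model call and normalizes its output into a ModelResponse might look like the sketch below. The adapter function, the `parse` callback, and the `usage.total_tokens` field on the raw response are illustrative assumptions, not part of the documented API:

```typescript
interface ModelResponse {
  rawOutput: any;
  parsedOutput: string | number;
  latencyMs: number;
  tokensUsed?: number;
}

// Hypothetical adapter: invokes a model, measures latency, and normalizes
// the provider-specific response into a ModelResponse.
async function invokeAndNormalize(
  callModel: () => Promise<any>,
  parse: (raw: any) => string | number
): Promise<ModelResponse> {
  const start = Date.now();
  const raw = await callModel();
  const latencyMs = Date.now() - start;
  return {
    rawOutput: raw,
    parsedOutput: parse(raw),
    latencyMs,
    tokensUsed: raw?.usage?.total_tokens, // undefined for non-LLM providers
  };
}
```

Keeping `rawOutput` alongside `parsedOutput` lets the evaluation engine compare a single extracted value while preserving the full provider response for debugging.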
EvaluationResult
After running a TestCase, the API generates an EvaluationResult. This object records the success or failure of the test based on the configured metrics.
| Property | Type | Description |
| :--- | :--- | :--- |
| caseId | string | Reference to the original test case. |
| status | enum | Result status: PASSED, FAILED, or ERROR. |
| score | number | A normalized score between 0.0 and 1.0. |
| reason | string | (Optional) Explanation for failures or discrepancy details. |
Usage Example:
```json
{
  "caseId": "case_001",
  "status": "PASSED",
  "score": 1.0,
  "reason": "Exact match found between parsed output and expected intent."
}
```
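The relationship between score, threshold, and status can be sketched as follows. The helper and its at-or-above-threshold semantics are assumptions for illustration; the platform's actual pass/fail rule may differ:

```typescript
type Status = "PASSED" | "FAILED" | "ERROR";

interface EvaluationResult {
  caseId: string;
  status: Status;
  score: number; // normalized to [0.0, 1.0]
  reason?: string;
}

// Hypothetical helper: map a computed score to a result, treating scores
// at or above the threshold as PASSED (the >= semantics are an assumption).
function toResult(
  caseId: string,
  score: number,
  threshold: number
): EvaluationResult {
  const passed = score >= threshold;
  return {
    caseId,
    status: passed ? "PASSED" : "FAILED",
    score,
    reason: passed ? undefined : `Score ${score} below threshold ${threshold}`,
  };
}
```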
MetricConfiguration
The MetricConfiguration object allows users to define how the API should calculate "correctness" for a TestSet.
| Property | Type | Description |
| :--- | :--- | :--- |
| metricType | string | The algorithm to use (e.g., EXACT_MATCH, FUZZY_MATCH, COSINE_SIMILARITY). |
| threshold | number | The minimum score required for a PASSED status. |
| caseSensitive | boolean | (Optional) Whether string comparisons are case-sensitive. Defaults to false. |
Usage Example:
```json
{
  "metricType": "COSINE_SIMILARITY",
  "threshold": 0.85,
  "caseSensitive": false
}
```
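For the simplest metric type, EXACT_MATCH, the caseSensitive flag might be applied as in the sketch below. This scorer is an illustration of the configuration's intent, not the platform's implementation:

```typescript
interface MetricConfiguration {
  metricType: string;
  threshold: number;
  caseSensitive?: boolean; // defaults to false
}

// Hypothetical scorer for EXACT_MATCH: returns 1.0 on a match, 0.0 otherwise.
// When caseSensitive is false (or omitted), comparison ignores case.
function exactMatchScore(
  parsed: string,
  expected: string,
  config: MetricConfiguration
): number {
  const a = config.caseSensitive ? parsed : parsed.toLowerCase();
  const b = config.caseSensitive ? expected : expected.toLowerCase();
  return a === b ? 1.0 : 0.0;
}

const config: MetricConfiguration = { metricType: "EXACT_MATCH", threshold: 1.0 };
const score = exactMatchScore(
  "Account_Security_Intent",
  "account_security_intent",
  config
);
// score is 1.0 because case is ignored by default
```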