# Data Models

## Core Entity Models
The testing-api uses a structured set of data models to ensure consistency across the Supervised AI platform. These entities define how tests are configured, how datasets are structured, and how evaluation results are reported.
### Evaluation
The Evaluation model is the primary object representing a testing session. It encapsulates the configuration, the target model details, and the dataset being used for the assessment.
| Property | Type | Description |
| :--- | :--- | :--- |
| id | string | Unique identifier for the evaluation. |
| name | string | Human-readable name for the test run. |
| status | enum | Current state: PENDING, RUNNING, COMPLETED, FAILED. |
| config | EvaluationConfig | Parameters governing the execution of the test. |
| createdAt | timestamp | ISO 8601 timestamp of creation. |
```json
{
  "id": "eval_882910",
  "name": "LLM-Sentiment-Benchmark-v1",
  "status": "COMPLETED",
  "config": {
    "threshold": 0.85,
    "retryCount": 3
  }
}
```
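The table above can be sketched as a TypeScript interface. This is a hypothetical sketch, not a published SDK type: the `EvaluationConfig` fields are inferred from the JSON example, and the `isFinished` helper is purely illustrative.

```typescript
// Hypothetical sketch of the Evaluation model; field names mirror the
// table above, and EvaluationConfig fields come from the JSON example.
type EvaluationStatus = "PENDING" | "RUNNING" | "COMPLETED" | "FAILED";

interface EvaluationConfig {
  threshold: number;  // success criterion, e.g. 0.85
  retryCount: number; // retries per test case
}

interface Evaluation {
  id: string;
  name: string;
  status: EvaluationStatus;
  config: EvaluationConfig;
  createdAt: string; // ISO 8601 timestamp
}

// Illustrative helper: has the evaluation reached a terminal state?
function isFinished(e: Evaluation): boolean {
  return e.status === "COMPLETED" || e.status === "FAILED";
}
```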
### Dataset
Datasets represent the collection of inputs and expected outputs (ground truth) used to validate model performance.
| Property | Type | Description |
| :--- | :--- | :--- |
| datasetId | string | Reference to the source dataset. |
| version | string | Semantic version of the dataset used. |
| entries | Array<DataEntry> | The actual test cases (inputs and expected labels). |
**DataEntry Structure:**

```typescript
interface DataEntry {
  id: string;
  input: Record<string, any>;       // The prompt or raw data
  expectedOutput?: any;             // Ground truth for comparison
  metadata?: Record<string, any>;   // Optional context (tags, difficulty, etc.)
}
```
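As an illustration of how these shapes fit together, a dataset's entries can be indexed by `id` so that results can later be joined back to their source cases. This is a minimal sketch; the `Dataset` interface simply mirrors the table above, and `indexEntries` is a hypothetical helper.

```typescript
interface DataEntry {
  id: string;
  input: Record<string, any>;
  expectedOutput?: any;
  metadata?: Record<string, any>;
}

interface Dataset {
  datasetId: string;
  version: string;
  entries: DataEntry[];
}

// Build an id -> entry index so a TestResult can be matched back
// to its source entry via testCaseId.
function indexEntries(ds: Dataset): Map<string, DataEntry> {
  return new Map(ds.entries.map((e) => [e.id, e]));
}
```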
### ModelConfiguration
This model defines the interface for the AI model being tested. It specifies the endpoint, parameters, and authentication required to invoke the model during the evaluation process.
| Property | Type | Description |
| :--- | :--- | :--- |
| provider | string | The hosting provider (e.g., openai, anthropic, custom). |
| modelName | string | The specific model identifier. |
| parameters | object | Hyperparameters like temperature, max_tokens, or top_p. |
| endpointUrl | string | (Optional) Custom API endpoint for private deployments. |
```json
{
  "provider": "custom",
  "modelName": "supervised-ai-finetune-v2",
  "parameters": {
    "temperature": 0.2,
    "max_tokens": 512
  }
}
```
### TestResult

A TestResult object is generated for every individual DataEntry processed during an evaluation. It records the model's actual response so it can be compared against the expected output.
| Property | Type | Description |
| :--- | :--- | :--- |
| testCaseId | string | Reference to the input DataEntry. |
| actualOutput | any | The raw response received from the model. |
| scores | Map<string, number> | Calculated metrics (e.g., accuracy, latency, f1_score). |
| error | string | (Optional) Error message if the specific test case failed. |
```json
{
  "testCaseId": "case_001",
  "actualOutput": "The sentiment is positive.",
  "scores": {
    "exact_match": 1.0,
    "semantic_similarity": 0.98,
    "latency_ms": 145
  }
}
```
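For illustration, a per-case pass check might compare one of the reported scores against the evaluation's configured threshold. This is a hedged sketch: which metric drives the pass decision is an assumption, not something the API specifies, and `casePassed` is a hypothetical helper.

```typescript
interface TestResult {
  testCaseId: string;
  actualOutput: any;
  scores: Record<string, number>; // serialized as a JSON object
  error?: string;
}

// Assumption: a case passes when it produced no error and the chosen
// metric meets or exceeds the threshold from EvaluationConfig.
function casePassed(r: TestResult, metric: string, threshold: number): boolean {
  if (r.error !== undefined) return false;
  const score = r.scores[metric];
  return score !== undefined && score >= threshold;
}
```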
### EvaluationSummary

The EvaluationSummary is a high-level aggregation of results produced once an evaluation finishes. It is typically used for dashboarding and reporting within the Supervised AI platform.
| Property | Type | Description |
| :--- | :--- | :--- |
| totalCases | number | Total number of test cases processed. |
| passedCases | number | Number of cases meeting the success criteria. |
| aggregateMetrics | object | Mean, median, and percentiles for all scores. |
| duration | number | Total time taken for the evaluation in milliseconds. |
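Putting the pieces together, a summary could be derived from a list of TestResults roughly as follows. This is a sketch under stated assumptions: `summarize` is a hypothetical helper, "passed" is assumed to mean a single metric meeting the threshold, and only the mean is computed here for brevity (the real `aggregateMetrics` also carries medians and percentiles).

```typescript
interface TestResult {
  testCaseId: string;
  actualOutput: any;
  scores: Record<string, number>;
  error?: string;
}

interface EvaluationSummary {
  totalCases: number;
  passedCases: number;
  aggregateMetrics: Record<string, { mean: number }>;
  duration: number; // milliseconds
}

// Hypothetical aggregation: count passes against a single metric and
// threshold, and compute the mean of every reported score key.
function summarize(
  results: TestResult[],
  metric: string,
  threshold: number,
  duration: number
): EvaluationSummary {
  const passedCases = results.filter(
    (r) => r.error === undefined && (r.scores[metric] ?? -Infinity) >= threshold
  ).length;

  const sums: Record<string, { total: number; n: number }> = {};
  for (const r of results) {
    for (const [k, v] of Object.entries(r.scores)) {
      if (!sums[k]) sums[k] = { total: 0, n: 0 };
      sums[k].total += v;
      sums[k].n += 1;
    }
  }
  const aggregateMetrics: Record<string, { mean: number }> = {};
  for (const [k, s] of Object.entries(sums)) {
    aggregateMetrics[k] = { mean: s.total / s.n };
  }

  return { totalCases: results.length, passedCases, aggregateMetrics, duration };
}
```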