Schema Design Principles
Core Philosophy
The testing-api schema is built to serve as the backbone for the Supervised AI platform's evaluation and benchmarking suite. Its primary objective is to bridge the gap between human-led quality assurance and automated AI evaluation. To achieve this, the schema follows three primary pillars: Strict Consistency, Semantic Clarity, and Machine Interpretability.
1. Semantic AI-Readability
Since this API often feeds into LLM-based evaluators or automated grading scripts, the schema avoids cryptic abbreviations. Keys are descriptive and context-rich to ensure that an AI model can parse the intent of a field without additional documentation.
- Descriptive Keys: Use `expected_output` instead of `exp_out`.
- Contextual Metadata: Every test object includes a `metadata` block to store environmental variables (model version, temperature, prompt ID) that influence the outcome.
2. Structural Consistency
To ensure seamless integration across different modules of the Supervised AI platform, the schema enforces a predictable hierarchy. This consistency allows developers to write generic parsers that work across various test suites.
- Flat Hierarchies: Where possible, we favor flat structures over deeply nested objects. This reduces parsing complexity and makes it easier for AI models to attend to all relevant fields.
- Uniform Error Envelopes: All validation failures and API errors follow a standardized format, providing a `code`, `message`, and `path` to the offending field.
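The uniform envelope can be sketched as a small helper. This is a minimal illustration, not the platform's actual implementation; only the `code`, `message`, and `path` fields come from the text above, and the example values are hypothetical.

```python
def make_error_envelope(code: str, message: str, path: str) -> dict:
    """Build a standardized error payload with code, message, and path.

    Hypothetical sketch: the real API may carry additional fields.
    """
    return {"code": code, "message": message, "path": path}


# Example: a validation failure pointing at the offending field.
envelope = make_error_envelope(
    code="INVALID_ENUM_VALUE",
    message="'in-progress' is not a valid status",
    path="/tests/0/status",
)
```

Because every error shares this shape, a generic client can route failures on `code` and highlight the field at `path` without suite-specific handling.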
3. Strong Typing and Validation
The schema utilizes strict typing to prevent "hallucinations" in data entry and to ensure that programmatic evaluations are performed on valid data sets.
- Enum Enforcement: Fields such as `status` or `evaluation_method` use strict Enums to prevent data fragmentation.
- Strict Null Handling: We explicitly define which fields are `nullable`. In a testing context, an empty string is often different from a `null` value (missing data), and our schema respects this distinction.
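Both rules can be sketched together in Python. The `Status` values and the `TestResult` shape below are illustrative assumptions, not the schema's actual definitions; the point is that an enum rejects fragmented variants and that `None` is kept distinct from an empty string.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    # Hypothetical values; the real enum is defined by the schema.
    PASSED = "passed"
    FAILED = "failed"
    SKIPPED = "skipped"


@dataclass
class TestResult:
    status: Status
    # None means the output is missing; "" means it was produced but empty.
    actual_output: Optional[str]

    def __post_init__(self) -> None:
        # Reject raw strings like "pass" or "Passed" that would
        # fragment the data set.
        if not isinstance(self.status, Status):
            raise ValueError(f"invalid status: {self.status!r}")


# An empty string is valid data, not missing data.
result = TestResult(status=Status.PASSED, actual_output="")
```

A loader that enforces these two checks at the boundary guarantees that downstream evaluators never have to guess what an ambiguous string or absent value meant.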
4. Extensibility via Metadata
While the core schema is rigid to ensure stability, we provide a `custom_properties` or `metadata` object in every primary entity. This allows users to attach platform-specific or experiment-specific data without breaking the standard API interface.
```json
{
  "test_id": "eval_01HGP",
  "input_data": {
    "prompt": "Summarize the following text...",
    "context": "..."
  },
  "expected_output": "A concise three-sentence summary.",
  "metadata": {
    "version": "1.0.4",
    "tags": ["regression", "summarization-v2"],
    "model_parameters": {
      "temperature": 0.7,
      "top_p": 1.0
    }
  }
}
```
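A generic parser for this layout only needs to know the stable core fields and can pass `metadata` through opaquely, which is what keeps experiment-specific keys from breaking the standard interface. The function name and return shape below are assumptions for illustration.

```python
import json


def parse_test_case(payload: str) -> dict:
    """Extract the stable core fields; treat metadata as an opaque bag.

    Hypothetical sketch of a generic parser over the schema's core keys.
    """
    obj = json.loads(payload)
    return {
        "test_id": obj["test_id"],
        "input_data": obj["input_data"],
        "expected_output": obj["expected_output"],
        # Unknown metadata keys are preserved, not validated.
        "metadata": obj.get("metadata", {}),
    }


case = parse_test_case(
    '{"test_id": "eval_01HGP",'
    ' "input_data": {"prompt": "Summarize the following text..."},'
    ' "expected_output": "A concise three-sentence summary.",'
    ' "metadata": {"tags": ["regression"]}}'
)
```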
5. Versioning and Compatibility
The schema version is included in the payload or the request header. We follow semantic versioning for the API structure:
- Major updates: Introduce breaking changes to the field requirements.
- Minor updates: Add optional fields that enhance AI-readability or provide more context.
- Patch updates: Clarify documentation or refine validation regex without changing the data model.
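The compatibility rule these tiers imply is that only a major-version mismatch is breaking. A minimal sketch of that check, with a hypothetical function name:

```python
def is_compatible(client_version: str, server_version: str) -> bool:
    """Under semantic versioning, minor and patch differences are
    backward compatible; only a major-version mismatch is breaking."""
    client_major = int(client_version.split(".")[0])
    server_major = int(server_version.split(".")[0])
    return client_major == server_major
```

For example, a client built against `1.0.4` can safely consume a `1.2.0` payload, but must refuse a `2.0.0` one.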