# Validation Patterns
Validation patterns in the testing-api ensure that AI model outputs consistently adhere to the Supervised AI platform's architectural standards. By using these patterns, you can catch malformed responses, data type mismatches, and schema violations before they propagate to downstream evaluation metrics.
## Standard Schema Validation
The primary validation pattern involves checking a raw model output against the defined API schema. This ensures that required fields, such as `prediction`, `confidence_score`, and `metadata`, are present and correctly typed.
```python
from testing_api.validators import SchemaValidator
from testing_api.models import ModelOutput

# Example raw output from an LLM or vision model
raw_output = {
    "prediction": "The category is 'Finance'",
    "confidence_score": 0.92,
    "metadata": {"latency_ms": 150},
}

# Validate against the standard ModelOutput schema
validator = SchemaValidator(ModelOutput)
is_valid, errors = validator.validate(raw_output)
if not is_valid:
    print(f"Validation failed: {errors}")
```
## Response Type Patterns
When validating model outputs, use the following patterns based on the expected response type:
### 1. Categorical Validation

Used for classification tasks, where the output must belong to a predefined set of labels.
| Field | Type | Description |
| :--- | :--- | :--- |
| label | string | The predicted class. |
| probabilities | dict | Mapping of labels to float values (must sum to 1.0). |
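The constraints in the table above can be sketched as a standalone check. This is a minimal illustration, not part of the testing-api: the `validate_categorical` function, the `ALLOWED_LABELS` set, and the floating-point tolerance are all assumptions.

```python
import math

# Hypothetical label set for the classification task under test
ALLOWED_LABELS = {"Finance", "Health", "Sports"}

def validate_categorical(output: dict) -> list[str]:
    """Return a list of error messages; an empty list means the output is valid."""
    errors = []
    label = output.get("label")
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        errors.append(f"label {label!r} not in allowed set")
    probs = output.get("probabilities")
    if not isinstance(probs, dict):
        errors.append("probabilities must be a dict mapping labels to floats")
    else:
        total = sum(probs.values())
        # Allow a small tolerance for floating-point rounding
        if not math.isclose(total, 1.0, rel_tol=1e-6):
            errors.append(f"probabilities sum to {total}, expected 1.0")
    return errors

# Usage
ok = {"label": "Finance", "probabilities": {"Finance": 0.9, "Health": 0.1}}
bad = {"label": "Weather", "probabilities": {"Finance": 0.6}}
print(validate_categorical(ok))   # []
print(validate_categorical(bad))  # two error messages
```

Returning a list of messages rather than raising on the first failure lets a test report every violation in one pass.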
### 2. Generative Validation
Used for RAG (Retrieval-Augmented Generation) or summarization tasks where the structure of the text is more important than specific labels.
| Field | Type | Description |
| :--- | :--- | :--- |
| content | string | The generated text. |
| citations | list[dict] | References to source documents used in generation. |
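A structural check for the generative fields above might look like the following sketch. The function name and the `source_id` citation key are illustrative assumptions; the testing-api does not prescribe them.

```python
def validate_generative(output: dict) -> list[str]:
    """Structural checks for a generative response; returns error messages."""
    errors = []
    content = output.get("content")
    if not isinstance(content, str) or not content.strip():
        errors.append("content must be a non-empty string")
    citations = output.get("citations")
    if not isinstance(citations, list):
        errors.append("citations must be a list of dicts")
    else:
        for i, cite in enumerate(citations):
            # Assumed citation shape: each entry references a source document
            if not isinstance(cite, dict) or "source_id" not in cite:
                errors.append(f"citation {i} missing 'source_id'")
    return errors

# Usage
resp = {
    "content": "Paris is the capital of France.",
    "citations": [{"source_id": "doc-42"}],
}
print(validate_generative(resp))  # []
```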
## Handling Validation Errors
The testing-api provides a structured error response when validation fails. Wrap the parsing of inference responses in a try/except block so that schema mismatches are logged per test case rather than aborting the test run.
```python
# ValidationError is raised by the schema library backing ModelOutput
# (e.g. pydantic); log_test_failure is your test harness's logging hook.
try:
    validated_data = ModelOutput(**raw_model_response)
except ValidationError as e:
    # Log the schema mismatch for the specific test case
    log_test_failure(test_id="case_001", error=e.json())
```
## Custom Validation Rules
For advanced testing scenarios, you can extend the base validation patterns to include heuristic-based checks, such as minimum character counts or prohibited word lists.
```python
from testing_api.validators import BaseValidator

class QualityValidator(BaseValidator):
    def check_min_length(self, output: str, min_length: int = 10) -> bool:
        return len(output) >= min_length

# Usage
custom_val = QualityValidator()
if not custom_val.check_min_length(raw_output["prediction"]):
    raise ValueError("Model output failed quality check: content too short.")
```
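The same pattern extends to the prohibited-word heuristic mentioned above. The sketch below is self-contained for illustration (the class name, method, and word list are assumptions, not testing-api APIs):

```python
class ContentPolicyValidator:
    """Heuristic check for prohibited terms (illustrative, not part of testing-api)."""

    def __init__(self, prohibited: set[str]):
        # Normalize once so comparisons are case-insensitive
        self.prohibited = {w.lower() for w in prohibited}

    def check_prohibited_words(self, output: str) -> list[str]:
        """Return the prohibited words found in the output, sorted alphabetically."""
        words = {w.strip(".,!?").lower() for w in output.split()}
        return sorted(words & self.prohibited)

# Usage
policy = ContentPolicyValidator({"guarantee", "risk-free"})
hits = policy.check_prohibited_words("This investment is risk-free, we guarantee it!")
print(hits)  # ['guarantee', 'risk-free']
```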
## Summary of Validation States
| State | Meaning | Action Required |
| :--- | :--- | :--- |
| VALID | Output matches schema and types. | Proceed to metric calculation. |
| MALFORMED | JSON structure is invalid. | Check model parser/serialization. |
| INCOMPLETE | Missing required fields (e.g., confidence). | Verify model config or API version. |
| TYPE_MISMATCH | Field exists but has wrong type (e.g., string vs. float). | Sanitize model output before validation. |
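The four states above can be derived from a raw JSON response with a small classifier. This is a sketch under assumptions: the required-field map mirrors the schema fields used earlier in this page, and the testing-api may classify states differently.

```python
import json
from enum import Enum

class ValidationState(Enum):
    VALID = "VALID"
    MALFORMED = "MALFORMED"
    INCOMPLETE = "INCOMPLETE"
    TYPE_MISMATCH = "TYPE_MISMATCH"

# Assumed required fields and their expected Python types
REQUIRED = {"prediction": str, "confidence_score": float, "metadata": dict}

def classify(raw: str) -> ValidationState:
    """Map a raw JSON response string onto one of the validation states."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return ValidationState.MALFORMED
    if not isinstance(data, dict):
        return ValidationState.MALFORMED
    for field, expected in REQUIRED.items():
        if field not in data:
            return ValidationState.INCOMPLETE
        if not isinstance(data[field], expected):
            return ValidationState.TYPE_MISMATCH
    return ValidationState.VALID
```

Note that field-presence checks run before type checks, so a response that is both incomplete and mistyped reports `INCOMPLETE` first, matching the table's suggested triage order.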