Mock Data Generation

Overview

The testing-api provides a suite of utilities designed to generate synthetic data that mirrors the Supervised AI platform's data structures. These tools allow developers to simulate edge cases, perform load testing, and validate UI components without relying on production databases or manual data entry.

Mock Data Factory

The primary interface for creating synthetic records is the MockFactory class. It allows for the generation of single entities or large batches based on predefined schemas that align with the platform's core entities.

Basic Usage

To generate a single mock object, use the create() method. To generate multiple records, use the create_batch() method.

```python
from testing_api import MockFactory

# Generate a single mock user profile
user = MockFactory.create(schema="user")

# Generate 50 mock dataset entries
datasets = MockFactory.create_batch(schema="dataset", count=50)
```

Supported Data Schemas

The generator supports several schemas relevant to the Supervised AI ecosystem. The examples in this document use the "user", "dataset", and "model_output" schemas.

API Reference

`MockFactory.create(schema, overrides=None)`

Generates a single dictionary representing a platform entity.

Parameters:

- schema (string): The identifier for the data structure (e.g., "user").
- overrides (dict, optional): Specific key-value pairs to overwrite default generated values.

Returns: dict - A populated data object.

`MockFactory.create_batch(schema, count, seed=None)`

Generates a list of mock entities.

Parameters:

- schema (string): The identifier for the data structure.
- count (int): Number of records to generate.
- seed (int, optional): A seed value to ensure deterministic, reproducible output for automated testing.

Returns: list[dict] - A list of populated data objects.
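
The two signatures above can be pictured with a small stand-in. This is only a sketch of the documented contract, not the testing_api implementation: the class name `StandInFactory`, the `_SCHEMAS` table, and the generated field names are all assumptions made for illustration.

```python
import random
import uuid

# Hypothetical field generators per schema. The real schemas live inside
# testing_api and will contain different fields.
_SCHEMAS = {
    "user": lambda rng: {
        "id": str(uuid.UUID(int=rng.getrandbits(128), version=4)),
        "role": rng.choice(["viewer", "editor", "admin"]),
        "is_verified": rng.random() < 0.5,
    },
}


class StandInFactory:
    """Illustrative stand-in that mirrors the documented MockFactory signatures."""

    @staticmethod
    def create(schema, overrides=None):
        # Returns a single dict; override values replace generated ones.
        record = _SCHEMAS[schema](random.Random())
        record.update(overrides or {})
        return record

    @staticmethod
    def create_batch(schema, count, seed=None):
        # Returns a list of dicts; one seeded generator drives the whole batch.
        rng = random.Random(seed)
        return [_SCHEMAS[schema](rng) for _ in range(count)]


user = StandInFactory.create("user", overrides={"role": "admin"})
batch = StandInFactory.create_batch("user", count=3, seed=1)
```

A stand-in like this can also be useful in unit tests that should not depend on the real factory, provided the field names are aligned with the actual schemas.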

Deterministic Data Generation

For regression testing, it is often necessary to generate the same "random" data across multiple test runs. You can provide a seed to the generator to ensure consistency.

```python
# The same seed always produces the same set of records
stable_data = MockFactory.create_batch(
    schema="model_output",
    count=10,
    seed=42,
)
```
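
One common way to obtain this kind of reproducibility is to drive all random choices from a private, seeded `random.Random` instance. The sketch below illustrates that technique under stated assumptions: `generate_model_outputs` and the `label` and `confidence` fields are hypothetical, and the source does not say how MockFactory implements seeding internally.

```python
import random


def generate_model_outputs(count, seed=None):
    """Return `count` hypothetical model_output records.

    A private random.Random instance is used so that seeding here neither
    affects nor depends on the global random state.
    """
    rng = random.Random(seed)
    return [
        {
            "label": rng.choice(["cat", "dog", "bird"]),  # assumed field
            "confidence": round(rng.random(), 4),         # assumed field
        }
        for _ in range(count)
    ]


first_run = generate_model_outputs(count=10, seed=42)
second_run = generate_model_outputs(count=10, seed=42)
assert first_run == second_run  # same seed, same records
```

Keeping the generator local rather than calling `random.seed()` means other code that uses the global generator cannot silently change the output between test runs.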

Customizing Mock Data

If the default schema does not cover a specific test scenario, use the overrides parameter to inject specific values while keeping the rest of the object randomized.

```python
# Create a verified user with the "admin" role
admin_user = MockFactory.create(
    schema="user",
    overrides={"role": "admin", "is_verified": True},
)
```
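
Conceptually, overrides behave like a dictionary merge in which the supplied values win over the generated ones. The following is a minimal sketch of that idea, assuming a shallow merge; `apply_overrides` and the sample field values are hypothetical, and the real factory may treat nested fields differently.

```python
import copy


def apply_overrides(defaults, overrides=None):
    """Return a new record in which override values replace generated ones."""
    record = copy.deepcopy(defaults)  # leave the generated defaults untouched
    record.update(overrides or {})
    return record


# Hypothetical generated defaults for the "user" schema.
generated = {"id": "u-001", "role": "viewer", "is_verified": False}
admin_user = apply_overrides(generated, {"role": "admin", "is_verified": True})
```

Fields that are not listed in the overrides, such as `id` here, keep their generated values.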

Internal Providers

MockFactory is the public entry point, but it relies on internal Data Providers to map fields to specific data types (e.g., standardizing how UUIDs or timestamps are formatted across the API). Users generally do not need to interact with these providers directly, because the schema argument determines which providers are used.
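
To make the provider idea concrete, here is one way such a mapping could be organized: a registry with one formatter per field type, and schemas that refer to those types by name. Everything in this sketch (`PROVIDERS`, `SCHEMA_FIELDS`, `build_record`, and the field names) is an assumption for illustration; the actual internal providers are not documented here.

```python
import random
import uuid
from datetime import datetime, timedelta, timezone


def uuid_provider(rng):
    # Formats every identifier as a canonical version-4 UUID string.
    return str(uuid.UUID(int=rng.getrandbits(128), version=4))


def timestamp_provider(rng):
    # Formats every timestamp as an ISO 8601 string in UTC.
    base = datetime(2024, 1, 1, tzinfo=timezone.utc)
    moment = base + timedelta(seconds=rng.randrange(365 * 24 * 3600))
    return moment.isoformat()


# Hypothetical registry: field type name -> provider function.
PROVIDERS = {"uuid": uuid_provider, "timestamp": timestamp_provider}

# Hypothetical schema definitions: field name -> field type name.
SCHEMA_FIELDS = {
    "dataset": {"id": "uuid", "created_at": "timestamp"},
}


def build_record(schema, rng):
    """Build one record by asking the matching provider for each field."""
    fields = SCHEMA_FIELDS[schema]
    return {name: PROVIDERS[kind](rng) for name, kind in fields.items()}


record = build_record("dataset", random.Random(0))
```

Centralizing formatting in providers means a change to, say, the timestamp format applies to every schema that uses that field type.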