Defining Schemas
A schema defines the structure of your dataset — the fields, their types, and optional descriptions that guide AI generation. You build and edit schemas directly in the visual schema editor.
The Schema Editor
Open any dataset and click the Schema tab to access the schema editor. Each row in the editor represents a field with three columns:
| Column | Required | Description |
|---|---|---|
| Name | Yes | Field identifier (alphanumeric + underscores, e.g., user_id, email) |
| Type | Yes | Data type — selected from a dropdown |
| Description | No | Semantic hint that guides AI generation (e.g., "Business email address") |
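The naming rule in the table can be checked before you type names into the editor. The following is a minimal sketch, assuming "alphanumeric + underscores" means one or more ASCII letters, digits, or underscores. The editor's exact rule may differ, and `is_valid_field_name` is a hypothetical helper, not part of the product.

```python
import re

# Assumption: ASCII letters, digits, and underscores only, at least one character.
# The schema editor's actual validation rule is not documented here.
FIELD_NAME_RE = re.compile(r"[A-Za-z0-9_]+")


def is_valid_field_name(name: str) -> bool:
    """Return True if the whole name consists of letters, digits, and underscores."""
    return FIELD_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["user_id", "email", "customer email", "total-amount"]:
        print(candidate, is_valid_field_name(candidate))
```

Names containing spaces or hyphens fail this check, which is a quick way to catch column headers copied from a spreadsheet.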
Adding Fields
- Open the Schema tab on your dataset
- Click Add Field
- Enter a field name (e.g., customer_email)
- Select a type from the dropdown
- Optionally add a description to guide AI generation
- Repeat for additional fields
- Click Save Schema
Descriptions have a large effect on generation quality. A field named status with no description produces generic values like "active" or "pending". Add a description such as "Order fulfillment status: pending, processing, shipped, delivered, cancelled" and the AI generates domain-specific values instead.
Reordering and Removing Fields
- Drag a field row by its handle to reorder
- Click the trash icon on a field row to remove it
- Click Save Schema to persist changes
Supported Field Types
String
Text data of any length. Use descriptions to hint at the format.
Examples: email addresses, company names, URLs, status values
Number
Numeric values — integers or floats.
Examples: revenue figures, counts, scores
Integer
Whole numbers only.
Examples: user IDs, ages, quantities
Float
Floating-point numbers with decimal precision.
Examples: prices, coordinates, percentages
Boolean
True/false values.
Examples: is_active, email_verified, has_subscription
Date
Calendar dates in ISO 8601 format (YYYY-MM-DD).
Examples: birth dates, contract start dates, expiration dates
Datetime
Timestamps with date and time in ISO 8601 format.
Examples: created_at, last_login, order_placed_at
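Code that consumes generated Date and Datetime values usually needs to parse them. The following sketch uses only the Python standard library and assumes values arrive as strings such as "2026-03-21" and "2026-03-21T10:30:00Z". The helper names are hypothetical.

```python
from datetime import date, datetime


def parse_date(value: str) -> date:
    """Parse a calendar date in YYYY-MM-DD form."""
    return date.fromisoformat(value)


def parse_datetime(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    # Python versions before 3.11 reject the 'Z' suffix, so rewrite it
    # as an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


if __name__ == "__main__":
    print(parse_date("2026-03-21"))
    print(parse_datetime("2026-03-21T10:30:00Z"))
```

Both functions raise `ValueError` on malformed input, so they can double as a format check.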
Array
Ordered lists of values. Use the description to specify what the array contains.
Examples: tags, phone numbers, product categories
Object
Nested JSON objects. Describe the expected structure in the description field.
Examples: addresses, user preferences, nested metadata
JSON
Arbitrary JSON structures for flexible, untyped data.
Examples: custom metadata, configuration blobs, plugin data
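To see how the field types above relate to JSON values, here is a rough sketch that checks a record against a schema. It assumes the schema is represented as a plain mapping from field name to a lowercase type name. That representation, along with `TYPE_CHECKS` and `find_type_errors`, is hypothetical and only for illustration; it is not the product's storage format or API.

```python
from typing import Any, Callable


def _is_number(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


# Assumed lowercase type names, mirroring the types listed above.
TYPE_CHECKS: dict[str, Callable[[Any], bool]] = {
    "string": lambda v: isinstance(v, str),
    "number": _is_number,
    "integer": lambda v: isinstance(v, int) and not isinstance(v, bool),
    "float": _is_number,  # JSON does not distinguish 10 from 10.0
    "boolean": lambda v: isinstance(v, bool),
    "date": lambda v: isinstance(v, str),      # format is not checked here
    "datetime": lambda v: isinstance(v, str),  # format is not checked here
    "array": lambda v: isinstance(v, list),
    "object": lambda v: isinstance(v, dict),
    "json": lambda v: True,  # any JSON value is acceptable
}


def find_type_errors(schema: dict[str, str], record: dict[str, Any]) -> list[str]:
    """Return one message per field that is missing or has the wrong type."""
    errors = []
    for name, type_name in schema.items():
        check = TYPE_CHECKS.get(type_name)
        if check is None:
            errors.append(f"{name}: unknown type {type_name}")
        elif name not in record:
            errors.append(f"{name}: missing")
        elif not check(record[name]):
            errors.append(f"{name}: expected {type_name}")
    return errors
```

A check like this can be useful in tests that consume generated data, since it flags a record that no longer matches the schema you designed.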
Schema Design Tips
Use Descriptive Field Names
Field names guide AI generation. Semantic names produce better data:
| Approach | Field Name | AI Output Quality |
|---|---|---|
| Good | customer_email | Realistic business emails |
| Good | monthly_recurring_revenue | Plausible MRR values |
| Poor | field1 | Random strings |
| Poor | value | Uncontextualized numbers |
Match Your API's Response Shape
Design schemas that mirror the API responses you want to mock. If your real API returns:
{
  "user_id": 123,
  "username": "alice",
  "created_at": "2026-03-21T10:30:00Z",
  "is_verified": true
}
Create fields: user_id (integer), username (string), created_at (datetime), is_verified (boolean).
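If the response has many keys, a short script can draft the field list for you to enter in the editor. This is a sketch under stated assumptions: `infer_type` and `infer_schema` are hypothetical helpers, the date and datetime detection uses simple patterns rather than full ISO 8601 validation, and null values are mapped to json because their intended type cannot be inferred.

```python
import json
import re
from typing import Any

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")
DATETIME_RE = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2})?"
)


def infer_type(value: Any) -> str:
    """Guess a schema field type from one sample JSON value."""
    if isinstance(value, bool):  # must come before int: bool subclasses int
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, str):
        if DATE_RE.fullmatch(value):
            return "date"
        if DATETIME_RE.fullmatch(value):
            return "datetime"
        return "string"
    return "json"  # assumption: null gives no type information


def infer_schema(response: dict[str, Any]) -> dict[str, str]:
    """Map each top-level key of a sample response to a suggested type."""
    return {name: infer_type(value) for name, value in response.items()}


if __name__ == "__main__":
    sample = """{"user_id": 123, "username": "alice",
                 "created_at": "2026-03-21T10:30:00Z", "is_verified": true}"""
    for name, type_name in infer_schema(json.loads(sample)).items():
        print(f"{name} ({type_name})")
```

For the sample response above, the script suggests the same four field types listed in the text. Treat the result as a starting point and add descriptions by hand, since a sample value says nothing about the field's meaning.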
Use Objects for Nested Data
For related fields, use the object type with a description of the structure:
- Field name: billing_address
- Type: object
- Description: "Billing address with street, city, state, zip, country"
The AI generates a realistic nested object matching the description.
Updating a Schema
- Open the Schema tab on your dataset
- Edit field names, types, or descriptions inline
- Add new fields with Add Field or remove fields with the trash icon
- Click Save Schema
Updating a schema does not change existing versions. Previously generated data remains as-is. Generate new data to apply the updated schema.
Common Schema Patterns
User/Customer Data
| Field | Type | Description |
|---|---|---|
| id | integer | |
| first_name | string | |
| last_name | string | |
| email | string | Email address |
| phone | string | Phone number |
| created_at | datetime | |
| is_active | boolean | |
Order/Transaction Data
| Field | Type | Description |
|---|---|---|
| order_id | string | |
| customer_id | integer | |
| total_amount | float | Total in USD |
| status | string | pending, paid, shipped, delivered, cancelled |
| items | array | Array of order line items |
| created_at | datetime | |
Product Catalog Data
| Field | Type | Description |
|---|---|---|
| product_id | string | |
| name | string | Product name |
| description | string | Product description |
| price | float | Price in USD |
| category | string | Product category |
| tags | array | Product tags |
| in_stock | boolean | |
Next Steps
- Generation — generate data from your schema
- Scenarios — create schema variants for test cases
- Datasets — manage datasets and metadata