Defining Schemas

A schema defines the structure of your dataset — the fields, their types, and optional descriptions that guide AI generation. You build and edit schemas directly in the visual schema editor.

The Schema Editor

Open any dataset and click the Schema tab to access the schema editor. Each row in the editor represents a field with three columns:

| Column | Required | Description |
| --- | --- | --- |
| Name | Yes | Field identifier (alphanumeric + underscores, e.g., user_id, email) |
| Type | Yes | Data type, selected from a dropdown |
| Description | No | Semantic hint that guides AI generation (e.g., "Business email address") |
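
The naming rule for the Name column can be checked before typing names into the editor. The following is a minimal sketch in Python: the function name `is_valid_field_name` and the exact pattern are assumptions derived from the "alphanumeric + underscores" wording, not the editor's actual validation logic.

```python
import re

# Assumption: "alphanumeric + underscores" means ASCII letters, digits, and "_".
# The editor may apply further rules (for example on length or leading digits).
FIELD_NAME_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_field_name(name: str) -> bool:
    """Return True if the name uses only letters, digits, and underscores."""
    return FIELD_NAME_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ["user_id", "email", "customer email", "total-amount"]:
        print(candidate, is_valid_field_name(candidate))
```

Names containing spaces or hyphens, such as "customer email" or "total-amount", fail this check and would need to be rewritten with underscores.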

Adding Fields

  1. Open the Schema tab on your dataset
  2. Click Add Field
  3. Enter a field name (e.g., customer_email)
  4. Select a type from the dropdown
  5. Optionally add a description to guide AI generation
  6. Repeat for additional fields
  7. Click Save Schema

Tip: Descriptions strongly affect generation quality. A field named status with no description produces generic values like "active" or "pending". Add a description like "Order fulfillment status: pending, processing, shipped, delivered, cancelled" and the AI generates domain-specific values.

Reordering and Removing Fields

  • Drag a field row by its handle to reorder
  • Click the trash icon on a field row to remove it
  • Click Save Schema to persist changes

Supported Field Types

String

Text data of any length. Use descriptions to hint at the format.

Examples: email addresses, company names, URLs, status values

Number

Numeric values — integers or floats.

Examples: revenue figures, counts, scores

Integer

Whole numbers only.

Examples: user IDs, ages, quantities

Float

Floating-point numbers with decimal precision.

Examples: prices, coordinates, percentages
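
The three numeric types overlap: Number accepts integers or floats, while Integer and Float are each narrower. As a rough illustration of that relationship, the sketch below classifies JSON-decoded Python values. The helper names are hypothetical, and the strict treatment of Float (a whole number such as 42 does not count) is an assumption, not documented platform behavior.

```python
def is_integer(value: object) -> bool:
    """Whole numbers only."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_float(value: object) -> bool:
    """Floating-point numbers only (assumption: ints are not accepted)."""
    return isinstance(value, float)


def is_number(value: object) -> bool:
    """Either an integer or a float."""
    return is_integer(value) or is_float(value)


if __name__ == "__main__":
    for sample in [42, 19.99, True, "7"]:
        print(repr(sample), is_number(sample), is_integer(sample), is_float(sample))
```

In practice, pick Integer or Float when the field should hold only one kind of value, and Number when either is acceptable.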

Boolean

True/false values.

Examples: is_active, email_verified, has_subscription

Date

Calendar dates in ISO 8601 format (YYYY-MM-DD).

Examples: birth dates, contract start dates, expiration dates

Datetime

Timestamps with date and time in ISO 8601 format.

Examples: created_at, last_login, order_placed_at
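
Because Date and Datetime values use ISO 8601, they can be parsed with Python's standard library when consuming generated data. This is a small sketch; the helper names `parse_date` and `parse_datetime` are illustrative, and the assumption that timestamps may end in "Z" is based on the sample API response elsewhere in this guide.

```python
from datetime import date, datetime


def parse_date(value: str) -> date:
    """Parse a calendar date in YYYY-MM-DD form."""
    return date.fromisoformat(value)


def parse_datetime(value: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    # datetime.fromisoformat only accepts 'Z' from Python 3.11 onward,
    # so normalize it to an explicit UTC offset first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


if __name__ == "__main__":
    print(parse_date("2026-03-21"))
    print(parse_datetime("2026-03-21T10:30:00Z"))
```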

Array

Ordered lists of values. Use the description to specify what the array contains.

Examples: tags, phone numbers, product categories

Object

Nested JSON objects. Describe the expected structure in the description field.

Examples: addresses, user preferences, nested metadata

JSON

Arbitrary JSON structures for flexible, untyped data.

Examples: custom metadata, configuration blobs, plugin data

Schema Design Tips

Use Descriptive Field Names

Field names guide AI generation. Semantic names produce better data:

| Approach | Field Name | AI Output Quality |
| --- | --- | --- |
| Good | customer_email | Realistic business emails |
| Good | monthly_recurring_revenue | Plausible MRR values |
| Poor | field1 | Random strings |
| Poor | value | Uncontextualized numbers |

Match Your API's Response Shape

Design schemas that mirror the API responses you want to mock. If your real API returns:

{
  "user_id": 123,
  "username": "alice",
  "created_at": "2026-03-21T10:30:00Z",
  "is_verified": true
}

Create fields: user_id (integer), username (string), created_at (datetime), is_verified (boolean).
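
When a real response has many fields, a first draft of the field list can be derived from a sample payload and then entered in the schema editor. The sketch below is one possible approach, not a feature of the product: `infer_type` and `infer_schema` are hypothetical helpers, the lowercase type names mirror the ones used in this guide, and the date patterns are deliberately simple.

```python
import json
import re
from typing import Any, Dict

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")
DATETIME_RE = re.compile(
    r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(\.\d+)?(Z|[+-]\d{2}:\d{2})?"
)


def infer_type(value: Any) -> str:
    """Guess a schema type name for one JSON-decoded value."""
    # Check bool before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "integer"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        if DATETIME_RE.fullmatch(value):
            return "datetime"
        if DATE_RE.fullmatch(value):
            return "date"
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    return "json"  # e.g. null: no more specific type can be inferred


def infer_schema(sample: str) -> Dict[str, str]:
    """Map each top-level key of a JSON object to a guessed type name."""
    record = json.loads(sample)
    return {name: infer_type(value) for name, value in record.items()}


SAMPLE = """
{
  "user_id": 123,
  "username": "alice",
  "created_at": "2026-03-21T10:30:00Z",
  "is_verified": true
}
"""

if __name__ == "__main__":
    for name, type_name in infer_schema(SAMPLE).items():
        print(f"{name}: {type_name}")
```

The guessed types are only a starting point. Review them by hand, and add descriptions in the editor, since a sample payload carries no semantic hints.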

Use Objects for Nested Data

For related fields, use the object type with a description of the structure:

  • Field name: billing_address
  • Type: object
  • Description: "Billing address with street, city, state, zip, country"

The AI generates a realistic nested object matching the description.
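
Since the structure of an object field lives only in its free-text description, it can be worth checking generated records for the keys you expect. The sketch below assumes the generator uses the key names exactly as written in the description (street, city, state, zip, country). That is an assumption, and the sample value is hand-written for illustration rather than real generator output.

```python
from typing import Any, Dict, List

# Keys taken from the description text; the exact key names the generator
# emits are an assumption.
EXPECTED_KEYS = ["street", "city", "state", "zip", "country"]


def missing_keys(obj: Dict[str, Any], expected: List[str]) -> List[str]:
    """Return the expected keys that are absent from obj, in order."""
    return [key for key in expected if key not in obj]


# Hand-written example value, not real generator output.
billing_address = {
    "street": "123 Main St",
    "city": "Springfield",
    "state": "IL",
    "zip": "62701",
    "country": "US",
}

if __name__ == "__main__":
    print(missing_keys(billing_address, EXPECTED_KEYS))
```

If keys come back missing or renamed, making the description more explicit about the expected key names is the first thing to try.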

Updating a Schema

  1. Open the Schema tab on your dataset
  2. Edit field names, types, or descriptions inline
  3. Add new fields with Add Field or remove fields with the trash icon
  4. Click Save Schema

Note: Updating a schema does not change existing versions. Previously generated data remains as-is. Generate new data to apply the updated schema.

Common Schema Patterns

User/Customer Data

| Field | Type | Description |
| --- | --- | --- |
| id | integer | |
| first_name | string | |
| last_name | string | |
| email | string | Email address |
| phone | string | Phone number |
| created_at | datetime | |
| is_active | boolean | |

Order/Transaction Data

| Field | Type | Description |
| --- | --- | --- |
| order_id | string | |
| customer_id | integer | |
| total_amount | float | Total in USD |
| status | string | pending, paid, shipped, delivered, cancelled |
| items | array | Array of order line items |
| created_at | datetime | |

Product Catalog Data

| Field | Type | Description |
| --- | --- | --- |
| product_id | string | |
| name | string | Product name |
| description | string | Product description |
| price | float | Price in USD |
| category | string | Product category |
| tags | array | Product tags |
| in_stock | boolean | |

Next Steps

  • Generation — generate data from your schema
  • Scenarios — create schema variants for test cases
  • Datasets — manage datasets and metadata