Data Dictionary Example: A Real-World Template You Can Copy

A data dictionary example shows how to document the tables, columns, data types, business definitions, and ownership for a real dataset. The best example uses a concrete business scenario — like an ecommerce 'orders' table — with every column defined in plain English plus technical metadata.

This guide provides a full working data dictionary example for an ecommerce stack, plus the template you can adapt for your own warehouse. It is filled in with realistic values so you can see what good looks like, not a blank skeleton you still have to figure out.

We also cover automation: how to generate data dictionaries from catalog metadata instead of maintaining them by hand.

What Goes in a Data Dictionary

Every data dictionary row should contain these columns:

  • Table name — Fully qualified name, including database and schema
  • Column name — As it exists in the warehouse
  • Data type — Snowflake/BigQuery/Postgres native type
  • Nullable — Can this column be NULL?
  • Description — Business definition in plain English
  • Example value — A realistic sample
  • Source — Upstream system that populates this column
  • PII classification — None / Indirect PII / PII / SPI (sensitive personal information) / Restricted
  • Owner — Person or team accountable
  • Last updated — When the definition was last reviewed

Example: Ecommerce Orders Table Data Dictionary

Column | Type | Description | PII
order_id | VARCHAR(36) | Unique order identifier, UUID v4 | None
customer_id | VARCHAR(36) | FK to customers.customer_id; identifies the buyer | Indirect PII
order_placed_at | TIMESTAMP_TZ | Time, in UTC, when the customer clicked 'Place Order' | None
order_status | VARCHAR(20) | One of: pending, paid, shipped, delivered, cancelled, refunded | None
total_amount_usd | NUMERIC(10,2) | Final order amount after discounts, before tax, in USD | None
shipping_address_id | VARCHAR(36) | FK to addresses.id; the delivery address | PII
payment_method | VARCHAR(20) | Payment type: credit_card, paypal, apple_pay, google_pay | None
discount_code | VARCHAR(50) | Promo code applied; NULL if none | None
source_channel | VARCHAR(30) | Acquisition channel: organic, paid_social, email, affiliate | None
created_at | TIMESTAMP_TZ | Time the row was created in the warehouse | None
updated_at | TIMESTAMP_TZ | Time the row was last modified | None

Example: Customers Table Data Dictionary

Column | Type | Description | PII
customer_id | VARCHAR(36) | Unique customer identifier | Indirect PII
email | VARCHAR(320) | Customer's primary email, used for login | PII
first_name | VARCHAR(100) | Legal first name | PII
last_name | VARCHAR(100) | Legal last name | PII
phone_number | VARCHAR(20) | E.164 format, for shipping notifications | PII
date_of_birth | DATE | For age verification; NULL if not collected | SPI
country_code | CHAR(2) | ISO 3166-1 alpha-2 country code | None
signup_date | TIMESTAMP_TZ | When the account was created | None
email_consent | BOOLEAN | Marketing email opt-in status, governed by GDPR | None
lifetime_value_usd | NUMERIC(10,2) | Computed total spend across all orders | None
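One way to put the PII column to work is to drive redaction from it. The sketch below is illustrative only: the function name `redact`, the sample record, and the policy are assumptions. The policy treats PII, SPI, and Restricted as sensitive and leaves Indirect PII visible, which your own governance rules may not allow.

```python
# PII classifications copied from the customers table above.
CUSTOMERS_PII = {
    "customer_id": "Indirect PII",
    "email": "PII",
    "first_name": "PII",
    "last_name": "PII",
    "phone_number": "PII",
    "date_of_birth": "SPI",
    "country_code": "None",
    "signup_date": "None",
    "email_consent": "None",
    "lifetime_value_usd": "None",
}

# Assumed policy: these labels are masked before data leaves the warehouse.
SENSITIVE = {"PII", "SPI", "Restricted"}


def redact(record, classifications=CUSTOMERS_PII, sensitive=SENSITIVE):
    """Return a copy of record with sensitive, non-NULL values masked."""
    redacted = {}
    for column, value in record.items():
        # Columns missing from the dictionary are treated as Restricted,
        # so undocumented data fails closed.
        label = classifications.get(column, "Restricted")
        if label in sensitive and value is not None:
            redacted[column] = "[REDACTED]"
        else:
            redacted[column] = value
    return redacted


# Made-up sample record for illustration.
row = {
    "customer_id": "3f2b6c1e-0000-4000-8000-000000000001",
    "email": "jane@example.com",
    "date_of_birth": None,
    "country_code": "DE",
    "loyalty_tier": "gold",  # not in the dictionary
}
safe_row = redact(row)
```

Here `safe_row` keeps `customer_id` and `country_code`, masks `email`, leaves the NULL `date_of_birth` as-is, and masks the undocumented `loyalty_tier` column.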

How to Automate Data Dictionary Generation

Manual data dictionaries go stale within weeks. Modern governance programs generate dictionaries automatically from catalog metadata. Data Workers does this via the catalog agent: it ingests warehouse metadata, enriches it with LLM-generated descriptions, routes those descriptions to humans for approval, and publishes a living data dictionary that updates continuously.

The workflow: ingest from Snowflake/BigQuery → classify PII automatically → draft descriptions with an LLM → route to data stewards for approval → publish to the catalog → expose as MCP tools so AI agents can query the dictionary directly. Read the data dictionary best practices guide for more.
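The first two steps of that workflow (ingest metadata, classify PII) can be sketched in a few lines. This is not how Data Workers implements it. It is a minimal stand-in that uses an in-memory SQLite database in place of Snowflake or BigQuery, and a hypothetical name-based heuristic (`PII_NAME_HINTS`) in place of a real classifier. Descriptions are left empty and marked for steward review.

```python
import sqlite3

# Hypothetical name-based hints; a first guess only, to be corrected by stewards.
PII_NAME_HINTS = {
    "birth": "SPI",
    "email": "PII",
    "name": "PII",
    "phone": "PII",
    "address": "PII",
}


def guess_pii(column_name):
    """Return a draft PII label based on substrings of the column name."""
    lowered = column_name.lower()
    for hint, label in PII_NAME_HINTS.items():
        if hint in lowered:
            return label
    return "None"


def draft_dictionary(conn, table):
    """Build draft dictionary rows from SQLite's table metadata."""
    # PRAGMA arguments cannot be bound as parameters, so only accept
    # plain identifiers before interpolating the table name.
    if not table.isidentifier():
        raise ValueError(f"unsafe table name: {table!r}")
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [
        {
            "table_name": table,
            "column_name": name,
            "data_type": col_type,
            "nullable": not notnull,
            "pii_classification": guess_pii(name),
            "description": None,       # drafted later, then approved by a steward
            "status": "needs_review",
        }
        for _cid, name, col_type, notnull, _default, _pk in rows
    ]


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers ("
    "customer_id TEXT NOT NULL, email TEXT NOT NULL, "
    "country_code TEXT, date_of_birth TEXT)"
)
draft = draft_dictionary(conn, "customers")
```

Against a real warehouse, the same shape of data would typically come from its information schema views. The point of the sketch is that technical metadata is generated, while descriptions and final classifications still pass through human review.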

Common Data Dictionary Mistakes

  • Documenting only technical metadata, not business definitions
  • Writing generic descriptions like 'customer email' instead of 'primary login email, enforced unique'
  • Forgetting PII classification
  • Creating a dictionary in Excel and never updating it
  • Not tying dictionary entries to ownership
  • Treating the dictionary as documentation rather than a runtime asset
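Several of these mistakes can be caught mechanically before an entry is published. The following is a small sketch of such a check, assuming entries are plain dicts with `description`, `pii_classification`, and `owner` keys. The function name `lint_entry` and the four-word threshold for "generic" descriptions are arbitrary choices for illustration.

```python
# Assumed threshold: shorter descriptions are flagged as probably generic.
GENERIC_MIN_WORDS = 4


def lint_entry(entry):
    """Return a list of problems found in one dictionary entry."""
    problems = []
    description = (entry.get("description") or "").strip()
    if not description:
        problems.append("missing business description")
    elif len(description.split()) < GENERIC_MIN_WORDS:
        problems.append("description looks generic")
    if not entry.get("pii_classification"):
        problems.append("missing PII classification")
    if not entry.get("owner"):
        problems.append("missing owner")
    return problems


weak = {
    "column_name": "email",
    "description": "customer email",
    "pii_classification": "",
    "owner": None,
}
good = {
    "column_name": "email",
    "description": "primary login email, enforced unique",
    "pii_classification": "PII",
    "owner": "identity team",  # hypothetical owner
}

weak_problems = lint_entry(weak)
good_problems = lint_entry(good)
```

Here `weak_problems` contains three findings (generic description, missing PII classification, missing owner) and `good_problems` is empty. A word count is a crude proxy for quality, so treat a passing result as a floor and not as proof that a definition is useful.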

A great data dictionary example is the fastest way to show your team what good looks like. Copy the orders and customers templates above, adapt them to your warehouse, and automate the generation so the dictionary stays current. Book a demo to see how Data Workers generates and maintains living data dictionaries automatically.
