Slowly Changing Dimensions Types: Types 0 Through 6
Written by The Data Workers Team — 14 autonomous agents shipping production data infrastructure since 2026.
Technically reviewed by the Data Workers engineering team.
The seven slowly changing dimension types — 0 through 6 — define how dimension tables handle attribute changes over time. Type 0 retains the original value, Type 1 overwrites it, Type 2 versions rows, Type 3 adds a prior-value column, Type 4 splits history into a separate table, Type 5 moves volatile attributes into a mini-dimension, and Type 6 combines Types 1, 2, and 3.
This guide details each SCD type with a concrete example, the tradeoffs, and when each makes sense in a modern warehouse. Most real warehouses use Types 1 and 2 for the vast majority of attributes, reserving the exotic types for specific edge cases.
Type 0: Retain Original
Type 0 says the value is fixed at load and never updates. Classic examples are date of birth, account open date, or an immutable customer ID. No versioning needed because the value cannot change. It is the simplest type and useful for attributes where a later update would actually be a data error rather than a legitimate change.
Type 0 is rare in analytics dimensions but common in reference data and master data. If you find yourself updating a Type 0 attribute, that is a signal that the attribute was not actually immutable — reclassify it as Type 1 or Type 2 and investigate why the update happened.
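The "reject the update" behavior can be made explicit in load code. A minimal sketch, assuming a dictionary-per-row layout; the column names and the `apply_update` helper are illustrative, not a real library API:

```python
# Columns declared immutable (Type 0). A changed value here is a data
# error to investigate, not a legitimate change to apply.
TYPE_0_COLUMNS = {"customer_id", "date_of_birth", "account_open_date"}

def apply_update(row: dict, incoming: dict) -> dict:
    """Merge an incoming record into a dimension row, preserving Type 0
    attributes and raising if one of them appears to have changed."""
    updated = dict(row)
    for col, value in incoming.items():
        if col in TYPE_0_COLUMNS:
            if row.get(col) is not None and row[col] != value:
                raise ValueError(f"Type 0 attribute {col!r} cannot change")
            continue  # keep the original value
        updated[col] = value
    return updated
```

In a warehouse the same rule usually lives in a merge statement that simply excludes Type 0 columns from its update clause, with a separate test alerting on mismatches.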
Type 1: Overwrite
Type 1 updates the row in place and loses the old value. Use it when history does not matter (typo corrections) or when the attribute is operational rather than historical (customer's current loyalty tier). The risk is silent data drift — historical reports that grouped by the overwritten value will now return different numbers.
Rule of thumb: if a stakeholder asks 'what was this value on this date', Type 1 is wrong. If they only care about current state, Type 1 is fine and cheaper to maintain than any other type.
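The silent-drift risk is easy to demonstrate. A toy sketch (table layout and column names are illustrative) showing the same revenue-by-region report returning different numbers after a Type 1 overwrite:

```python
from collections import Counter

# Tiny in-memory stand-ins for a customer dimension and an order fact table.
customers = {
    101: {"name": "Ada", "region": "EMEA"},
    102: {"name": "Lin", "region": "APAC"},
}
orders = [{"customer_id": 101, "amount": 50},
          {"customer_id": 102, "amount": 30}]

def revenue_by_region() -> dict:
    """Group order amounts by the customer's *current* region."""
    totals = Counter()
    for o in orders:
        totals[customers[o["customer_id"]]["region"]] += o["amount"]
    return dict(totals)

before = revenue_by_region()        # {'EMEA': 50, 'APAC': 30}
customers[101]["region"] = "AMER"   # Type 1 overwrite: the old value is gone
after = revenue_by_region()         # {'AMER': 50, 'APAC': 30} — history drifted
```

The order shipped while the customer was in EMEA, but after the overwrite every historical report attributes it to AMER.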
Type 2: Add New Row (The Workhorse)
Type 2 is the most common. When a tracked attribute changes, you close the existing row by setting its end date to today and insert a new row with a new surrogate key. Facts link to the surrogate key, so historical facts still point to the version that was current when they occurred. This is the Kimball answer to 'what was the customer's city when this order shipped?'
Implement Type 2 with start_date, end_date, and an is_current flag. dbt snapshots automate the whole pattern (materializing the equivalent dbt_valid_from and dbt_valid_to columns for you) — see dbt snapshots explained. The main concern is row explosion on frequently changing attributes; switch to Type 5 when Type 2 bloats the dimension to the point that query performance suffers.
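The close-and-insert step can be sketched in a few lines. This is a simplified illustration, assuming a list-of-dicts dimension; the `scd2_change` helper and its column names mirror the start_date/end_date/is_current convention but are not a real library API:

```python
from datetime import date

def scd2_change(dim_rows: list, business_key, changes: dict,
                today: date, next_key: int) -> dict:
    """Expire the current row for business_key and append a new version
    carrying a fresh surrogate key."""
    for row in dim_rows:
        if row["business_key"] == business_key and row["is_current"]:
            # Close the old version.
            row["end_date"] = today
            row["is_current"] = False
            # Open the new version with the changed attributes.
            new_row = {**row, **changes,
                       "surrogate_key": next_key,
                       "start_date": today,
                       "end_date": None,
                       "is_current": True}
            dim_rows.append(new_row)
            return new_row
    raise KeyError(f"no current row for {business_key!r}")
```

In production this is a merge statement (or a dbt snapshot), but the invariant is the same: exactly one open-ended, current row per business key at any time.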
Type 3: Add New Attribute
Type 3 adds a prior_value column instead of versioning rows. It is useful when you only care about the most recent change — prior_region, original_region — and cannot afford the row growth of Type 2. Works for a single step back in time but not deeper history, so it is best for 'compare current to previous' analysis rather than full audit trails.
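The single-step-back behavior is the whole pattern. A minimal sketch, with the `prior_<attr>` naming as an illustrative convention:

```python
def type3_update(row: dict, attr: str, new_value) -> None:
    """Type 3: shift the current value into prior_<attr>, then overwrite.
    Only one step of history survives each update."""
    row[f"prior_{attr}"] = row.get(attr)
    row[attr] = new_value
```

Note that a second update discards the original value entirely, which is exactly why Type 3 suits "compare current to previous" but not audit trails.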
Type 4: History Table
Type 4 keeps the current row in the main dimension and moves all historical versions to a separate history table. Queries that need current state read the fast main dim; audit queries read the history. Data Vault satellites are essentially Type 4 baked into the methodology. This is the right pattern when current-state queries dominate traffic but you still need full history on demand.
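The split can be sketched as an archive-then-overwrite. Table shapes and the `type4_update` helper are illustrative assumptions:

```python
def type4_update(current: dict, history: list, key,
                 changes: dict, today: str) -> None:
    """Type 4: archive the old version to the history table, then
    overwrite the main dimension in place."""
    old = dict(current[key])
    old["archived_on"] = today
    history.append(old)            # full history lives in a separate table
    current[key].update(changes)   # main dim keeps only current state
```

Current-state queries never touch `history`, which is what keeps them fast; audit queries union the two.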
Type 5: Mini-Dimension
Type 5 is for attributes that change so often (monthly demographics, subscription tier) that Type 2 would explode the main dim. You move the volatile attributes into a mini-dimension and link from facts through the mini-dim surrogate key. It is the Kimball answer to rapidly changing demographics without polluting slow-moving dim columns with versioning overhead.
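The mini-dimension is just a small deduplicated lookup of attribute combinations, and each fact row carries the matching key. A hypothetical sketch (the `mini_dim_key` helper and the banded attributes are illustrative):

```python
# Mini-dimension: (tier, age_band) combination -> surrogate key.
# Volatile attributes are banded so the table stays small even when
# individual customers change monthly.
mini_dim: dict = {}

def mini_dim_key(tier: str, age_band: str) -> int:
    """Return the surrogate key for this attribute combination,
    creating the mini-dim row the first time it is seen."""
    combo = (tier, age_band)
    if combo not in mini_dim:
        mini_dim[combo] = len(mini_dim) + 1
    return mini_dim[combo]
```

Because facts reference the combination rather than the customer row, a customer changing tier adds zero rows to the main dimension; the fact simply points at a different mini-dim key.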
Type 6: Hybrid (1+2+3)
Type 6 combines Type 1, Type 2, and Type 3 into one dimension. You version rows like Type 2 for history, keep a current_value column updated on every row like Type 1 for fast current queries, and add a prior_value column like Type 3 for the previous state. Maximum flexibility at maximum complexity — use it only when you genuinely need all three behaviors on the same row.
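All three behaviors in one change function, as a hedged sketch — the `scd6_change` helper and the `current_<attr>`/`prior_<attr>` column naming are illustrative conventions, not a standard API:

```python
def scd6_change(rows: list, business_key, attr: str, new_value,
                today: str, next_key: int) -> None:
    """Type 6: version like Type 2, keep prior_<attr> like Type 3, and
    back-propagate current_<attr> to every row like Type 1."""
    current = next(r for r in rows
                   if r["business_key"] == business_key and r["is_current"])
    # Type 2: close the old version, open a new one.
    current["end_date"] = today
    current["is_current"] = False
    rows.append({**current,
                 "surrogate_key": next_key,
                 attr: new_value,
                 f"prior_{attr}": current[attr],      # Type 3
                 "start_date": today, "end_date": None, "is_current": True})
    # Type 1: every historical row also shows the latest value.
    for r in rows:
        if r["business_key"] == business_key:
            r[f"current_{attr}"] = new_value
```

The Type 1 back-propagation is what makes Type 6 expensive: every change rewrites all historical rows for that business key.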
Summary Table
| Type | Writes | History | Best For |
|---|---|---|---|
| 0 | Never | N/A | Immutable attributes |
| 1 | Overwrite | None | Corrections |
| 2 | Insert new row | Full | Audit-sensitive dimensions |
| 3 | Update prior column | One step | Simple lookback |
| 4 | Main dim + history table | Full | Performance-sensitive current queries |
| 5 | Main + mini-dim | Full (in mini) | Rapid changes |
| 6 | All of the above | Full | Maximum flexibility |
Picking the Right Type
Most real warehouses use Type 1 for roughly 20 percent of attributes and Type 2 for nearly all the rest, with the exotic types appearing only in edge cases. See slowly changing dimensions for the foundational intro and data vault vs kimball for the alternative approach.
The decision tree is: does anyone need history on this attribute? No → Type 1. Yes → Type 2 unless it changes so often that row explosion is a concern, in which case Type 5. Everything else is an edge case. Do not let SCD types become a design debate — pick Type 1 or Type 2 per attribute and move on to more important architectural decisions.
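The decision tree above is small enough to write down directly. A sketch, with the function name and inputs as illustrative assumptions:

```python
def pick_scd_type(needs_history: bool, changes_frequently: bool) -> int:
    """The per-attribute decision tree from the text: no history needed
    -> Type 1; history needed -> Type 2, unless the attribute changes so
    often that row explosion argues for a Type 5 mini-dimension."""
    if not needs_history:
        return 1
    return 5 if changes_frequently else 2
```

If your answer for an attribute is not 1, 2, or 5, treat that as a prompt to justify the complexity, not as a default.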
Testing SCDs
Every Type 2 dimension should have tests that verify exactly one current row per business key, no overlapping valid date ranges, and monotonically increasing row counts over time. Without these tests, a bad pipeline run can corrupt history silently and you will not notice until an auditor asks for something. Elementary, dbt-expectations, and Soda all ship templates for these checks.
The row count test is the most important single SCD test. If your Type 2 dim has 10 million rows today and only 7 million tomorrow, something deleted history — and the only way to catch it is a trend test on raw rowcount. This single test catches most catastrophic SCD regressions that slipped past every other check.
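The three checks can be sketched in plain Python; in production they run as SQL tests via tools like Elementary, dbt-expectations, or Soda. The `check_scd2` helper and its column names (ISO-string dates, `is_current`) are illustrative assumptions:

```python
from collections import defaultdict

def check_scd2(rows: list, previous_rowcount: int) -> list:
    """Return failure messages for the three core Type 2 checks:
    exactly one current row per business key, no overlapping valid
    ranges, and a row count that never shrinks."""
    failures = []
    by_key = defaultdict(list)
    for r in rows:
        by_key[r["business_key"]].append(r)
    for key, versions in by_key.items():
        if sum(1 for v in versions if v["is_current"]) != 1:
            failures.append(f"{key}: not exactly one current row")
        spans = sorted(((v["start_date"], v["end_date"]) for v in versions),
                       key=lambda p: p[0])
        for (s1, e1), (s2, e2) in zip(spans, spans[1:]):
            # An earlier version that is still open, or a later version
            # starting before the earlier one closed, is an overlap.
            if e1 is None or s2 < e1:
                failures.append(f"{key}: overlapping valid ranges")
    if len(rows) < previous_rowcount:
        failures.append("row count shrank: history was deleted")
    return failures
```

The `previous_rowcount` comparison is the trend test from the paragraph above: it is the only check here that catches wholesale history deletion.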
Agent-Managed SCDs
Data Workers' catalog and migration agents can detect when a dimension needs history and generate dbt snapshots automatically. See autonomous data engineering or book a demo.
The seven SCD types are a toolbox, not a menu. Start with Type 1 for corrections and Type 2 for meaningful history, and reach for the exotic types only when the standard patterns stop working. Most warehouses will never need Types 4, 5, or 6 — do not overthink it.
See Data Workers in action
15 autonomous AI agents working across your entire data stack. MCP-native, open-source, deployed in minutes.
Book a Demo
Related Resources
- Slowly Changing Dimensions: The Complete Guide to SCD Types — Foundational guide to slowly changing dimensions: all six types, when to use each, and how dbt snapshots automate the pattern.
- Data Quality Dimensions: The DAMA Framework Explained — Guide to the six DAMA data quality dimensions, how to measure each, and how autonomous agents automate the scoring.
- Why AI Agents Need MCP Servers for Data Engineering — MCP servers give AI agents structured access to your data tools — Snowflake, BigQuery, dbt, Airflow, and more. Here is why MCP is the int…
- The Complete Guide to Agentic Data Engineering with MCP — Agentic data engineering replaces manual pipeline management with autonomous AI agents. Here is how to implement it with MCP — without lo…
- How AI Agents Cut Snowflake Costs by 40% Without Manual Tuning — Most Snowflake environments waste 30-40% of compute on zombie tables, oversized warehouses, and unoptimized queries. AI agents find and f…
- RBAC for Data Engineering Teams: Why Manual Access Control Doesn't Scale — Manual RBAC breaks down at 50+ data assets. Policy drift, orphaned permissions, and PII exposure become inevitable. AI agents enforce gov…
- From Alert to Resolution in Minutes: How AI Agents Debug Data Pipeline Incidents — The average data pipeline incident takes 4-8 hours to resolve. AI agents that understand your full data context can auto-diagnose and res…
- Build Data Pipelines with AI: From Description to Deployment in Minutes — Building a data pipeline still takes 2-6 weeks of engineering time. AI agents that understand your data context can generate, test, and d…
- Why Your Data Catalog Is Always Out of Date (And How AI Agents Fix It) — 40-60% of data catalog entries are outdated at any given time. AI agents that continuously scan, classify, and update metadata make the s…
- MLOps in 2026: Why Teams Are Moving from Tools to AI Agents — The average ML team uses 5-7 MLOps tools. AI agents that manage the full ML lifecycle — from experiment tracking to model deployment — ar…
- Why Text-to-SQL Accuracy Drops from 85% to 20% in Production (And How to Fix It) — Text-to-SQL tools score 85% on benchmarks but drop to 10-20% accuracy on real enterprise schemas. The fix is not better models — it is a…
- Data Migration Automation: How AI Agents Reduce 18-Month Timelines to Weeks — Enterprise data migrations take 6-18 months because schema mapping, data validation, and downtime coordination are manual. AI agents comp…
Explore Topic Clusters
- Data Governance: The Complete Guide — Policies, access controls, PII, and compliance at scale.
- Data Catalog: The Complete Guide — Discovery, metadata, lineage, and the modern catalog stack.
- Data Lineage: The Complete Guide — Column-level lineage, impact analysis, and observability.
- Data Quality: The Complete Guide — Tests, SLAs, anomaly detection, and data reliability engineering.
- AI Data Engineering: The Complete Guide — LLMs, agents, and autonomous workflows across the data stack.
- MCP for Data: The Complete Guide — Model Context Protocol servers, tools, and agent integration.
- Data Mesh & Data Fabric: The Complete Guide — Federated ownership, domain-oriented architecture, and interop.
- Open-Source Data Stack: The Complete Guide — dbt, Airflow, Iceberg, DuckDB, and the modern OSS toolkit.