
Data Quality Dimensions: The DAMA Framework Explained


Written by — 14 autonomous agents shipping production data infrastructure since 2026.

Technically reviewed by the Data Workers engineering team.


Data quality dimensions are the categories used to measure whether data is fit for purpose — typically accuracy, completeness, consistency, timeliness, uniqueness, and validity. Frameworks like DAMA-DMBOK and ISO 8000 codify these dimensions so teams can score datasets, set SLAs, and prioritize remediation work instead of arguing about what 'quality' means.

This guide walks through the six core dimensions, how to measure each one with concrete metrics, and how tools like Great Expectations, Soda, and autonomous agents automate the scoring end-to-end so you can turn vague stakeholder complaints into trend lines with owners.

What Are Data Quality Dimensions?

Data quality dimensions are measurable attributes of a dataset — each one captures a different way data can be wrong or unfit for use. DAMA-DMBOK lists the canonical six (accuracy, completeness, consistency, timeliness, uniqueness, validity) and some frameworks add integrity, conformity, and reasonableness for finer granularity. The exact list is less important than the commitment to measure instead of argue.

The point of dimensions is measurability. Instead of 'this table has quality issues,' you say 'this table is 98 percent complete, 96 percent unique on the primary key, and stale by 4 hours against the SLA of 1 hour.' That lets you set targets, track trends, and decide whether to ship. Quality becomes something you can grade, not just feel.
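As a sketch of what "grade, not feel" looks like in code, the snippet below computes completeness, uniqueness, and staleness for a small pandas DataFrame. The table and column names (`orders`, `customer_id`, `ingested_at`) are illustrative, not from any specific system.

```python
import pandas as pd

# Hypothetical orders table; all names and values are illustrative.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "customer_id": [10, None, 20, 30],
    "ingested_at": pd.to_datetime(
        ["2026-01-01 08:00", "2026-01-01 08:00",
         "2026-01-01 08:00", "2026-01-01 12:00"]),
})

completeness = orders["customer_id"].notna().mean()      # populated / total
uniqueness = orders["order_id"].nunique() / len(orders)  # distinct / total

# Staleness against a freshness SLA, measured from a fixed "now" for the demo.
now = pd.Timestamp("2026-01-01 13:00")
staleness_hours = (now - orders["ingested_at"].max()).total_seconds() / 3600

print(f"completeness={completeness:.0%} uniqueness={uniqueness:.0%} "
      f"staleness={staleness_hours:.1f}h")
```

Each number maps directly to a dimension score you can trend over time and compare against a target.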

The Six Core Dimensions

| Dimension    | Question                        | How to Measure             | Example Rule                    |
|--------------|---------------------------------|----------------------------|---------------------------------|
| Accuracy     | Does the value match reality?   | Compare to source of truth | Customer email valid in CRM     |
| Completeness | Are required fields populated?  | Null count / row count     | orders.customer_id not null     |
| Consistency  | Do related values agree?        | Cross-table checks         | order_total = sum(line_items)   |
| Timeliness   | Is it fresh enough to use?      | Time since last update     | < 1 hour since ingest           |
| Uniqueness   | Are duplicates present?         | Distinct rows / total rows | unique(user_id)                 |
| Validity     | Does it match the schema/format?| Regex, type, range checks  | Phone matches E.164             |

Accuracy vs Validity (the Most Confused Pair)

Accuracy asks whether the value is correct in the real world. Validity asks whether it matches the expected format or schema. A phone number like +15551234567 can be perfectly valid (matches E.164) and still inaccurate (customer never had that number). Validity is cheap to automate; accuracy usually requires a trusted reference system and is the hardest dimension to measure at scale.
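The contrast can be made concrete in a few lines: validity is a local format check, while accuracy requires comparing against a trusted reference. The CRM lookup table below is a stand-in assumption for whatever system of record you trust.

```python
import re

# E.164: a "+" followed by up to 15 digits, first digit non-zero.
E164 = re.compile(r"^\+[1-9]\d{1,14}$")

def is_valid(phone: str) -> bool:
    """Validity: does the string match the expected format?"""
    return bool(E164.match(phone))

# Accuracy needs a source of truth; this dict stands in for a CRM lookup.
CRM_PHONES = {"cust_42": "+15559876543"}

def is_accurate(customer_id: str, phone: str) -> bool:
    """Accuracy: does the value agree with the trusted reference?"""
    return CRM_PHONES.get(customer_id) == phone

phone = "+15551234567"
print(is_valid(phone))                # True: well-formed E.164
print(is_accurate("cust_42", phone))  # False: the CRM disagrees
```

The validity check runs anywhere at near-zero cost; the accuracy check is only as good as the reference system behind it.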

Accuracy problems are usually what stakeholders complain about when they say 'the numbers are wrong.' Validity problems are usually what catches them during ingest. Both matter, but resist the urge to call them the same thing — the remediation approaches are totally different.

Completeness and Timeliness

Completeness is the easiest dimension to measure — just count nulls against expected values. Timeliness is the easiest to automate with freshness monitors on your ingest pipelines. Both are the fastest wins for a new data quality program because failures are obvious and remediation is usually a pipeline fix, not a business process change. Start here when you are building your first quality scorecard.

The nuance on completeness is that nullability rules should vary by column. Optional fields are supposed to be null; required fields must not be. A blanket 'no nulls anywhere' rule generates false alarms and teaches analysts to ignore alerts — the classic quality-program death spiral.
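One way to encode per-column nullability, sketched below with illustrative column names: declare which columns are required and only count nulls in those, so optional fields never trigger alerts.

```python
import pandas as pd

# Required vs optional columns are declared per table; names are illustrative.
REQUIRED = ["order_id", "customer_id"]
OPTIONAL = ["coupon_code"]  # expected to be null most of the time

orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "customer_id": [10, None, 30],
    "coupon_code": [None, None, "SAVE10"],
})

# Only required columns can fail the completeness rule.
failures = {
    col: int(orders[col].isna().sum())
    for col in REQUIRED
    if orders[col].isna().any()
}
print(failures)  # nulls in coupon_code are ignored by design
```

The blanket "no nulls anywhere" rule would have flagged `coupon_code` twice here, which is exactly the false-alarm noise the paragraph above warns about.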

Consistency and Uniqueness

Consistency catches contradictions: an order total that does not equal the sum of its line items, a parent table count that disagrees with its child, a conformed attribute in a dimension table that differs from its source. Only cross-table checks can catch these; single-table rules will never surface them. They are also the rules that prevent the most stakeholder-facing disasters, because contradictions between systems are what trigger executive-level loss of trust.
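A minimal cross-table check, using the order-total example with made-up data: aggregate the child table to the parent's grain, join, and compare.

```python
import pandas as pd

# Illustrative parent/child tables; order 2's line items are short by 5.0.
orders = pd.DataFrame({"order_id": [1, 2], "order_total": [30.0, 15.0]})
line_items = pd.DataFrame({
    "order_id": [1, 1, 2],
    "amount": [10.0, 20.0, 10.0],
})

# Aggregate the child to the parent's grain, then compare side by side.
sums = (line_items.groupby("order_id")["amount"]
        .sum().reset_index(name="line_sum"))
joined = orders.merge(sums, on="order_id")

# Real money comparisons should use a tolerance or decimal types,
# not float equality; exact comparison is fine for this demo.
mismatches = joined[joined["order_total"] != joined["line_sum"]]
print(mismatches[["order_id", "order_total", "line_sum"]])
```

The same pattern (aggregate, join, compare) generalizes to parent/child row counts and conformed-attribute reconciliation.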

Uniqueness is the primary-key guarantee. Every dimension should have a unique test, every fact should have a grain test, and any time a join returns more rows than expected it is almost always a uniqueness violation upstream. Unique tests should run on every pipeline, and composite unique tests (multiple columns) are just as important as single-column tests when your grain is a combination.
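A composite grain test can be as small as the sketch below; the fact table and its `(order_id, product_id)` grain are hypothetical.

```python
import pandas as pd

# Hypothetical fact table whose grain is (order_id, product_id).
fact = pd.DataFrame({
    "order_id":   [1, 1, 2, 2],
    "product_id": ["A", "B", "A", "A"],  # (2, "A") appears twice
})

grain = ["order_id", "product_id"]
# duplicated() with the default keep="first" counts only the extra rows.
dupe_count = int(fact.duplicated(subset=grain).sum())
print(f"{dupe_count} duplicate row(s) beyond the declared grain")
```

In a pipeline this would fail the run when `dupe_count` is non-zero, stopping the fan-out joins downstream that multiply rows.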

Measuring Quality in Practice

Pick a few tables that power critical dashboards, define rules for each relevant dimension, run them on every pipeline run, and publish a scorecard. Tools like Great Expectations, Soda, and dbt tests give you the rule engine; autonomous quality agents can suggest new rules from profiling data.

  • Start with one scorecard — five tables, six dimensions, one owner
  • Define SLAs — target score per dimension per table
  • Alert on regression — break the build when a dimension score drops
  • Publish publicly — put the scorecard in the catalog so consumers see it
  • Iterate weekly — add rules as failures surface
  • Kill noisy rules — alerts that fire every day teach people to ignore them
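The "alert on regression" step can be a few lines of CI glue. The tables, dimension scores, and SLA targets below are illustrative stand-ins for whatever your rule engine emits.

```python
# SLA targets and latest scores per (table, dimension); values are made up.
SLAS = {("orders", "completeness"): 0.99, ("orders", "uniqueness"): 1.0}
SCORES = {("orders", "completeness"): 0.98, ("orders", "uniqueness"): 1.0}

# A regression is any dimension score below its target.
regressions = [
    (table, dim, score, SLAS[(table, dim)])
    for (table, dim), score in SCORES.items()
    if score < SLAS[(table, dim)]
]
for table, dim, score, target in regressions:
    print(f"FAIL {table}.{dim}: {score:.0%} < target {target:.0%}")

# A non-zero exit code here is what breaks the build in CI.
exit_code = 1 if regressions else 0
print(f"exit_code={exit_code}")
```

Breaking the build on a score drop, rather than on every individual rule failure, is one way to keep the alert volume low enough that people keep reading the alerts.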

Organizational Ownership

Quality scorecards only work if someone owns them. The DAMA framework assigns quality responsibility to data stewards — domain experts who understand what the numbers should look like and have authority to fix upstream issues. Without named owners, scorecards become orphaned dashboards nobody watches. Assign an owner to every tier-1 table as part of the governance rollout, and make scorecard review a monthly meeting, not a quarterly exercise.

The stewardship model also defines escalation. When a rule fails and the steward cannot fix it, who owns the remediation? The upstream data producer? The pipeline engineer? The consuming analyst? Writing down the answer up front prevents the 'nobody owns it' state where bad data sits in production for weeks because everyone thinks someone else is handling it.

Automating Quality With Agents

Data Workers' quality agent profiles every table on ingest, infers rules for each dimension, and escalates anomalies automatically. Pipeline agents hold bad data in quarantine until rules pass; governance agents report scorecards to stakeholders. See how autonomous data engineering keeps quality dimensions green without manual rule writing, or book a demo.

Data quality dimensions turn vague complaints into measurable targets. Adopt the DAMA six, build a scorecard, and automate the rules so every pipeline run keeps score — that is how quality programs actually stick past the first three months of leadership attention.

See Data Workers in action

15 autonomous AI agents working across your entire data stack. MCP-native, open-source, deployed in minutes.

Book a Demo
