MCP Server Unity Catalog Metadata
Written by The Data Workers Team — 14 autonomous agents shipping production data infrastructure since 2026.
Technically reviewed by the Data Workers engineering team.
A Unity Catalog MCP server queries the Databricks metadata APIs via service principal, exposes catalog hierarchy, lineage, and tags through agent tools, and inherits Unity Catalog's native grant enforcement. Because Unity Catalog is the governance layer for Databricks, MCP here means delivering agent-friendly access to a metadata graph that already governs every query.
Unity Catalog is Databricks' unified governance layer for data and AI, and it covers catalog, schema, tables, views, volumes, models, and functions. Exposing it through MCP lets agents navigate the full lakehouse catalog — including ML models and unstructured data — with the same grants as humans. This guide covers the setup.
Why Unity Catalog Is Different
Most catalogs sit outside the data platform. Unity Catalog is the data platform's governance layer — it owns the grants that Databricks enforces on every query. That means an MCP server on Unity Catalog cannot leak what the service principal cannot access, because the Databricks runtime itself enforces the rules. It is the strongest governance story in the catalog space.
Unity Catalog also covers more than tables. Functions, models, volumes, and even AI/BI dashboards are catalog objects with lineage. An MCP server can expose the full hierarchy to agents, letting them discover ML models alongside tables without hopping between systems.
Service Principal Authentication
Unity Catalog uses Databricks service principals for automation. Create a service principal, grant it USE CATALOG, USE SCHEMA, and SELECT on the curated catalogs, and generate an OAuth secret for it. Load the client ID and secret into the MCP server via environment variables. The Databricks SDK exchanges them for short-lived access tokens and refreshes those tokens automatically.
- Service principal: dedicated for MCP
- USE CATALOG + SELECT: scoped grants
- OAuth tokens: short-lived, auto-refreshed
- Workspace binding: one workspace per SP
- Account console: enable UC metadata APIs
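The credential flow above can be sketched with the standard library alone. This is a minimal illustration, not the Databricks SDK's implementation: it assumes the environment variable names DATABRICKS_HOST, DATABRICKS_CLIENT_ID, and DATABRICKS_CLIENT_SECRET and the workspace-level `/oidc/v1/token` endpoint with the `all-apis` scope. The `Credentials` class and both function names are hypothetical.

```python
import base64
import urllib.parse
import urllib.request
from dataclasses import dataclass
from typing import Mapping

REQUIRED_VARS = ("DATABRICKS_HOST", "DATABRICKS_CLIENT_ID", "DATABRICKS_CLIENT_SECRET")


@dataclass(frozen=True)
class Credentials:
    """Service principal settings (hypothetical helper type)."""
    host: str
    client_id: str
    client_secret: str


def load_credentials(env: Mapping[str, str]) -> Credentials:
    """Read the service principal settings from an environment-like mapping."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Credentials(
        host=env["DATABRICKS_HOST"].rstrip("/"),
        client_id=env["DATABRICKS_CLIENT_ID"],
        client_secret=env["DATABRICKS_CLIENT_SECRET"],
    )


def build_token_request(creds: Credentials) -> urllib.request.Request:
    """Build (but do not send) an OAuth client-credentials token request."""
    basic = base64.b64encode(f"{creds.client_id}:{creds.client_secret}".encode()).decode()
    body = urllib.parse.urlencode(
        {"grant_type": "client_credentials", "scope": "all-apis"}
    ).encode()
    return urllib.request.Request(
        f"{creds.host}/oidc/v1/token",  # assumed workspace-level token endpoint
        data=body,
        headers={
            "Authorization": f"Basic {basic}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

In a real server, passing `os.environ` to `load_credentials` and sending the request with `urllib.request.urlopen` would return a JSON body containing the access token. In practice the SDK does this exchange and the refresh for you.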
Core MCP Tools
Expose six tools: listCatalogs, listSchemas, getTable, getLineage, getColumns, and getTags. The getLineage tool wraps Unity Catalog's lineage system tables, which capture table-level and column-level lineage automatically for queries that run on Unity Catalog-enabled compute. This gives agents lineage for free on any dbt or SQL workflow.
| Tool | Unity API | Purpose |
|---|---|---|
| listCatalogs | /api/2.1/unity-catalog/catalogs | Top-level enumeration |
| listSchemas | /api/2.1/unity-catalog/schemas | Schemas in a catalog |
| getTable | /api/2.1/unity-catalog/tables/{fqn} | Full table metadata |
| getLineage | system.access.table_lineage, system.access.column_lineage | Lineage system tables |
| getColumns | table.columns | Column metadata |
| getTags | /api/2.1/unity-catalog/tags | Tag graph |
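As a rough sketch of how the REST-backed rows in this table could be routed, the snippet below maps tool names to Unity Catalog URLs and derives getColumns from the `columns` array of a getTable response. The function names and argument names (`catalog`, `full_name`) are hypothetical. getLineage and getTags are left out because they are not plain GETs on these routes.

```python
from typing import Any
from urllib.parse import quote, urlencode

API_ROOT = "/api/2.1/unity-catalog"


def build_url(host: str, tool: str, **args: str) -> str:
    """Return the GET URL backing a metadata tool call."""
    host = host.rstrip("/")
    if tool == "listCatalogs":
        return f"{host}{API_ROOT}/catalogs"
    if tool == "listSchemas":
        # Schemas are listed per catalog via a query parameter.
        return f"{host}{API_ROOT}/schemas?" + urlencode({"catalog_name": args["catalog"]})
    if tool in ("getTable", "getColumns"):
        # getColumns reuses the table payload; see extract_columns below.
        return f"{host}{API_ROOT}/tables/" + quote(args["full_name"], safe="")
    raise ValueError(f"no REST route for tool {tool!r}")


def extract_columns(table: dict[str, Any]) -> list[dict[str, Any]]:
    """Reduce a getTable response to the column fields an agent usually needs."""
    return [
        {
            "name": col.get("name"),
            "type": col.get("type_text"),
            "nullable": col.get("nullable"),
            "comment": col.get("comment"),
        }
        for col in table.get("columns", [])
    ]
```

Keeping URL construction separate from the HTTP call makes the routing easy to unit test without a workspace.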
System Tables for Lineage
Unity Catalog's killer feature for MCP is its pair of lineage system tables: system.access.table_lineage records reads and writes between tables, and system.access.column_lineage records the same events at column-level granularity. The MCP server can query these tables directly to answer lineage questions without running a separate lineage service. They hold the same lineage that the Unity Catalog UI's lineage view displays, so agents see what humans see.
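A getLineage tool could send a parameterized query like the one below to a SQL warehouse. This sketch only builds the request payload, shaped for the SQL Statement Execution API (`POST /api/2.0/sql/statements`). It reads column-level edges from system.access.column_lineage, the companion to the table-level lineage table. The function name, the 30-day default, and the warehouse ID are assumptions for illustration.

```python
from typing import Any

COLUMN_LINEAGE_SQL = """
SELECT DISTINCT
  source_table_full_name,
  source_column_name,
  target_table_full_name,
  target_column_name
FROM system.access.column_lineage
WHERE target_table_full_name = :table
  AND event_date >= date_sub(current_date(), :days)
""".strip()


def build_lineage_statement(table: str, warehouse_id: str, days: int = 30) -> dict[str, Any]:
    """Build a statement payload for the upstream columns of one table."""
    if table.count(".") != 2:
        raise ValueError("expected a three-part name: catalog.schema.table")
    return {
        "warehouse_id": warehouse_id,
        "statement": COLUMN_LINEAGE_SQL,
        # Named parameters keep the agent-supplied table name out of the SQL text.
        "parameters": [
            {"name": "table", "value": table, "type": "STRING"},
            {"name": "days", "value": str(days), "type": "INT"},
        ],
    }
```

Binding the table name as a parameter rather than formatting it into the string matters here, because the value comes from an agent and should not be trusted as SQL.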
Tag-Based Filtering
Unity Catalog supports tag-based policies — you can attach a pii=true tag to a table or column and write a policy that restricts access. The MCP server inherits this enforcement automatically because every query goes through Unity Catalog. The agent never sees PII it is not authorized to see, without any MCP-layer filtering logic.
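Although enforcement needs no MCP-layer logic, agents still benefit from seeing which tags exist. One possible backing for a getTags tool, offered as an alternative rather than the approach this article prescribes, is to read the `system.information_schema.table_tags` view and group the rows by table. The `group_tags` helper and the row shape are hypothetical.

```python
from collections import defaultdict
from typing import Iterable

TABLE_TAGS_SQL = """
SELECT catalog_name, schema_name, table_name, tag_name, tag_value
FROM system.information_schema.table_tags
WHERE catalog_name = :catalog
""".strip()

# One result row: (catalog, schema, table, tag name, tag value).
TagRow = tuple[str, str, str, str, str]


def group_tags(rows: Iterable[TagRow]) -> dict[str, dict[str, str]]:
    """Group tag rows into {fully qualified table name: {tag name: tag value}}."""
    grouped: dict[str, dict[str, str]] = defaultdict(dict)
    for catalog, schema, table, tag_name, tag_value in rows:
        grouped[f"{catalog}.{schema}.{table}"][tag_name] = tag_value
    return dict(grouped)
```

For example, rows tagging `main.sales.customers` with `pii=true` and `owner=crm` collapse into a single entry the agent can inspect before deciding whether to request the table at all.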
Data Workers on Unity Catalog
Data Workers' Unity Catalog connector authenticates via service principal, exposes the six tools above, and queries the lineage system tables for column-level walks. It federates Unity Catalog with other catalogs through the unified catalog interface. See AI for data infrastructure or read MCP server Databricks Unity Catalog for the query execution side.
To see Unity Catalog MCP with lineage and tag enforcement on a real lakehouse, book a demo. We will walk through service principal setup, lineage queries, and tag policies.
Unity Catalog's support for ML models and feature tables makes it especially valuable for MCP. The catalog records model versions, training data lineage, and serving metadata, so an agent can answer questions about the ML stack using the same MCP interface it uses for data questions. This unified view is rare — most catalogs treat ML as a separate tool, and most feature stores have no catalog integration at all.
The system.access.audit table is another Unity Catalog system table worth exposing via MCP. It records access events across the workspace, including service principal actions, so the agent can answer "who queried this table today?" without running against external logs. For compliance and forensics, this is a killer capability: the audit trail lives in the same query engine the agent already uses.
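A question like that could translate into a query along the following lines. Treat it as a sketch to verify against your own audit data: it assumes Unity Catalog events carry the table name under `request_params.full_name_arg`, which depends on the action being logged. The `summarize_access` helper and its row shape are hypothetical.

```python
from typing import Iterable

AUDIT_SQL = """
SELECT user_identity.email AS principal, action_name, count(*) AS events
FROM system.access.audit
WHERE event_date = current_date()
  AND service_name = 'unityCatalog'
  AND request_params.full_name_arg = :table  -- assumed key; verify per action
GROUP BY user_identity.email, action_name
""".strip()


def summarize_access(rows: Iterable[tuple[str, str, int]]) -> list[str]:
    """Turn (principal, action, events) rows into lines, busiest principal first."""
    totals: dict[str, int] = {}
    for principal, _action, events in rows:
        totals[principal] = totals.get(principal, 0) + events
    # Sort by event count descending, then by name for a stable order.
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return [f"{principal}: {count} events" for principal, count in ranked]
```

Returning a short ranked summary instead of raw audit rows keeps the tool response small enough for an agent's context window.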
Unity Catalog also supports materialized views and streaming tables, both of which can be exposed through the same MCP tools used for regular tables. The agent does not need to know whether a table is streaming or batch; it queries the catalog and gets the freshest data. This transparency is one of the reasons Unity Catalog plus MCP is particularly clean — there are fewer concepts for the agent to track.
Unity Catalog is the tightest catalog-plus-execution integration in the market, and MCP is the cleanest way to give agents access. Service principal auth, six core tools, and system table lineage make for a production-grade agent interface.