
MCP Server for OpenMetadata Lineage

Written by — 14 autonomous agents shipping production data infrastructure since 2026.

Technically reviewed by the Data Workers engineering team.


OpenMetadata exposes a REST API that an MCP server can wrap to give agents search, entity lookups, and end-to-end lineage walks across warehouses, dashboards, and pipelines. The key moves are authenticating with a JWT service token, wrapping the lineage API behind a simple tool, and filtering out entities the agent should not see.

OpenMetadata is an open-source metadata platform with strong lineage support — it tracks column-level lineage across dbt, Airflow, Snowflake, BigQuery, and more. This guide covers how to expose it through MCP so agents can answer lineage questions without leaving the chat.

Why OpenMetadata Shines for Lineage

OpenMetadata's biggest differentiator is column-level lineage across tools. It ingests metadata from warehouses, BI tools, and orchestrators, then builds a unified graph where you can trace a column from a dashboard back to the raw source table through every dbt transformation. Exposing that graph to agents via MCP lets them answer debugging and impact-analysis questions, such as which dashboards depend on a given source column, directly from the catalog.

The alternative is asking the agent to reconstruct lineage from code — slow, error-prone, and often impossible for no-code pipelines. OpenMetadata already did the work; MCP just makes it accessible.

JWT Authentication

OpenMetadata supports JWT-based service accounts. Create a bot user in the admin UI, generate a JWT, and load it into the MCP server as OPENMETADATA_TOKEN. The bot should have the ViewAll role — no edit or delete permissions. Rotate the JWT quarterly and keep it in a secrets manager.

  • Bot user — not a human account
  • ViewAll role — read-only across entities
  • JWT in env var — loaded from secrets manager
  • HTTPS only — TLS to the OM API
  • Rate limited — honor API quotas
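The checklist above can be sketched as a small configuration helper. This is a minimal illustration, not a definitive implementation: OPENMETADATA_TOKEN comes from the text, while OPENMETADATA_URL and the function name build_client_config are hypothetical names chosen for this example.

```python
import os
from urllib.parse import urlparse


def build_client_config(env=os.environ):
    """Return the base URL and auth headers for read-only OpenMetadata calls."""
    token = env.get("OPENMETADATA_TOKEN")
    if not token:
        raise RuntimeError("OPENMETADATA_TOKEN is not set")

    # OPENMETADATA_URL is a hypothetical variable name used for this sketch.
    base_url = env.get("OPENMETADATA_URL", "")
    if urlparse(base_url).scheme != "https":
        raise ValueError("OpenMetadata base URL must use HTTPS")

    return {
        "base_url": base_url.rstrip("/"),
        "headers": {
            # The bot's JWT is sent as a standard bearer token.
            "Authorization": f"Bearer {token}",
            "Accept": "application/json",
        },
    }
```

Failing fast on a missing token or a non-HTTPS URL keeps a misconfigured server from starting at all, rather than failing on the first agent request.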

Core MCP Tools

A useful OpenMetadata MCP server exposes a handful of tools: searchEntities, getTable, getLineage, getColumnLineage, getGlossaryTerm, and getOwners. Each maps to an OpenMetadata REST endpoint and returns a trimmed response. The column lineage tool is the most distinctive — it returns the full upstream chain for a single column across tools.

| Tool | REST Endpoint | Purpose |
| --- | --- | --- |
| searchEntities | /api/v1/search/query | Keyword search |
| getTable | /api/v1/tables/{fqn} | Full table metadata |
| getLineage | /api/v1/lineage/{entity} | Entity lineage graph |
| getColumnLineage | /api/v1/lineage/getLineageEdge | Column-level trace |
| getGlossaryTerm | /api/v1/glossaryTerms/{id} | Business definition |
| getOwners | /api/v1/tables/{fqn}/owners | Contact info |
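One way to wire the table above into a server is a lookup from tool name to path template, plus a trimming step for responses. The sketch below is illustrative only: the path templates are copied from the table as written and should be checked against the API version you run, and the field names in KEEP_TABLE_FIELDS and KEEP_COLUMN_FIELDS are assumptions about the response shape.

```python
from urllib.parse import quote

# Tool name -> path template, copied from the table above.
TOOL_ENDPOINTS = {
    "searchEntities": "/api/v1/search/query",
    "getTable": "/api/v1/tables/{fqn}",
    "getLineage": "/api/v1/lineage/{entity}",
    "getColumnLineage": "/api/v1/lineage/getLineageEdge",
    "getGlossaryTerm": "/api/v1/glossaryTerms/{id}",
    "getOwners": "/api/v1/tables/{fqn}/owners",
}

# Assumed field names; adjust to the payload your instance returns.
KEEP_TABLE_FIELDS = ("fullyQualifiedName", "description", "tags")
KEEP_COLUMN_FIELDS = ("name", "dataType", "description")


def resolve_path(tool, **params):
    """Fill a tool's path template, URL-encoding each parameter."""
    encoded = {key: quote(str(value), safe="") for key, value in params.items()}
    return TOOL_ENDPOINTS[tool].format(**encoded)


def trim_table(raw):
    """Keep only the fields an agent needs, dropping IDs and change history."""
    trimmed = {key: raw[key] for key in KEEP_TABLE_FIELDS if key in raw}
    trimmed["columns"] = [
        {key: column[key] for key in KEEP_COLUMN_FIELDS if key in column}
        for column in raw.get("columns", [])
    ]
    return trimmed
```

Trimming before the response reaches the agent keeps prompts small and avoids leaking internal identifiers the agent has no use for.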

Column-Level Lineage Walks

Column-level lineage is the power feature. When an agent is asked "Where does the total_revenue column in the exec dashboard come from?", the MCP server calls getColumnLineage and walks upstream through every dbt model and SQL transformation until it hits the source system. The response is a graph of nodes (columns) and edges (transformations) that the agent can summarize for the user.
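The upstream walk itself is a plain graph traversal. The sketch below assumes the lineage edges have already been fetched and flattened into (source_column, target_column, transformation) tuples; that shape, the column names in the usage note, and the function name walk_upstream are all hypothetical simplifications, not the OpenMetadata response format.

```python
from collections import deque


def walk_upstream(edges, start):
    """Breadth-first walk from `start` to every upstream column.

    `edges` is a list of (source_column, target_column, transformation)
    tuples, a simplified stand-in for whatever the lineage API returns.
    """
    # Index edges by target so each step looks one hop upstream.
    incoming = {}
    for source, target, transform in edges:
        incoming.setdefault(target, []).append((source, transform))

    seen = {start}
    queue = deque([start])
    walked = []
    while queue:
        column = queue.popleft()
        for source, transform in incoming.get(column, []):
            walked.append({"from": source, "to": column, "via": transform})
            if source not in seen:  # guards against cycles in the graph
                seen.add(source)
                queue.append(source)

    # Source columns are visited columns that nothing feeds into.
    sources = sorted(column for column in seen if column not in incoming)
    return {"nodes": sorted(seen), "edges": walked, "sources": sources}
```

Called with a start column such as "exec_dashboard.total_revenue", the function returns only the nodes and edges reachable upstream of it, which is the compact graph the agent then summarizes.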

Filtering Sensitive Entities

OpenMetadata supports tags, and tags often encode PII or sensitivity levels. The MCP server should strip entities tagged PII.Sensitive from search results unless the agent is explicitly authorized. This keeps sensitive context out of the prompt and enforces governance at the MCP layer rather than downstream.
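A minimal sketch of that filter follows. It assumes each entity carries a tags list of objects with a tagFQN key and, optionally, a columns list with the same structure. Checking column-level tags as well as entity-level tags is a design choice of this example that goes slightly beyond the text, and the authorized flag stands in for whatever authorization check your server performs.

```python
SENSITIVE_TAGS = {"PII.Sensitive"}


def entity_tags(entity):
    """Collect tag FQNs from the entity itself and from its columns."""
    labels = list(entity.get("tags") or [])
    for column in entity.get("columns") or []:
        labels.extend(column.get("tags") or [])
    return {label.get("tagFQN") for label in labels}


def filter_sensitive(entities, authorized=False):
    """Drop entities carrying a sensitive tag unless the caller is authorized."""
    if authorized:
        return list(entities)
    return [e for e in entities if not (entity_tags(e) & SENSITIVE_TAGS)]
```

Applying the filter inside the MCP server means a sensitive entity never enters the prompt, regardless of how the agent phrases its search.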

Observability

Log every MCP tool call with the bot user, the tool name, the arguments, and the response size. Join this with OpenMetadata's own audit log to reconstruct agent activity. A surprising amount of insight comes from noticing which entities the agent asks about — it reveals gaps in the catalog documentation.
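The logging described above can be sketched as a decorator that wraps each tool function. The logger name, the bot user name in the usage note, and the decorator name logged_tool are hypothetical, and the sketch assumes tools are called with keyword arguments and return JSON-serializable data.

```python
import functools
import json
import logging

logger = logging.getLogger("mcp.openmetadata")


def logged_tool(bot_user):
    """Log bot user, tool name, arguments, and response size for each call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(**arguments):
            result = func(**arguments)
            payload = json.dumps(result, default=str)
            logger.info(json.dumps({
                "bot_user": bot_user,
                "tool": func.__name__,
                "arguments": arguments,
                "response_bytes": len(payload.encode("utf-8")),
            }, default=str))
            return result
        return wrapper
    return decorator
```

Decorating a tool with `@logged_tool(bot_user="mcp-lineage-bot")` emits one JSON line per call, which makes it straightforward to join against OpenMetadata's audit log on the bot user and timestamp.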

Data Workers on OpenMetadata

Data Workers' OpenMetadata connector handles JWT auth, exposes the column-lineage tool, and enforces tag-based filtering. It can federate with DataHub, Unity Catalog, and Atlan via the unified catalog interface. See AI for data infrastructure for the full agent stack, or read MCP server DataHub metadata for a comparison.

To see an OpenMetadata MCP server walking column lineage live, book a demo. We will show a column-to-source trace across dbt, Airflow, and a warehouse.

OpenMetadata's data quality features are another area where MCP adds value. The platform tracks test suite results, data profiler output, and quality scores at the table and column level. An MCP tool that surfaces these signals lets the agent reason about data quality before citing a table in an answer — a kind of automated sanity check that prevents the agent from confidently citing broken data.
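As a rough illustration of that sanity check, the function below reduces a list of test results to a single "safe to cite" decision. The input shape (dicts with name and status keys), the status value "Success", and the function name quality_gate are assumptions made for this sketch, so map them to whatever your quality tool actually returns.

```python
def quality_gate(test_results, max_failures=0):
    """Summarize test outcomes and decide whether a table is safe to cite.

    `test_results` is a list of dicts with `name` and `status` keys; any
    status other than "Success" is treated as a failure in this sketch.
    """
    failed = [r.get("name") for r in test_results if r.get("status") != "Success"]
    return {
        "total": len(test_results),
        "failed": failed,
        # A table with no test results is treated as unverified, not as safe.
        "safe_to_cite": len(test_results) > 0 and len(failed) <= max_failures,
    }
```

Returning the list of failed tests alongside the verdict lets the agent explain why it declined to cite a table instead of silently skipping it.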

The platform's conversational threads on entities are also underused. OpenMetadata lets users post comments, questions, and announcements on datasets, and those threads hold tribal knowledge that schemas and docs do not capture. Exposing threads via an MCP getDiscussion tool gives the agent access to the ongoing conversation about a dataset: the "this column is wrong on weekends" and "use the v2 table instead" notes that humans leave for each other.

OpenMetadata also supports entity versioning, so you can see how a table's schema and documentation have changed over time. For governance-sensitive use cases, the MCP server can expose a getHistory tool that returns the change history. This lets the agent answer questions like "when did this column appear?" or "when was this definition updated?" that would otherwise require digging through version control.
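Answering the "when did this column appear?" question reduces to scanning versions in time order. The sketch below assumes the change history has already been parsed into a list of table snapshots, each with an updatedAt timestamp and a columns list. That shape and the function name first_seen are hypothetical and will likely differ from the raw API payload.

```python
def first_seen(versions, column_name):
    """Return the timestamp of the earliest version containing the column.

    `versions` is a list of table snapshots, each a dict with `updatedAt`
    (a sortable timestamp) and `columns` (a list of dicts with `name`).
    Returns None if the column never appears.
    """
    for snapshot in sorted(versions, key=lambda v: v["updatedAt"]):
        names = {column["name"] for column in snapshot.get("columns", [])}
        if column_name in names:
            return snapshot["updatedAt"]
    return None
```

Sorting inside the function means callers do not need to rely on the order in which versions were returned.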

OpenMetadata is the best open-source option for column-level lineage, and MCP is the right way to expose that lineage to agents. Bot user auth, six core tools, and tag-based filtering give you a production-grade metadata interface in an afternoon.

