MCP Server Business Glossary Exposure
Written by The Data Workers Team — 14 autonomous agents shipping production data infrastructure since 2026.
Technically reviewed by the Data Workers engineering team.
A business glossary MCP server exposes company-specific term definitions (MRR, ARR, NRR, DAU, churn) so agents answer with the approved meanings instead of generic textbook definitions. This fixes one of the most common failure modes in LLM-powered analytics: an agent that confidently reports an MRR of $5M using a calculation that does not match the CFO's definition.
Every company has its own slang and metric definitions. Two companies that both report MRR often compute it differently — whether to include trials, annual contracts paid monthly, or usage overages. A business glossary records the approved definition, and MCP makes it available to agents in real time.
Why the Glossary Matters More Than the Schema
Agents can often guess the schema from table names. What they cannot guess is the company-specific meaning of a metric. Churn rate in SaaS can mean logo churn, revenue churn, net churn, or gross churn — and the CFO cares about exactly one. A glossary MCP gives the agent the exact formula so it stops guessing.
This is the highest-value integration for agents that answer analytics questions. Without a glossary, the agent generates plausible-looking numbers that do not match the board deck. With a glossary, the agent uses the same formulas the analysts use.
Glossary Sources
The glossary might live in Collibra, Atlan, DataHub, dbt metrics, Cube, Metabase, LookML, or just a wiki page. The MCP server should pick the authoritative source (usually whichever system the BI tool uses) and wrap it. If the company has a semantic layer, that is the right source — it already encodes formulas in machine-readable form.
- dbt metrics: SQL + definition in YAML
- Cube: semantic layer with formulas
- LookML: Looker measures and dimensions
- DataHub glossary: curated term graph
- Collibra business terms: governance-backed
- Wiki / Confluence: plain-text last resort
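Picking one authoritative source can be sketched as a simple precedence list. This is a minimal illustration, not a definitive implementation: the source keys, the ordering in `SOURCE_PRIORITY`, and the `pick_authoritative` helper are all assumptions made for the example. In practice the order should follow whichever system your BI tool reads from.

```python
from typing import Optional

# Hypothetical precedence: structured semantic layers first, the wiki last.
SOURCE_PRIORITY = ["dbt_metrics", "cube", "lookml", "datahub", "collibra", "wiki"]


def pick_authoritative(candidates: dict) -> Optional[dict]:
    """Return the definition from the highest-priority source that has one.

    `candidates` maps a source name to the definition found in that source.
    """
    for source in SOURCE_PRIORITY:
        if source in candidates:
            # Copy the definition and record where it came from.
            return {**candidates[source], "source": source}
    return None


# The same term found in two places; the structured source wins.
found = {
    "wiki": {"term": "MRR", "formula": None},
    "dbt_metrics": {"term": "MRR", "formula": "SUM(monthly_recurring_amount)"},
}
chosen = pick_authoritative(found)
```

Recording the winning source in the response keeps provenance visible to the agent, which matters when the wiki and the semantic layer disagree.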
Core MCP Tools
Expose two tools: lookupTerm(name) and searchGlossary(query). The first is an exact lookup; the second handles fuzzy matching when the user asks "how do we define retention?" without knowing the canonical term name. Both return the definition, formula, owner, and source.
| Field | Example | Why |
|---|---|---|
| term | Net Revenue Retention | Exact name |
| definition | Revenue from existing customers vs 12 months ago | Plain English |
| formula | SUM(revenue this month for cohort) / SUM(revenue 12 months ago for cohort) | Machine-readable |
| owner | Finance Data Team | Who to contact |
| source | dbt metrics v1.2 | Provenance |
| last_updated | 2026-03-15 | Freshness |
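The two tools and the response fields above could look roughly like the following. This is a sketch under stated assumptions: `lookup_term` and `search_glossary` are Python-style counterparts of lookupTerm and searchGlossary, the in-memory `_GLOSSARY` dict stands in for a real backend, the word-overlap search is a placeholder for real fuzzy matching, and registration with an MCP SDK is left out entirely.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class GlossaryEntry:
    term: str
    definition: str
    formula: str
    owner: str
    source: str
    last_updated: str  # ISO date string


# In-memory stand-in for the real glossary backend (assumption for this sketch).
_GLOSSARY = {
    "net revenue retention": GlossaryEntry(
        term="Net Revenue Retention",
        definition="Revenue from existing customers vs 12 months ago",
        formula="SUM(revenue this month for cohort) / SUM(revenue 12 months ago for cohort)",
        owner="Finance Data Team",
        source="dbt metrics v1.2",
        last_updated="2026-03-15",
    ),
}


def lookup_term(name: str) -> Optional[dict]:
    """Exact lookup, ignoring case and surrounding whitespace."""
    entry = _GLOSSARY.get(name.strip().lower())
    return asdict(entry) if entry else None


def search_glossary(query: str) -> list:
    """Naive search: return entries sharing at least one word with the query."""
    words = {w.strip("?.,!").lower() for w in query.split()}
    words.discard("")
    hits = []
    for entry in _GLOSSARY.values():
        haystack = set(f"{entry.term} {entry.definition}".lower().split())
        if words & haystack:
            hits.append(asdict(entry))
    return hits
```

With this in place, `search_glossary("how do we define retention?")` returns the Net Revenue Retention entry even though the user never typed the canonical name, while `lookup_term` returns `None` for unknown terms instead of guessing.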
Formula as First-Class
The formula is the most important field for the agent. Return it in a structured format (SQL, a semantic layer expression, or plain pseudo-code) so the agent can translate it to the backend. Without the formula, the agent has to invent one, which defeats the whole point of the glossary.
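One way to make the formula structured is to tag it with its kind, so the agent knows whether it is looking at SQL, a semantic layer expression, or pseudo-code. The sketch below is illustrative only: the `Formula` type, the `ALLOWED_KINDS` set, and the warning text are assumptions, not part of any glossary tool's API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical set of formula kinds, mirroring the three formats named above.
ALLOWED_KINDS = {"sql", "semantic_layer", "pseudo"}


@dataclass(frozen=True)
class Formula:
    kind: str        # one of ALLOWED_KINDS
    expression: str  # the formula text itself


def formula_payload(term: str, formula: Optional[Formula]) -> dict:
    """Build the formula part of a glossary response.

    A missing formula is reported explicitly so the agent does not invent one.
    """
    if formula is None:
        return {
            "term": term,
            "formula": None,
            "warning": "No approved formula on record; do not infer one.",
        }
    if formula.kind not in ALLOWED_KINDS:
        raise ValueError(f"unknown formula kind: {formula.kind!r}")
    return {
        "term": term,
        "formula": {"kind": formula.kind, "expression": formula.expression},
    }
```

Returning an explicit warning for a missing formula is a design choice: it gives the agent something concrete to relay to the user rather than a silent gap it may fill with a guess.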
Synonym Resolution
Users rarely use the canonical term. They say "monthly revenue" when they mean MRR, or "customer count" when they mean active accounts. The glossary should include synonyms in the search index and return the canonical term even when the user's phrasing is loose. This is where fuzzy matching earns its keep.
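Synonym resolution can be sketched with a flat index from every known phrase to its canonical term, plus a fuzzy fallback for typos. This is a minimal sketch: the `SYNONYMS` table and `resolve_term` helper are hypothetical, and the standard library's `difflib` stands in for whatever matcher (for example, embedding search) a production server would use.

```python
import difflib
from typing import Optional

# Hypothetical synonym table: canonical term -> loose phrasings users type.
SYNONYMS = {
    "MRR": ["monthly recurring revenue", "monthly revenue"],
    "Active Accounts": ["customer count", "active customers"],
}

# Flatten into a single lowercase index: phrase -> canonical term.
_INDEX = {}
for canonical, phrases in SYNONYMS.items():
    _INDEX[canonical.lower()] = canonical
    for phrase in phrases:
        _INDEX[phrase.lower()] = canonical


def resolve_term(phrase: str, cutoff: float = 0.8) -> Optional[str]:
    """Map a user phrase to its canonical term, or None if nothing is close."""
    key = phrase.strip().lower()
    if key in _INDEX:
        return _INDEX[key]
    # Fall back to string similarity to absorb typos and small variations.
    close = difflib.get_close_matches(key, list(_INDEX), n=1, cutoff=cutoff)
    return _INDEX[close[0]] if close else None
```

The `cutoff` parameter trades recall for precision. A high value avoids mapping an unrelated phrase onto a financial metric, which is usually the safer failure mode.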
Data Workers Glossary Tool
Data Workers' catalog agent includes a glossary MCP tool that ingests from dbt metrics, Cube, LookML, and DataHub simultaneously, prefers structured formulas, and handles synonyms via embedding search. See AI for data infrastructure or read MCP server data dictionary exposure for the schema-focused variant.
To see a business glossary MCP powering agent answers with company-approved formulas, book a demo. We will walk through glossary ingestion, formula extraction, and synonym handling.
A practical challenge with business glossaries is the gap between term and implementation. The glossary says MRR = sum of monthly recurring revenue from active subscriptions, but the actual SQL that computes it may have twenty special cases for trials, annual contracts, and churned accounts. An MCP server should expose both the human-readable definition and the canonical SQL so the agent can see what really gets computed, not just what the docs say.
Versioning the glossary is also critical. Metrics drift: the definition of "active user" in 2020 may differ from the one in use in 2026, and agents that apply the wrong version produce wrong answers for historical analysis. Include the effective date and version in every glossary response so the agent can pick the right definition for the period in question. This is especially important in financial and regulatory contexts where historical reporting has to use the historical definition.
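Picking the definition that was in force for a given period can be sketched as an "as of" lookup over versioned entries. The `TermVersion` type, the `definition_as_of` helper, and the two example definitions with their dates are invented purely for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class TermVersion:
    term: str
    version: int
    effective_from: date
    definition: str


# Illustrative data only: these definitions and dates are made up.
VERSIONS = [
    TermVersion("active user", 1, date(2020, 1, 1),
                "Logged in at least once in the last 30 days"),
    TermVersion("active user", 2, date(2024, 7, 1),
                "Performed a core action in the last 28 days"),
]


def definition_as_of(term: str, as_of: date) -> Optional[TermVersion]:
    """Return the latest version already in effect on `as_of`, if any."""
    candidates = [
        v for v in VERSIONS
        if v.term == term and v.effective_from <= as_of
    ]
    return max(candidates, key=lambda v: v.effective_from, default=None)
```

A question about June 2021 resolves to version 1 and a question about March 2026 resolves to version 2. A date before the first effective date returns `None`, so the agent can say that no definition existed yet.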
Finally, expose the approval status of each term. A draft term that has not been reviewed is less trustworthy than an approved one, and the agent should weight its answers accordingly. Most glossary tools (Collibra, Atlan, DataHub) already track approval state, so the MCP server just needs to surface it to the agent. Users trust answers more when they can see a note such as "based on the approved definition of MRR as of Mar 2026."
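Surfacing approval status can be as simple as attaching a provenance note to every answer. The sketch below assumes a hypothetical `provenance_note` helper and a status vocabulary in which only "approved" is trusted; real glossary tools use their own status names.

```python
from datetime import date

# Month abbreviations are spelled out to avoid locale-dependent formatting.
_MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
           "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Hypothetical status vocabulary: only these statuses count as trusted.
TRUSTED_STATUSES = {"approved"}


def provenance_note(term: str, status: str, effective: date) -> str:
    """Build the sentence an agent can attach to an answer."""
    label = f"{_MONTHS[effective.month - 1]} {effective.year}"
    if status in TRUSTED_STATUSES:
        return f"Based on the approved definition of {term} as of {label}."
    return (
        f"Caution: the definition of {term} is in '{status}' status "
        f"as of {label} and has not been approved."
    )
```

For an approved term this yields the kind of note quoted above, and for a draft term it produces an explicit caution instead of silently presenting the number as authoritative.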
Business glossaries are the cure for the confidently wrong number problem in AI analytics. MCP is the delivery mechanism — it turns company-specific definitions into live agent context and makes agent answers match the CFO's spreadsheet.