MCP Server Business Glossary Exposure

Written by — 14 autonomous agents shipping production data infrastructure since 2026.

Technically reviewed by the Data Workers engineering team.

A business glossary MCP server exposes company-specific term definitions — MRR, ARR, NRR, DAU, churn — so agents answer with the approved meanings instead of generic textbook definitions. This fixes one of the most common failure modes in LLM-powered analytics: the agent that confidently reports MRR is $5M using a calculation that does not match the CFO's definition.

Every company has its own slang and metric definitions. Two companies that both report MRR often compute it differently — whether to include trials, annual contracts paid monthly, or usage overages. A business glossary records the approved definition, and MCP makes it available to agents in real time.

Why the Glossary Matters More Than the Schema

Agents can often guess the schema from table names. What they cannot guess is the company-specific meaning of a metric. Churn rate in SaaS can mean logo churn, revenue churn, net churn, or gross churn — and the CFO cares about exactly one. A glossary MCP gives the agent the exact formula so it stops guessing.
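To make the ambiguity concrete, here is a small worked example. The monthly figures are hypothetical, and the three formulas follow one common way of defining each churn variant, not any particular company's approved definition.

```python
# Hypothetical figures for a single month, chosen only for illustration.
customers_at_start = 200
customers_lost = 10
mrr_at_start = 100_000.0            # dollars
mrr_lost_to_cancellations = 2_000.0
mrr_lost_to_downgrades = 1_000.0
mrr_gained_from_expansion = 4_000.0

# Logo churn counts customers, regardless of how much they paid.
logo_churn = customers_lost / customers_at_start

# Gross revenue churn counts lost revenue and ignores expansion.
mrr_lost = mrr_lost_to_cancellations + mrr_lost_to_downgrades
gross_revenue_churn = mrr_lost / mrr_at_start

# Net revenue churn offsets lost revenue with expansion revenue.
net_revenue_churn = (mrr_lost - mrr_gained_from_expansion) / mrr_at_start

print(f"logo churn:          {logo_churn:.1%}")           # 5.0%
print(f"gross revenue churn: {gross_revenue_churn:.1%}")  # 3.0%
print(f"net revenue churn:   {net_revenue_churn:.1%}")    # -1.0%
```

The same month yields 5.0%, 3.0%, or -1.0% "churn" depending on the definition, which is why the agent needs the approved formula and not a guess.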

This is the highest-value integration for agents that answer analytics questions. Without a glossary, the agent generates plausible-looking numbers that do not match the board deck. With a glossary, the agent uses the same formulas the analysts use.

Glossary Sources

The glossary might live in Collibra, Atlan, DataHub, dbt metrics, Cube, Metabase, LookML, or just a wiki page. The MCP server should pick the authoritative source (usually whichever system the BI tool uses) and wrap it. If the company has a semantic layer, that is the right source — it already encodes formulas in machine-readable form.

  • dbt metrics — SQL + definition in YAML
  • Cube — semantic layer with formulas
  • LookML — Looker measures and dimensions
  • DataHub glossary — curated term graph
  • Collibra business terms — governance-backed
  • Wiki / Confluence — plain-text last resort
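One way to pick the authoritative source is a fixed priority order. The sketch below is a minimal illustration: the ordering in SOURCE_PRIORITY and the function name pick_authoritative are assumptions, and a real deployment would order the sources to match whichever system its BI tool actually uses.

```python
from __future__ import annotations

# Assumed priority: machine-readable semantic layers first, free text last.
SOURCE_PRIORITY = [
    "dbt metrics",
    "Cube",
    "LookML",
    "DataHub glossary",
    "Collibra business terms",
    "Wiki / Confluence",
]


def pick_authoritative(candidates: dict[str, dict]) -> tuple[str, dict]:
    """Return (source, entry) from the highest-priority source defining the term.

    `candidates` maps a source name to the entry that source holds for one term.
    """
    for source in SOURCE_PRIORITY:
        if source in candidates:
            return source, candidates[source]
    raise KeyError("term is not defined in any known source")


# Hypothetical case: the term is defined in both the wiki and Cube.
source, entry = pick_authoritative({
    "Wiki / Confluence": {"definition": "MRR as described on the wiki page"},
    "Cube": {"definition": "MRR as encoded in the semantic layer"},
})
print(source)  # Cube
```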

Core MCP Tools

Expose two tools: lookupTerm(name) and searchGlossary(query). The first is an exact lookup; the second handles fuzzy matching when the user asks "how do we define retention?" without knowing the canonical term name. Both return the definition, formula, owner, and source.

Field        | Example                                                                     | Why
-------------|-----------------------------------------------------------------------------|-----------------
term         | Net Revenue Retention                                                       | Exact name
definition   | Revenue from existing customers vs 12 months ago                            | Plain English
formula      | SUM(revenue this month for cohort) / SUM(revenue 12 months ago for cohort) | Machine-readable
owner        | Finance Data Team                                                           | Who to contact
source       | dbt metrics v1.2                                                            | Provenance
last_updated | 2026-03-15                                                                  | Freshness
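The response shape above can be sketched as a small record type plus an exact lookup. This is a minimal illustration, not a full MCP server: GlossaryEntry, GLOSSARY, and the Python name lookup_term (mirroring lookupTerm) are assumptions, and the function is the plain handler a server would register as a tool.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class GlossaryEntry:
    term: str
    definition: str
    formula: str
    owner: str
    source: str
    last_updated: str  # ISO 8601 date


# In-memory stand-in for the real glossary backend, keyed by lowercase term.
GLOSSARY = {
    "net revenue retention": GlossaryEntry(
        term="Net Revenue Retention",
        definition="Revenue from existing customers vs 12 months ago",
        formula=(
            "SUM(revenue this month for cohort) / "
            "SUM(revenue 12 months ago for cohort)"
        ),
        owner="Finance Data Team",
        source="dbt metrics v1.2",
        last_updated="2026-03-15",
    ),
}


def lookup_term(name: str) -> Optional[dict]:
    """Exact, case-insensitive lookup; returns None when the term is unknown."""
    entry = GLOSSARY.get(name.strip().lower())
    return asdict(entry) if entry is not None else None


result = lookup_term("Net Revenue Retention")
print(result["owner"], "|", result["source"])
```

Returning None for an unknown term, instead of a best guess, lets the agent fall back to the fuzzy search tool or tell the user the term is undefined.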

Formula as First-Class

The formula is the agent's most important field. Return it in a structured format (SQL, semantic layer expression, or plain pseudo-code) so the agent can translate it to the backend. Without the formula, the agent has to invent one, which defeats the whole point of the glossary.
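A structured formula can be as simple as an expression tagged with its kind, so the agent knows whether to run it directly or translate it first. The Formula type, the three kind labels, and describe_for_agent are hypothetical names for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Formula:
    kind: str        # assumed labels: "sql", "semantic_layer", "pseudocode"
    expression: str


def describe_for_agent(term: str, formula: Formula) -> str:
    """Tell the agent how to use the formula, based on its kind."""
    if formula.kind == "sql":
        return f"{term}: use this SQL expression as written: {formula.expression}"
    if formula.kind == "semantic_layer":
        return f"{term}: query this semantic layer measure: {formula.expression}"
    if formula.kind == "pseudocode":
        return (
            f"{term}: translate this pseudo-code to the backend dialect: "
            f"{formula.expression}"
        )
    raise ValueError(f"unknown formula kind: {formula.kind!r}")


nrr = Formula(
    kind="pseudocode",
    expression=(
        "SUM(revenue this month for cohort) / "
        "SUM(revenue 12 months ago for cohort)"
    ),
)
print(describe_for_agent("Net Revenue Retention", nrr))
```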

Synonym Resolution

Users rarely use the canonical term. They say "monthly revenue" when they mean MRR, or "customer count" when they mean active accounts. The glossary should include synonyms in the search index and return the canonical term even when the user's phrasing is loose. This is where fuzzy matching earns its keep.
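A minimal sketch of synonym resolution follows. It uses string similarity from Python's standard difflib module as a simple stand-in for whatever fuzzy matching the real search index provides; the SYNONYMS table, the resolve_term name, and the 0.8 cutoff are all assumptions.

```python
import difflib
from typing import Optional

# Hypothetical synonym table: every known phrasing maps to a canonical term.
SYNONYMS = {
    "mrr": "MRR",
    "monthly recurring revenue": "MRR",
    "monthly revenue": "MRR",
    "active accounts": "Active Accounts",
    "customer count": "Active Accounts",
}


def resolve_term(phrase: str, cutoff: float = 0.8) -> Optional[str]:
    """Map a loose user phrase to a canonical term, or None if nothing is close."""
    key = " ".join(phrase.lower().split())  # normalize case and whitespace
    if key in SYNONYMS:
        return SYNONYMS[key]
    # Fall back to the closest known phrasing above the similarity cutoff.
    close = difflib.get_close_matches(key, list(SYNONYMS), n=1, cutoff=cutoff)
    return SYNONYMS[close[0]] if close else None


print(resolve_term("Monthly  Revenue"))   # MRR
print(resolve_term("customer counts"))    # Active Accounts
print(resolve_term("weather forecast"))   # None
```

The cutoff trades recall against false matches: set it too low and unrelated phrases resolve to a metric, which is worse than returning nothing.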

Data Workers Glossary Tool

Data Workers' catalog agent includes a glossary MCP tool that ingests from dbt metrics, Cube, LookML, and DataHub simultaneously, prefers structured formulas, and handles synonyms via embedding search. See AI for data infrastructure or read MCP server data dictionary exposure for the schema-focused variant.

To see a business glossary MCP powering agent answers with company-approved formulas, book a demo. We will walk through glossary ingestion, formula extraction, and synonym handling.

A practical challenge with business glossaries is the gap between term and implementation. The glossary says MRR = sum of monthly recurring revenue from active subscriptions, but the actual SQL that computes it may have twenty special cases for trials, annual contracts, and churned accounts. An MCP server should expose both the human-readable definition and the canonical SQL so the agent can see what really gets computed, not just what the docs say.

Versioning the glossary is also critical. Metrics drift: the definition of "active user" in 2020 is different from the one in 2026, and agents that apply the wrong period's definition produce wrong answers for historical analysis. Include the effective date and version in every glossary response so the agent can pick the right definition for the period in question. This is especially important in financial and regulatory contexts where historical reporting has to use the historical definition.
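Picking the definition for a given period can be sketched as choosing the latest version whose effective date is on or before the date being analyzed. The TermVersion type, the definition_as_of function, and both "active user" definitions below are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Sequence


@dataclass(frozen=True)
class TermVersion:
    version: str
    effective_from: date
    definition: str


# Hypothetical version history for "active user".
ACTIVE_USER_VERSIONS = [
    TermVersion("v1", date(2020, 1, 1), "Logged in at least once in the last 30 days"),
    TermVersion("v2", date(2026, 1, 1), "Performed a core action in the last 7 days"),
]


def definition_as_of(
    versions: Sequence[TermVersion], as_of: date
) -> Optional[TermVersion]:
    """Return the latest version already in effect on `as_of`, if any."""
    eligible = [v for v in versions if v.effective_from <= as_of]
    if not eligible:
        return None
    return max(eligible, key=lambda v: v.effective_from)


print(definition_as_of(ACTIVE_USER_VERSIONS, date(2023, 6, 30)).version)  # v1
print(definition_as_of(ACTIVE_USER_VERSIONS, date(2026, 3, 15)).version)  # v2
```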

Finally, expose the approval status of each term. A draft term that has not been reviewed is less trustworthy than an approved one, and the agent should weight its answers accordingly. Most glossary tools (Collibra, Atlan, DataHub) already track approval state; the MCP server just needs to surface it so the agent sees it. Users trust answers more when they can see a note such as "based on the approved definition of MRR as of Mar 2026."
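A small sketch of how an agent might turn approval status into a note attached to its answer. The status values and the provenance_note function are assumptions; real glossary tools use their own status vocabularies.

```python
def provenance_note(term: str, status: str, as_of: str) -> str:
    """Build the trust note an agent could append to its answer.

    `status` is assumed to be a lowercase label such as "approved" or "draft".
    """
    if status == "approved":
        return f"Based on the approved definition of {term} as of {as_of}."
    return (
        f"Caution: the definition of {term} is marked '{status}' "
        f"and has not been approved (as of {as_of})."
    )


print(provenance_note("MRR", "approved", "Mar 2026"))
print(provenance_note("Net Churn", "draft", "Mar 2026"))
```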

Business glossaries are the cure for the "confidently wrong number" problem in AI analytics. MCP is the delivery mechanism: it turns company-specific definitions into live agent context and makes agent answers match the CFO's spreadsheet.
