Your Team Uses Dozens of MCP Tools Every Day. Can You See How?
MCP is becoming the universal protocol for data tooling. But nobody is tracking how practitioners actually use it. We built the agent that does.
By The Data Workers Team
Your data team connects to MCP servers all day. Snowflake MCPs, dbt MCPs, internal agents, open-source tools, vendor integrations — the ecosystem is growing fast. A single practitioner might touch dozens of MCP tools in a week across half a dozen servers. But you cannot answer the most basic question: which ones are people actually using?
This is the question every data platform leader eventually asks. Not "are the services healthy?" — Datadog answers that. Not "is the data fresh?" — Monte Carlo answers that. The question is: how is my team interacting with our tooling, and where should I invest next?
The Gap Nobody Talks About
Data teams invest millions in tooling across their stack — Snowflake, dbt, Airflow, AI agents, and a growing mesh of MCP servers. But they have zero visibility into how practitioners actually use these tools. They cannot answer:
- Which tools and agents do our data engineers use the most? The least?
- What are the common workflow patterns? Do people chain catalog lookups into pipeline builds, or use them in isolation?
- Which team members are power users? Who is still stuck on manual workflows?
- Are we getting ROI from tools we are paying for? Which agents are shelfware?
- When do usage patterns shift? Did the new schema agent drive adoption after rollout, or did nobody notice?
Traditional observability watches infrastructure. APM tools tell you if a service is healthy. Product analytics tools like Amplitude track web app clicks. Nobody tracks data practitioner workflows across MCP tool interactions. This is the gap.
MCP Tool Calls Are First-Class Usage Events
Here is the insight that unlocked everything: every MCP tool call — from any server, any vendor, any agent — is a signal of human intent. When a practitioner invokes validate_schema or run_query or deploy_model, that is not just an API call — it is a usage event with semantic meaning. It tells you what the person was trying to do, which part of the platform they reached for, and whether it worked.
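To make this concrete, here is a minimal sketch of what a tool call looks like when treated as a structured usage event. The field names (`user`, `server`, `tool`, `success`) are illustrative assumptions, not the product's actual schema:

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class ToolCallEvent:
    """One MCP tool call, captured as a usage event (hypothetical schema)."""
    user: str        # practitioner who made the call
    server: str      # MCP server the tool lives on
    tool: str        # tool name, e.g. "validate_schema"
    timestamp: str   # ISO-8601 time of the call
    success: bool    # whether the call succeeded

event = ToolCallEvent(
    user="sarah",
    server="schema",
    tool="validate_schema",
    timestamp=datetime.now(timezone.utc).isoformat(),
    success=True,
)
print(asdict(event)["tool"])  # -> validate_schema
```

Once every call, from every server, lands in a shape like this, the rest of the analytics below are ordinary aggregations over one event stream.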
MCP is becoming the universal protocol for data tooling. That means every tool interaction flows through the same interface. By treating MCP tool calls as first-class usage telemetry — regardless of which server they come from — you get something no other platform offers: a map of how data teams actually work across their entire tooling ecosystem.
What the Usage Intelligence Agent Does
We built 7 new analytics tools on top of this insight and retained the full set of agent observability capabilities (audit trails, health monitoring, drift detection). That makes 13 tools total, all deterministic — zero LLM in the processing path. They work across any MCP server your team connects to.
Tool Usage Metrics. Which MCP tools get called the most? By how many unique users? Is usage trending up or down? Group by tool, server, or individual practitioner. When a VP asks "is anyone using that new governance server we deployed?" you have the answer in seconds.
Adoption Dashboards. For every MCP server on the platform: adoption rate, active users, week-over-week growth, top tools, and underused tools. Classification into four buckets — fully adopted, growing, underused, or shelfware. The fastest-growing server and the one that needs attention, surfaced automatically.
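As a sketch, the four-bucket classification can be expressed as a simple threshold function over adoption rate and week-over-week growth. The thresholds below are illustrative assumptions, not the product's published values:

```python
def classify_server(adoption_rate, wow_growth):
    """Bucket an MCP server by usage. Thresholds are hypothetical."""
    if adoption_rate >= 0.75:
        return "fully adopted"
    if wow_growth >= 0.10:       # still small, but climbing
        return "growing"
    if adoption_rate >= 0.25:
        return "underused"
    return "shelfware"

print(classify_server(0.80, 0.15))  # -> fully adopted
print(classify_server(0.20, 0.15))  # -> growing
print(classify_server(0.20, 0.00))  # -> shelfware
```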
Workflow Pattern Detection. This is where it gets interesting. By analyzing session data, we identify the common multi-tool sequences practitioners follow — even when those tools span different MCP servers. The top pattern in our data: catalog:search_assets → schema:validate_schema → pipelines:build_pipeline → quality:run_quality_check — a discovery-to-deployment workflow that 8 practitioners follow an average of 47 times per month. You can see cross-server flows: catalog feeds pipelines (89 transitions), incidents trigger pipeline checks (52 transitions), schema validates before quality (44 transitions).
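Cross-server transition counts like these reduce to plain counting over ordered session data. A minimal sketch, using hypothetical sessions rather than the agent's real pipeline:

```python
from collections import Counter

# Each session is an ordered list of "server:tool" calls (illustrative data).
sessions = [
    ["catalog:search_assets", "schema:validate_schema",
     "pipelines:build_pipeline", "quality:run_quality_check"],
    ["catalog:search_assets", "schema:validate_schema",
     "pipelines:build_pipeline"],
    ["incidents:get_incident", "pipelines:check_pipeline"],
]

def transitions(sessions):
    """Count tool-to-tool transitions (bigrams), including cross-server hops."""
    counts = Counter()
    for calls in sessions:
        for a, b in zip(calls, calls[1:]):
            counts[(a, b)] += 1
    return counts

counts = transitions(sessions)
print(counts[("catalog:search_assets", "schema:validate_schema")])  # -> 2
```

Extending the bigram counter to longer n-grams surfaces full multi-step workflows like the discovery-to-deployment sequence above.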
Session Analytics. Reconstructs practitioner sessions from tool call streams. Measures engagement depth — are engineers doing 2-minute quick lookups or 25-minute deep investigations? Classifies users into power users (15+ sessions/week), regular users, and occasional users. Identifies who engages deeply and who barely touches the platform.
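Reconstructing sessions from a raw call stream is typically done by splitting on idle gaps. A minimal sketch, assuming a hypothetical 30-minute idle threshold (the agent's actual threshold may differ):

```python
from datetime import datetime, timedelta

SESSION_GAP = timedelta(minutes=30)  # assumed idle threshold

def sessionize(timestamps):
    """Split one user's sorted call timestamps into sessions wherever
    the gap between consecutive calls exceeds SESSION_GAP."""
    sessions, current = [], []
    for ts in timestamps:
        if current and ts - current[-1] > SESSION_GAP:
            sessions.append(current)
            current = []
        current.append(ts)
    if current:
        sessions.append(current)
    return sessions

t0 = datetime(2024, 1, 8, 9, 0)
calls = [t0, t0 + timedelta(minutes=2), t0 + timedelta(hours=2)]
print(len(sessionize(calls)))  # -> 2
```

Session length and calls-per-session then fall out directly, which is what drives the quick-lookup versus deep-investigation distinction.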
Usage Heatmaps. When is your team most active? Which tools dominate during incident hours versus normal operations? Hourly breakdown (peak at 2 PM, quiet at 3 AM), daily breakdown (Monday spikes, weekend lulls), and agent-by-user matrices.
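A heatmap of this kind is just bucketing event timestamps by hour and weekday. A minimal sketch with hypothetical events:

```python
from collections import Counter
from datetime import datetime

events = [
    datetime(2024, 1, 8, 14, 5),   # Monday, 2 PM
    datetime(2024, 1, 8, 14, 40),  # Monday, 2 PM
    datetime(2024, 1, 9, 3, 15),   # Tuesday, 3 AM
]

by_hour = Counter(e.hour for e in events)            # hourly breakdown
by_day = Counter(e.strftime("%A") for e in events)   # daily breakdown

print(by_hour[14], by_day["Monday"])  # -> 2 2
```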
Usage Anomaly Detection. Sudden drops in tool usage might mean friction or a broken workflow. Unusual spikes might mean an incident investigation or an automation loop. The agent compares recent usage against a 7-day baseline and alerts on significant deviations — with sensitivity controls so you can tune the noise level.
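One common way to implement this kind of baseline comparison is a z-score against the 7-day daily counts, with the sensitivity control expressed as a standard-deviation threshold. A sketch under those assumptions, not necessarily the agent's exact algorithm:

```python
from statistics import mean, pstdev

def usage_anomaly(recent, baseline_daily_counts, sensitivity=2.0):
    """Flag today's call count if it deviates from the baseline mean
    by more than `sensitivity` standard deviations."""
    mu = mean(baseline_daily_counts)
    sigma = pstdev(baseline_daily_counts) or 1.0  # avoid division by zero
    z = (recent - mu) / sigma
    if abs(z) > sensitivity:
        return "spike" if z > 0 else "drop"
    return None  # within normal range

baseline = [100, 95, 110, 105, 98, 102, 99]  # 7-day daily call counts
print(usage_anomaly(12, baseline))    # -> drop
print(usage_anomaly(104, baseline))   # -> None
```

Raising `sensitivity` suppresses noisy alerts; lowering it catches subtler shifts, which is the tuning knob the text describes.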
Activity Log. Every tool call logged in a SHA-256 hash-chained audit trail. Filter by user, agent, tool, or time range. Chain integrity verification for compliance. When something goes wrong, you can reconstruct exactly who called what tool, when, and what happened.
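The core idea of a hash-chained log is that each entry's hash covers the previous entry's hash, so altering any historical entry invalidates every later link. A minimal sketch; the agent's actual serialization format may differ:

```python
import hashlib
import json

def append_entry(chain, entry):
    """Append a log entry whose hash commits to the previous entry's hash."""
    prev = chain[-1]["hash"] if chain else "0" * 64  # genesis sentinel
    payload = json.dumps(entry, sort_keys=True) + prev
    chain.append({"entry": entry, "prev": prev,
                  "hash": hashlib.sha256(payload.encode()).hexdigest()})

def verify(chain):
    """Recompute every link; True only if no entry was altered."""
    for i, rec in enumerate(chain):
        prev = chain[i - 1]["hash"] if i else "0" * 64
        payload = json.dumps(rec["entry"], sort_keys=True) + prev
        if rec["prev"] != prev or \
           rec["hash"] != hashlib.sha256(payload.encode()).hexdigest():
            return False
    return True

log = []
append_entry(log, {"user": "sarah", "tool": "validate_schema"})
append_entry(log, {"user": "raj", "tool": "run_query"})
print(verify(log))                     # -> True
log[0]["entry"]["user"] = "mallory"    # tamper with history
print(verify(log))                     # -> False
```

This is what makes the chain integrity check meaningful for compliance: a tampered entry cannot be hidden without recomputing every subsequent hash.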
What This Looks Like in Practice
A data platform lead opens the adoption dashboard and sees that the governance agent, rolled out 3 weeks ago, has a 20% adoption rate — only 4 of 20 practitioners have tried it. The schema agent, by contrast, is at 80% adoption and growing 15% week-over-week. The lead schedules a training session for governance and deprioritizes investment in tools nobody uses.
An engineering manager pulls up session analytics and discovers that 5 engineers are power users averaging 9 tools per session across 3 agents, while 5 others barely touch the platform. The workflow patterns reveal that the power users follow a consistent 4-step discovery-to-deployment flow — exactly the workflow the team designed. The occasional users are only using the incident agent reactively. The manager now knows where to focus onboarding.
A VP of Data pulls the usage heatmap before a board meeting and reports: "23 active practitioners, 4,800+ tool calls per week, peak activity at 2 PM. The pipeline agent alone handles 1,240 calls per month. Weekday usage is 15x weekend usage." That is ROI data the board can see.
Why Not Just Use Amplitude or Datadog?
Product analytics tools (Amplitude, Mixpanel) track web app interactions — button clicks, page views, funnels. They do not understand MCP tool semantics. A validate_schema call is not a button click. It is a domain-specific action with a specific intent, input parameters, and outcome. Product analytics would show you "API call made" — Usage Intelligence shows you "engineer Sarah validated backward compatibility on the orders table."
APM tools (Datadog, Grafana) tell you services are healthy. They answer "is the system up?" — not "is the system useful?" A governance agent with 100% uptime but 4 users out of 20 is not delivering value. Usage Intelligence fills the gap between system health and organizational value.
The Design Decision: Deterministic, Not LLM
All 13 tools in the Usage Intelligence Agent are deterministic. Zero LLM in the processing path. Every metric is computed from aggregation, filtering, and sorting over structured data. This is not an LLM summarizing your usage — it is precise, reproducible computation that produces the same result every time.
Why? Because usage data needs to be trustworthy. If a VP is citing adoption numbers in a board meeting, those numbers cannot be hallucinated. If a compliance team is auditing the activity log, the hash chain must be verifiable. Deterministic processing is not a limitation — it is a requirement.
What We Ship
- 13 MCP tools — 7 new usage analytics tools + 6 retained agent observability tools
- 52 tests passing — full coverage across all tools, E2E workflows, anti-recursion guarantees
- 30 days of realistic seed data — 20 practitioners across dozens of MCP tools, with temporal patterns, workflow sequences, and user type distributions
- Zero LLM dependency — all processing is deterministic aggregation over structured events
- Fully open source — Apache 2.0, MCP-native, composable with Claude Code and any MCP client
The question is no longer "are my services healthy?" — every observability tool answers that. The question is "how is my team using our MCP tools?" That is the question that drives investment decisions, training priorities, and platform strategy. As the MCP ecosystem grows and teams connect to more servers, the need for usage visibility only increases. Usage Intelligence is how you answer it.