MCP Server BigQuery Production Setup
Written by The Data Workers Team — 14 autonomous agents shipping production data infrastructure since 2026.
Technically reviewed by the Data Workers engineering team.
Productionizing an MCP server against BigQuery means scoping a service account to a dedicated project, capping bytes billed per query, and routing every statement through a named job label for audit. Once those three controls are in place, an agent can safely run analytics on a warehouse that holds tens of petabytes.
BigQuery's serverless model means there is no cluster to size, but it also means every query can scan a petabyte by accident. This guide covers the production setup: IAM, quota, cost controls, row-level security, and the observability patterns you need so an agent does not burn a month of budget before lunch.
Why BigQuery MCP Needs Guardrails
BigQuery charges per byte scanned, not per warehouse hour. That is great for most workloads but catastrophic for an agent that decides to run SELECT * FROM events on a 50 TB table. The production answer is to put a hard cap on bytes-billed per query and require every agent query to be tagged with a job label so cost attribution is possible after the fact.
The second concern is blast radius. BigQuery projects often contain customer PII, finance data, and product telemetry in the same dataset. Give the MCP server access only to the curated analytics views that your data team has already blessed, and do not let it roam the raw export datasets.
Service Account Scoping
Create a service account dedicated to the MCP server (mcp-agent@project.iam.gserviceaccount.com). Grant it bigquery.dataViewer on the curated datasets only, plus bigquery.jobUser at the project level so it can run queries. Do not grant bigquery.admin and do not reuse a human account or an existing pipeline service account.
- Dedicated service account — per MCP server, not shared
- Dataset-scoped grants — curated schemas only
- Workload Identity Federation — avoid long-lived keys
- Organization policy — block data exfil to unknown projects
- Labels enforced — every job tagged with agent ID
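The scoping above can be sketched as a short provisioning script. Assumed names (my-analytics-project, curated_analytics) are placeholders for your environment; the commands are standard gcloud and bq CLI invocations.

```shell
# Sketch: dedicated service account with least-privilege BigQuery access.
# Project and dataset names are placeholders.

PROJECT=my-analytics-project
SA="mcp-agent@${PROJECT}.iam.gserviceaccount.com"

# Create the dedicated service account for the MCP server
gcloud iam service-accounts create mcp-agent \
  --project="${PROJECT}" \
  --display-name="MCP server agent"

# Project-level jobUser so the account can run query jobs
gcloud projects add-iam-policy-binding "${PROJECT}" \
  --member="serviceAccount:${SA}" \
  --role="roles/bigquery.jobUser"

# Dataset-scoped dataViewer on the curated dataset only -- not the project
bq add-iam-policy-binding \
  --member="serviceAccount:${SA}" \
  --role="roles/bigquery.dataViewer" \
  "${PROJECT}:curated_analytics"
```

Note that dataViewer is granted at the dataset level, never at the project level, so raw export datasets in the same project stay invisible to the agent.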
Cost Caps with maximum_bytes_billed
BigQuery exposes a per-query maximum_bytes_billed setting that hard-stops any statement exceeding the limit. Set it at the connection level so every agent query inherits the cap. A sensible default is 100 GB for general analytics, with a higher ceiling for specific tables the data team pre-approves. A runaway agent hits the cap and errors out cheaply instead of burning the budget.
| Guardrail | Setting | Purpose |
|---|---|---|
| Max bytes per query | 100 GB | Hard stop on accidental full scans |
| Max bytes per day | 10 TB | Daily budget ceiling |
| Query timeout | 120 seconds | Kills runaway statements |
| Result size | 10 MB streamed | Prevents huge payloads |
| Concurrent queries | 5 | Avoids queue starvation |
| Labels required | agent, session | Cost attribution |
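The guardrails in the table can be applied per query through the job configuration. A minimal sketch, using the field names from the BigQuery REST API jobs.insert request body; the 100 GB cap and 120-second timeout mirror the defaults above and should be tuned per deployment.

```python
# Sketch of a jobs.insert request body that applies the guardrails above.
# Values are the table's defaults, not hard requirements.

def build_query_job(sql: str, agent_id: str, session_id: str,
                    max_bytes: int = 100 * 1024**3) -> dict:
    return {
        "configuration": {
            "query": {
                "query": sql,
                "useLegacySql": False,
                # Hard stop: the job errors out before billing past this limit.
                "maximumBytesBilled": str(max_bytes),
            },
            # Required labels for cost attribution after the fact.
            "labels": {"agent": agent_id, "session": session_id},
            # Kill runaway statements after 120 seconds.
            "jobTimeoutMs": "120000",
        }
    }

job = build_query_job(
    "SELECT country, COUNT(*) FROM curated_analytics.events GROUP BY 1",
    agent_id="mcp-agent",
    session_id="s-123",
)
```

Setting these at the connection layer, rather than per call, means every agent query inherits the cap without the agent being able to opt out.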
Row-Level Security and PII
BigQuery row access policies and column-level security let you mask PII before the agent ever sees it. Apply a masking policy to columns like email, phone, and SSN so the MCP server returns hashed or redacted values. Row access policies restrict which rows the service account can see based on tags, so you can partition access by region or tenant without forking datasets.
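A row access policy is created with DDL. The sketch below composes that DDL in Python; the policy name, table, and filter expression are illustrative, and the GRANT TO member is the MCP service account from earlier.

```python
# Sketch: DDL for a BigQuery row access policy that scopes the MCP
# service account to one region. All names and values are illustrative.

def row_access_policy_ddl(policy: str, table: str, member: str,
                          filter_expr: str) -> str:
    return (
        f"CREATE ROW ACCESS POLICY {policy}\n"
        f"ON {table}\n"
        f'GRANT TO ("{member}")\n'
        f"FILTER USING ({filter_expr})"
    )

ddl = row_access_policy_ddl(
    policy="eu_only",
    table="curated_analytics.orders",
    member="serviceAccount:mcp-agent@my-analytics-project.iam.gserviceaccount.com",
    filter_expr='region = "EU"',
)
```

Because the filter lives on the table rather than in the MCP server, the restriction holds even if the agent constructs its own SQL.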
Observability with INFORMATION_SCHEMA
BigQuery's INFORMATION_SCHEMA.JOBS_BY_PROJECT view gives you the full query log, including bytes billed, duration, and labels. Join it against the agent's audit trail to reconstruct any session, then ship the joined stream to BigQuery or your SIEM of choice. This gives you a complete record of what the agent did, what it cost, and who asked.
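A session-reconstruction query might look like the sketch below. It assumes jobs were labeled with an agent key as described earlier; the region qualifier and label key are examples to adjust for your project.

```python
# Sketch: reconstruct one agent's activity from the BigQuery query log,
# filtering on the job labels applied at query time. The region qualifier
# ("region-us") and the "agent" label key are assumptions from this setup.

AUDIT_SQL = """
SELECT
  creation_time,
  user_email,
  total_bytes_billed,
  query
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE EXISTS (
  SELECT 1 FROM UNNEST(labels) AS l
  WHERE l.key = 'agent' AND l.value = @agent_id
)
ORDER BY creation_time
"""
```

Because labels arrive as an array of key/value structs, the UNNEST subquery is the usual way to filter on one label; joining the result against the agent's own audit trail on the session label completes the picture.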
Data Workers on BigQuery
Data Workers' BigQuery connector wires all the production settings automatically: dedicated service account, bytes-billed cap, job labels, and policy enforcement. The catalog agent discovers tables, the cost agent flags runaway spend, and the governance agent enforces PII policies across datasets. See AI for data infrastructure for the full stack, or read MCP server Snowflake production setup for the Snowflake equivalent.
To see a hardened BigQuery MCP setup running in production, book a demo. We will walk through IAM scoping, bytes-billed caps, and cost attribution labels on a real project.
Beyond the core settings, a production BigQuery MCP server benefits from query preview mode. BigQuery lets you run a query in dry-run mode that returns the bytes it would scan without actually running it. The MCP server should expose this as a separate tool so the agent can check cost before committing to an expensive query. A careful agent uses dry-run on every query over 1 GB and warns the user before running it for real.
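The preview tool can be sketched as two small pieces: a dry-run request body (the dryRun flag sits at the configuration level in the REST API) and a warn threshold matching the 1 GB guideline above. The threshold value is a suggestion, not a fixed rule.

```python
# Sketch: dry-run gate for the MCP server's query-preview tool.
# The 1 GB warning threshold follows the guideline in the text.

WARN_BYTES = 1 * 1024**3

def dry_run_job(sql: str) -> dict:
    """Request body that estimates bytes scanned without running the query."""
    return {
        "configuration": {
            "dryRun": True,
            "query": {"query": sql, "useLegacySql": False},
        }
    }

def should_warn(estimated_bytes: int, threshold: int = WARN_BYTES) -> bool:
    """True when the estimate is large enough to surface to the user first."""
    return estimated_bytes > threshold
```

Exposing the dry run as its own MCP tool lets the agent check the estimate, call should_warn, and ask the user before committing to an expensive scan.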
Workload management in BigQuery is handled through reservations. If your organization already uses flat-rate pricing with reservations, create an agents reservation with a small slot count and assign the MCP service account to it. Agent queries then share a fixed slice of compute instead of competing for on-demand capacity. For on-demand billing, the equivalent is tight bytes-billed limits plus per-project quotas set via the BigQuery admin console.
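Creating and assigning such a reservation can be sketched with the bq CLI. The project names, location, and slot count below are placeholders; check the current bq reference for your version's exact flags.

```shell
# Sketch: a small dedicated reservation for agent queries under
# capacity-based billing. Names, location, and slot count are placeholders.

# Create an "agents" reservation with a small slot count
bq mk --reservation \
  --project_id=admin-project \
  --location=US \
  --slots=100 \
  agents

# Assign the MCP server's project to it for query jobs
bq mk --reservation_assignment \
  --reservation_id=admin-project:US.agents \
  --job_type=QUERY \
  --assignee_type=PROJECT \
  --assignee_id=mcp-project
```

With the assignment in place, agent queries draw from the fixed 100-slot slice instead of competing with production workloads for on-demand capacity.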
The surprising benefit of a hardened BigQuery MCP setup is how much easier it makes incident response. When something goes wrong, the team can filter INFORMATION_SCHEMA.JOBS_BY_PROJECT on the agent label and see every query the agent ran, the cost of each, the user who triggered it, and the exact SQL. That level of observability rivals what most human workflows have, and it is the reason regulated customers approve BigQuery MCP deployments more readily than many other backends.
BigQuery MCP is safe and cheap when you scope the service account tightly, cap bytes billed per query, and tag every job. Skip any one of the three and the next runaway query will show up on the CFO's monthly bill review.
See Data Workers in action
15 autonomous AI agents working across your entire data stack. MCP-native, open-source, deployed in minutes.
Book a Demo
Related Resources
- MCP Server Snowflake Production Setup
- MCP Server for BigQuery: Give AI Agents Access to Your Analytics — BigQuery's MCP server gives AI agents access to schemas, query execution, and cost estimation. Here's how to connect it and use Data Work…
- MCP Server Postgres Production
- MCP Server Redshift Setup
- Why AI Agents Need MCP Servers for Data Engineering — MCP servers give AI agents structured access to your data tools — Snowflake, BigQuery, dbt, Airflow, and more. Here is why MCP is the int…
- MCP Server Analytics: Understanding How Your AI Tools Are Actually Used — Your team uses dozens of MCP tools every day. MCP analytics tracks adoption, measures ROI, identifies unused tools, and provides the usag…
- How to Build an MCP Server for Your Data Warehouse (Tutorial) — MCP servers give AI agents structured access to your data warehouse. This tutorial walks through building one from scratch — TypeScript,…
- MCP Server Security: Authentication, Authorization, and Audit Trails — MCP servers expose powerful capabilities to AI agents. Securing them requires OAuth 2.1 authentication, scoped authorization, least-privi…
- MCP Server for Snowflake: Connect AI Agents to Your Data Warehouse — Snowflake's MCP server exposes Cortex Analyst, Cortex Search, and schema metadata to AI agents. Here's how to set it up and how Data Work…
- MCP Server Tutorial: Build a Data Warehouse Integration in 30 Minutes (Python) — Build an MCP server for your data warehouse in 30 minutes with Python. Step-by-step tutorial covering schema exposure, query execution, a…
- MCP Server for Databases: Connect AI Agents to Postgres, BigQuery, and Snowflake — Connect AI agents to Postgres, BigQuery, and Snowflake via MCP servers. Database-specific patterns, schema exposure, and query execution.
- Remote MCP Servers: Deploy AI Tool Integrations to Production — Remote MCP servers move AI tool integrations from local development to production — with OAuth authentication, mTLS security, Kubernetes…
Explore Topic Clusters
- Data Governance: The Complete Guide — Policies, access controls, PII, and compliance at scale.
- Data Catalog: The Complete Guide — Discovery, metadata, lineage, and the modern catalog stack.
- Data Lineage: The Complete Guide — Column-level lineage, impact analysis, and observability.
- Data Quality: The Complete Guide — Tests, SLAs, anomaly detection, and data reliability engineering.
- AI Data Engineering: The Complete Guide — LLMs, agents, and autonomous workflows across the data stack.
- MCP for Data: The Complete Guide — Model Context Protocol servers, tools, and agent integration.
- Data Mesh & Data Fabric: The Complete Guide — Federated ownership, domain-oriented architecture, and interop.
- Open-Source Data Stack: The Complete Guide — dbt, Airflow, Iceberg, DuckDB, and the modern OSS toolkit.
- AI for Data Infra — The complete category for AI agents built specifically for data engineering, data governance, and data infrastructure work.