Guide · 5 min read

MCP Server BigQuery Production Setup

Written by 14 autonomous agents shipping production data infrastructure since 2026.

Technically reviewed by the Data Workers engineering team.

Productionizing an MCP server against BigQuery means scoping a service account to a dedicated project, capping bytes-billed per query, and routing every statement through a named job label for audit. Once those three things are in place, an agent can safely run analytics on a warehouse that holds tens of petabytes.

BigQuery's serverless model means there is no cluster to size, but it also means every query can scan a petabyte by accident. This guide covers the production setup: IAM, quota, cost controls, row-level security, and the observability patterns you need so an agent does not burn a month of budget before lunch.

Why BigQuery MCP Needs Guardrails

BigQuery charges per byte scanned, not per warehouse hour. That is great for most workloads but catastrophic for an agent that decides to run SELECT * FROM events on a 50 TB table. The production answer is to put a hard cap on bytes-billed per query and require every agent query to be tagged with a job label so cost attribution is possible after the fact.

The second concern is blast radius. BigQuery projects often contain customer PII, finance data, and product telemetry in the same dataset. Give the MCP server access only to the curated analytics views that your data team has already blessed, and do not let it roam the raw export datasets.

Service Account Scoping

Create a service account dedicated to the MCP server (mcp-agent@project.iam.gserviceaccount.com). Grant it bigquery.dataViewer on the curated datasets only, plus bigquery.jobUser at the project level so it can run queries. Do not grant bigquery.admin, and do not reuse a human account or an existing pipeline service account. The sketch after the checklist below shows the dataset-scoped grant in code.

  • Dedicated service account — per MCP server, not shared
  • Dataset-scoped grants — curated schemas only
  • Workload Identity Federation — avoid long-lived keys
  • Organization policy — block data exfil to unknown projects
  • Labels enforced — every job tagged with agent ID
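
A minimal sketch of the dataset-scoped grant using the google-cloud-bigquery Python client. The project, dataset, and service account names here are placeholders, not fixed conventions:

```python
from google.cloud import bigquery

# Placeholders: swap in your own project, dataset, and service account.
PROJECT = "analytics-prod"
AGENT_SA = "mcp-agent@analytics-prod.iam.gserviceaccount.com"

client = bigquery.Client(project=PROJECT)
dataset = client.get_dataset(f"{PROJECT}.curated_analytics")

# READER on this one dataset maps to bigquery.dataViewer scoped to it;
# bigquery.jobUser is granted separately in project IAM so the account
# can run (but not administer) query jobs.
entries = list(dataset.access_entries)
entries.append(
    bigquery.AccessEntry(
        role="READER",
        entity_type="userByEmail",
        entity_id=AGENT_SA,
    )
)
dataset.access_entries = entries
client.update_dataset(dataset, ["access_entries"])
```

Because the grant lives on the dataset rather than the project, the raw export datasets stay invisible to the agent even if someone later widens a query.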

Cost Caps with maximum_bytes_billed

BigQuery exposes a per-query maximum_bytes_billed setting that hard-stops any statement exceeding the limit. Set it at the connection level so every agent query inherits the cap. A sensible default is 100 GB for general analytics, with a higher ceiling for specific tables the data team pre-approves. A runaway agent hits the cap and errors out cheaply instead of burning the budget.
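
A sketch of the connection-level cap with the Python client, reusing the placeholder project and label names from above. Every query issued through this client inherits the limit and the labels:

```python
import concurrent.futures

from google.cloud import bigquery

job_config = bigquery.QueryJobConfig(
    maximum_bytes_billed=100 * 1024**3,                 # hard stop at 100 GB scanned
    labels={"agent": "mcp-agent", "session": "s-123"},  # cost attribution labels
)
client = bigquery.Client(
    project="analytics-prod",
    default_query_job_config=job_config,
)

# A SELECT * over a 50 TB table now fails fast instead of billing the scan.
job = client.query("SELECT * FROM curated_analytics.events")
try:
    rows = job.result(timeout=120)   # stop waiting after 120 seconds...
except concurrent.futures.TimeoutError:
    job.cancel()                     # ...and cancel the statement server-side
```

A query over the cap is rejected before it scans anything, so the failure itself costs nothing.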

Guardrail            Setting          Purpose
Max bytes per query  100 GB           Hard stop on accidental full scans
Max bytes per day    10 TB            Daily budget ceiling
Query timeout        120 seconds      Kills runaway statements
Result size          10 MB, streamed  Prevents huge payloads
Concurrent queries   5                Avoids queue starvation
Labels required      agent, session   Cost attribution

Row-Level Security and PII

BigQuery row access policies and column-level security let you mask PII before the agent ever sees it. Apply a masking policy to columns like email, phone, and SSN so the MCP server returns hashed or redacted values. Row access policies restrict which rows the service account can see based on tags, so you can partition access by region or tenant without forking datasets.
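
Row access policies are plain DDL that the MCP server's admin tooling can apply as a regular query job. A hypothetical policy that limits the agent's service account to EU rows; the table, column, and account names are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client(project="analytics-prod")

# Hypothetical policy: the MCP service account only ever sees EU rows.
ddl = """
CREATE ROW ACCESS POLICY IF NOT EXISTS mcp_eu_only
ON curated_analytics.events
GRANT TO ('serviceAccount:mcp-agent@analytics-prod.iam.gserviceaccount.com')
FILTER USING (region = 'EU')
"""
client.query(ddl).result()  # DDL runs as an ordinary query job
```

Column-level masking works differently: it is driven by policy tags managed in Data Catalog, so the data team configures it once on the PII columns and every reader, agent or human, inherits it.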

Observability with INFORMATION_SCHEMA

BigQuery's INFORMATION_SCHEMA.JOBS_BY_PROJECT view gives you the full query log, including bytes billed, duration, and labels. Join it against the agent's audit trail to reconstruct any session, then ship the joined stream to a dedicated audit dataset or your SIEM of choice. This gives you a complete record of what the agent did, what it cost, and who asked.
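
A sketch of the audit query, assuming every agent job carries the agent label from the guardrail table above; the region qualifier and label value are placeholders:

```python
from google.cloud import bigquery

client = bigquery.Client(project="analytics-prod")

# Reconstruct everything the agent ran in the last 24 hours, with cost.
audit_sql = """
SELECT
  creation_time,
  user_email,
  job_id,
  query,
  total_bytes_billed,
  TIMESTAMP_DIFF(end_time, start_time, MILLISECOND) AS duration_ms
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 24 HOUR)
  AND EXISTS (
    SELECT 1 FROM UNNEST(labels) AS l
    WHERE l.key = 'agent' AND l.value = 'mcp-agent'
  )
ORDER BY creation_time DESC
"""
for row in client.query(audit_sql).result():
    print(row.job_id, row.total_bytes_billed, row.duration_ms)
```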

Data Workers on BigQuery

Data Workers' BigQuery connector wires all the production settings automatically: dedicated service account, bytes-billed cap, job labels, and policy enforcement. The catalog agent discovers tables, the cost agent flags runaway spend, and the governance agent enforces PII policies across datasets. See AI for data infrastructure for the full stack, or read MCP server Snowflake production setup for the Snowflake equivalent.

To see a hardened BigQuery MCP setup running in production, book a demo. We will walk through IAM scoping, bytes-billed caps, and cost attribution labels on a real project.

Beyond the core settings, a production BigQuery MCP server benefits from query preview mode. BigQuery's dry-run mode returns the bytes a query would scan without actually executing it. The MCP server should expose this as a separate tool so the agent can check cost before committing to an expensive statement. A careful agent dry-runs every query first and warns the user before executing anything that would scan more than 1 GB.
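
A sketch of that preview tool using the client's dry_run flag; the 1 GB threshold and query are illustrative:

```python
from google.cloud import bigquery

client = bigquery.Client(project="analytics-prod")

def preview_cost(sql: str) -> int:
    """Return the bytes a query would scan, without running or billing it."""
    config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
    job = client.query(sql, job_config=config)  # returns immediately for dry runs
    return job.total_bytes_processed

scan = preview_cost("SELECT user_id, event FROM curated_analytics.events")
if scan > 1 * 1024**3:  # over 1 GB: warn before the real run
    print(f"warning: query would scan {scan / 1024**3:.1f} GiB")
```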

Workload management in BigQuery is handled through reservations. If your organization already uses flat-rate pricing with reservations, create an agents reservation with a small slot count and assign the MCP service account to it. Agent queries then share a fixed slice of compute instead of competing for on-demand capacity. For on-demand billing, the equivalent is tight bytes-billed limits plus per-project quotas set via the BigQuery admin console.
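
A hedged sketch of the reservation setup with the google-cloud-bigquery-reservation client; the admin project, location, and 100-slot capacity are assumptions to adjust for your organization:

```python
from google.cloud import bigquery_reservation_v1 as reservation

client = reservation.ReservationServiceClient()
parent = "projects/reservation-admin/locations/US"  # assumed admin project

# A small fixed slice of compute dedicated to agent traffic.
agents = client.create_reservation(
    parent=parent,
    reservation_id="agents",
    reservation=reservation.Reservation(slot_capacity=100),
)

# Route the MCP project's query jobs into that reservation.
client.create_assignment(
    parent=agents.name,
    assignment=reservation.Assignment(
        assignee="projects/analytics-prod",
        job_type=reservation.Assignment.JobType.QUERY,
    ),
)
```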

The surprising benefit of a hardened BigQuery MCP setup is how much easier it makes incident response. When something goes wrong, the team can filter INFORMATION_SCHEMA.JOBS_BY_PROJECT on the agent label and see every query the agent ran, the cost of each, the user who triggered it, and the exact SQL. That level of observability rivals what most human workflows have, and it is the reason regulated customers approve BigQuery MCP deployments more readily than many other backends.

BigQuery MCP is safe and cheap when you scope the service account tightly, cap bytes billed per query, and tag every job. Skip any one of the three and the next runaway query will show up on the CFO's monthly bill review.

See Data Workers in action

15 autonomous AI agents working across your entire data stack. MCP-native, open-source, deployed in minutes.

Book a Demo
