
How to Optimize BigQuery Costs: 7 Proven Tactics


Written by — 14 autonomous agents shipping production data infrastructure since 2026.

Technically reviewed by the Data Workers engineering team.


To optimize BigQuery costs: partition and cluster large tables, select only the columns you need, enable BI Engine for dashboards, switch to slot reservations for steady workloads, and monitor expensive queries. Most teams cut BigQuery bills 30–60% once they audit scanning volume.

BigQuery's per-TB-scanned pricing is a feature and a trap. It rewards good schema design and punishes lazy SQL. This guide walks through seven proven optimizations that cut costs without hurting performance.

Step 1: Partition Every Large Table

Partitioning splits a table by date (usually ingestion time or event time) so queries can prune irrelevant partitions. An unpartitioned 10 TB table scans the entire thing every query; a partitioned table scans only today's 30 GB if the query filters by date. This one change can cut costs 99% on large tables.

Partition granularity matters. Daily partitions are the sweet spot for most analytics workloads — hourly partitions explode the partition count and slow metadata operations, while monthly partitions leave too much data per scan. If your source is event-based, partition by event time, not ingestion time, so late-arriving data does not force full-table scans. And always filter queries by the partition column explicitly; BigQuery does not infer partition pruning from joins, only from WHERE clauses on the partition column.
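As a minimal sketch (table and column names are illustrative, not from a real schema), here is a daily-partitioned table and a query that prunes to a single partition via an explicit WHERE on the partition column:

```sql
-- Partition by the event-time DATE column; queries that filter on it
-- scan only the matching partitions instead of the whole table.
CREATE TABLE analytics.events (
  event_date  DATE,
  customer_id STRING,
  revenue     NUMERIC
)
PARTITION BY event_date;

-- Scans only the 2024-06-01 partition, not all of analytics.events.
SELECT customer_id, revenue
FROM analytics.events
WHERE event_date = '2024-06-01';
```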

Partition by — best for:

  • _PARTITIONTIME: ingestion-based, simple
  • DATE column: event time, typical analytics
  • Integer range: ID-based sharding
  • None (clustered only): small tables or non-date access

Step 2: Cluster Large Tables

Clustering physically sorts a table by one to four columns, so queries filtering on those columns read fewer blocks. Cluster on the columns you filter on most often — customer_id, product_id, region. Clustering stacks with partitioning for compound savings.

Clustering effectiveness degrades as new data arrives out of order. BigQuery auto-reclusters periodically, but for high-ingest tables you may need to force a rewrite or schedule compaction. Monitor the clustering ratio in INFORMATION_SCHEMA — if it falls below 80%, the savings are gone and you need to re-cluster. Also, clustering only helps for queries that filter or join on the clustering columns; a query that scans by a different column sees no benefit.
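Partitioning and clustering can be combined in one table definition. A hypothetical example (names are illustrative): partition pruning cuts the date range first, then clustering reduces the blocks read within each partition.

```sql
-- Prune partitions by event_date, then read fewer blocks within each
-- partition for queries filtering on customer_id or region.
CREATE TABLE analytics.orders (
  event_date  DATE,
  customer_id STRING,
  region      STRING,
  amount      NUMERIC
)
PARTITION BY event_date
CLUSTER BY customer_id, region;
```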

Step 3: SELECT Only What You Need

BigQuery charges per column scanned. SELECT * scans every column; SELECT customer_id, revenue scans two. On wide tables with hundreds of columns, pruning columns is a 10x cost reduction. Train your team and your BI tools to select narrowly.

Many BI tools default to SELECT * behind the scenes — Tableau and Looker both do this under certain configurations. Inspect the generated SQL for a few dashboards and you may find entire unused column trees being scanned. Fix this by building narrow views explicitly for BI consumption, or by using dbt exposures to materialize only the columns each dashboard needs.

  • Avoid SELECT * — always list columns
  • Use partition decorators — table$YYYYMMDD to read a single partition
  • Push down filters — WHERE early, JOIN late
  • Prefer EXISTS to IN — cheaper for existence checks
  • Materialize frequent queries — scheduled queries into tables
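To make the first two habits concrete, here is a sketch of a narrow query against the hypothetical analytics.events table from the partitioning example — only the referenced columns are billed, and the partition filter limits the scanned rows:

```sql
-- Narrow scan: only customer_id and revenue are billed, and only
-- partitions on or after 2024-06-01 are read.
SELECT customer_id, SUM(revenue) AS total_revenue
FROM analytics.events
WHERE event_date >= '2024-06-01'
GROUP BY customer_id;
```

A SELECT * over the same table would bill every column for every scanned partition, regardless of which columns the downstream dashboard actually uses.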

Step 4: Enable BI Engine for Dashboards

BI Engine caches query results in memory for Data Studio, Looker, and Tableau. Dashboards that would cost thousands per month running raw queries become free after BI Engine caches the data. Enable it on the BI-facing project.

Sizing BI Engine correctly matters. Too small and critical dashboards miss the cache; too large and you pay for unused capacity. Start by reviewing dashboard query volume for a week, then size BI Engine to cover the hot working set. Pinned tables (available in the enterprise tier) let you guarantee specific tables always live in cache — use this for the handful of tables your exec dashboards read on every refresh.

Step 5: Use Materialized Views

Materialized views pre-compute expensive aggregations and keep them updated automatically. A daily-refreshed fact table costs a fraction of running the same aggregation on every dashboard load. Use them for the top 10 most expensive repeated queries.

Materialized view costs are subtle. The view itself costs money to maintain as underlying data changes — BigQuery re-runs the aggregation incrementally, and on rapidly-changing tables the maintenance cost can exceed the query savings. Monitor the tradeoff. For tables that change once a day, materialized views almost always pay back. For streaming ingestion tables, a scheduled query materializing nightly is often cheaper than a true materialized view.
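A minimal materialized view over the illustrative analytics.events table might look like this — BigQuery maintains the aggregation incrementally as the base table changes:

```sql
-- Pre-computes daily revenue so dashboards read the small
-- aggregate instead of re-scanning the fact table on every load.
CREATE MATERIALIZED VIEW analytics.daily_revenue AS
SELECT event_date, SUM(revenue) AS revenue
FROM analytics.events
GROUP BY event_date;
```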

For related topics, see How to Optimize Snowflake Costs and BigQuery vs Snowflake.

Step 6: Slot Reservations for Steady Workloads

On-demand pricing is easy but expensive at scale. Flat-rate slot reservations (Enterprise or Enterprise Plus edition) let you pre-pay for a fixed amount of compute, and the per-query cost drops to zero. For steady workloads over $5k/month, reservations usually pay back within weeks.

Step 7: Monitor Expensive Queries

Use INFORMATION_SCHEMA.JOBS_BY_PROJECT to find the 10 most expensive queries every week. Attack the top offenders first — rewrite, cache, or prune. Data Workers cost agents automate this continuously, catching regressions the day they ship.
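A weekly audit query along these lines surfaces the top offenders (the region qualifier and 7-day window are assumptions to adjust for your project):

```sql
-- Top 10 queries by bytes billed over the last 7 days.
SELECT
  user_email,
  query,
  total_bytes_billed / POW(2, 40) AS tib_billed
FROM `region-us`.INFORMATION_SCHEMA.JOBS_BY_PROJECT
WHERE creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
  AND job_type = 'QUERY'
ORDER BY total_bytes_billed DESC
LIMIT 10;
```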

Book a demo to see autonomous BigQuery cost optimization in action.

Tools and Controls You Should Enable

BigQuery ships with several cost controls most teams forget to enable. Set project-level quotas on bytes processed per day — one errant query cannot burn the month's budget if the project hits its quota first. Set per-user quotas so an analyst experimenting with Python cannot melt your bill. Enable reservation assignments with flex slots for predictable burst workloads. Use BigQuery BI Engine pinned tables for dashboards that are read constantly. And turn on the built-in cost attribution via labels so every job is tagged by team, dbt model, or dashboard.
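For label-based cost attribution, one approach is the @@query_label session variable, which tags every statement in the session so the labels show up in INFORMATION_SCHEMA job metadata (the label keys and values here are illustrative):

```sql
-- Tag subsequent statements in this session for cost attribution.
SET @@query_label = "team:analytics,model:daily_revenue";

SELECT COUNT(*) FROM analytics.events
WHERE event_date = CURRENT_DATE();
```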

Common Mistakes

The most common BigQuery mistake is treating on-demand pricing as the default forever. Once spend crosses $5k/month, slot reservations almost always save money, but teams stay on-demand because it is easier to reason about. Second most common: SELECT * in dbt staging models, which scans every column on every run. Third: running DELETE and UPDATE against wide tables instead of using MERGE, which rewrites less data. Fourth: forgetting to partition temporary scratch tables used in CI, which can quietly cost thousands per month across a busy team.

Also watch for script jobs that log full result sets, views that re-scan base tables every call (materialize them instead), and cross-project queries that disable partition pruning when joins are complex. The BigQuery UI's bytes-processed estimate is your friend — train your team to check it before every manual query.

BigQuery cost optimization starts with partitioning and clustering, then progresses to column pruning, BI Engine, materialized views, and slot reservations. Monitor continuously so drift does not creep back. Most accounts cut spend 30–60% on the first pass — the hard part is keeping it down over time.

See Data Workers in action

15 autonomous AI agents working across your entire data stack. MCP-native, open-source, deployed in minutes.

Book a Demo
