Guide · 5 min read

AI Agent Coding Mistakes in Data Work

AI agents make systematic coding mistakes in data work: wrong dialect, fabricated columns, missing joins, unsafe DDL. These are not random bugs — they are predictable failure modes rooted in how LLMs generalize across SQL dialects and how they fill gaps in schema knowledge. The fix is catalog grounding, dialect pinning, and dry-run validation.

This guide catalogs the six most common AI coding mistakes in data engineering, explains the root cause of each, and walks through the grounding pattern that prevents them in production.

Why Agents Write Broken SQL

An LLM writing SQL against a warehouse it has never seen will hallucinate column names, invent join keys, and mix dialect features from different engines. The model is optimizing for plausibility, not executability. Without access to the actual catalog, it has no way to know whether the query will run — only whether it looks like it should.

The Six Mistakes

  • Dialect drift — Snowflake-specific functions in a BigQuery query, or vice versa
  • Fabricated columns — using column names that sound right but do not exist
  • Missing joins — dropping a required join and silently producing wrong results
  • Unsafe DDL — DROP or TRUNCATE statements with no confirmation guardrail
  • Wrong case sensitivity — Snowflake folds unquoted identifiers to upper case, Postgres to lower case
  • Type coercion errors — comparing strings to ints and getting silent casts
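
The first mistake is easy to see side by side. A hypothetical sketch: the dialect spellings below are real, but the `orders` table and the `render` helper are illustrative, not part of any product API.

```python
# The same "add 7 days" expression, spelled per engine. An agent that mixes
# these emits SQL that parses on no engine at all.
DATE_ADD_BY_DIALECT = {
    "snowflake": "DATEADD(day, 7, order_date)",
    "bigquery": "DATE_ADD(order_date, INTERVAL 7 DAY)",
    "postgres": "order_date + INTERVAL '7 days'",
}

def render(dialect: str) -> str:
    # A KeyError here is the point: there is no dialect-neutral spelling.
    return f"SELECT {DATE_ADD_BY_DIALECT[dialect]} FROM orders"

print(render("bigquery"))
```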

Catalog Grounding

Data Workers agents read the catalog before writing any SQL. Tables, columns, types, and relationships come from the real catalog, not from the model's memory. This eliminates fabricated columns and missing joins because the agent only has access to real schema metadata.
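
A minimal sketch of the idea, assuming a toy in-memory catalog (not Data Workers' actual API): the columns an agent wants to reference are checked against real schema metadata before any SQL is emitted.

```python
# Illustrative catalog: table name -> set of real columns.
CATALOG = {
    "orders": {"order_id", "user_id", "amount", "created_at"},
    "users": {"user_id", "email", "created_at"},
}

def check_columns(table: str, columns: list[str]) -> list[str]:
    """Return the columns the catalog does not know about (fabrications)."""
    known = CATALOG.get(table, set())
    return [c for c in columns if c not in known]

# "order_date" sounds right but does not exist: flagged before any SQL runs.
print(check_columns("orders", ["order_id", "order_date"]))  # ['order_date']
```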

Dialect Pinning

The agent knows which warehouse it is targeting (Snowflake, BigQuery, Redshift, Postgres, DuckDB) and uses a dialect-specific prompt template. Functions are validated against the target dialect's documentation, not the model's general SQL knowledge. Dialect drift becomes impossible because the agent cannot emit a function that is not in the approved list.
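
One way to pin a dialect is a per-dialect function allow-list. The sketch below uses a regex for brevity; a production validator would parse the SQL properly. The function names are real, but the tiny allow-lists are illustrative.

```python
import re

# Illustrative allow-lists; a real one would cover each dialect's full surface.
ALLOWED_FUNCTIONS = {
    "bigquery": {"DATE_ADD", "TIMESTAMP_TRUNC", "SAFE_CAST", "SUM"},
    "snowflake": {"DATEADD", "DATE_TRUNC", "TRY_CAST", "SUM"},
}

def disallowed_calls(sql: str, dialect: str) -> set[str]:
    """Return function calls in the query that the target dialect does not allow."""
    called = set(re.findall(r"\b([A-Z_]+)\s*\(", sql.upper()))
    return called - ALLOWED_FUNCTIONS[dialect]

sql = "SELECT DATEADD(day, 7, created_at) FROM orders"
print(disallowed_calls(sql, "bigquery"))   # {'DATEADD'}: rejected for BigQuery
print(disallowed_calls(sql, "snowflake"))  # set(): fine for Snowflake
```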

Dry-Run Validation

Every query the agent proposes runs through a dry-run validator before it touches real data. For BigQuery this means the query job's dry-run flag; for Snowflake, EXPLAIN. For Postgres it means executing inside a transaction that is rolled back. For dbt it means compiling without executing. If the query does not dry-run cleanly, the agent rewrites it rather than executing a broken version.

Unsafe DDL Guardrails

DROP, TRUNCATE, DELETE without WHERE, and ALTER TABLE on production tables all trigger hard stops. The agent can propose them but cannot execute them without explicit human approval. See how this plays into autonomous data engineering and the role-based access layer.
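
A minimal sketch of such a guardrail, using pattern matching for brevity (a production gate would classify a parsed statement, not run regexes over text):

```python
import re

# Statement shapes that trigger a hard stop pending human approval.
UNSAFE = [
    r"^\s*DROP\b",
    r"^\s*TRUNCATE\b",
    r"^\s*ALTER\s+TABLE\b",
    r"^\s*DELETE\b(?!.*\bWHERE\b)",  # DELETE with no WHERE clause
]

def needs_approval(sql: str) -> bool:
    s = sql.strip().upper()
    return any(re.search(p, s, re.DOTALL) for p in UNSAFE)

print(needs_approval("DELETE FROM orders"))                # True: human gate
print(needs_approval("DELETE FROM orders WHERE id = 42"))  # False
print(needs_approval("drop table orders"))                 # True
```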

Regression Tests for Agent SQL

Data Workers ships a golden query eval suite that runs every proposed agent query against a reference warehouse and compares results. Regressions get caught at CI, not in production. This is how we know the agent is not silently degrading across model updates. See AI for data infrastructure for how evals fit into the broader operational model.
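
A golden-query check reduces to: run the proposed SQL against a reference database and compare the rows to a stored fixture. A sketch with an illustrative `orders` table and `sqlite3` as the reference engine (not the actual eval suite):

```python
import sqlite3

def matches_golden(conn, sql, expected_rows):
    """True if the query reproduces the golden fixture (order-insensitive)."""
    return sorted(conn.execute(sql).fetchall()) == sorted(expected_rows)

ref = sqlite3.connect(":memory:")
ref.execute("CREATE TABLE orders (user_id INTEGER, amount REAL)")
ref.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (1, 5.0), (2, 7.5)])

golden = [(1, 15.0), (2, 7.5)]
agent_sql = "SELECT user_id, SUM(amount) FROM orders GROUP BY user_id"
print(matches_golden(ref, agent_sql, golden))  # True: no regression at CI
```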

The Human-in-the-Loop Layer

Even with grounding, dialect pinning, and dry-run validation, some queries deserve human review. Anything that writes to a production table, modifies schema, or joins tables from different tenants passes through an approval gate. Humans see the proposed query, the dry-run result, and the estimated cost before approving.

Agent SQL is only as good as its grounding. Without catalog access, dialect pinning, and dry-run validation, you get plausible-looking broken queries. To see the full grounding pipeline in production, book a demo.

A subtle failure mode: the agent writes SQL that is syntactically correct and returns results, but the results are wrong because the agent joined on the wrong key. For example, joining orders to users on user_id is obvious; joining orders to users on billing_address_id because both tables have that field is not, and the resulting numbers will be subtly wrong. Catalog grounding alone is not enough here — you also need relationship metadata that tells the agent which joins are canonical and which are accidental. Data Workers' catalog captures canonical join paths so the agent never guesses a wrong key.
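
A sketch of what relationship metadata can look like, with hypothetical tables and keys: the agent asks the catalog for the one canonical key instead of guessing among shared column names.

```python
# Illustrative relationship metadata: table pair -> the canonical join key.
# billing_address_id exists on both tables, but it is never offered as a key.
CANONICAL_JOINS = {
    ("orders", "users"): "user_id",
}

def join_key(left: str, right: str) -> str:
    key = CANONICAL_JOINS.get((left, right)) or CANONICAL_JOINS.get((right, left))
    if key is None:
        raise ValueError(f"no canonical join path between {left} and {right}")
    return key

print(join_key("orders", "users"))  # user_id
```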

Type coercion errors in SQL are quiet killers. An agent compares a string column to an integer literal, the database silently casts, the comparison returns a valid boolean, and the query produces wrong results without any error. The only defense is strict type checking at the validation layer. Dry-runs catch some of these; explicit type assertions catch the rest. Teams that add type assertions to their agent SQL pipelines typically cut 'wrong result' incidents by half.

Boolean logic errors are the third-most-common agent SQL bug after dialect drift and fabricated columns. An agent writes WHERE status = 'active' OR deleted_at IS NULL when the intended logic was AND. Both queries compile and run, both return results, and the wrong query returns silently incorrect data. The fix is to test agent-generated queries against expected-result fixtures from the catalog, not just syntactic validation. Integration tests catch logic bugs that syntactic tests miss.
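
The AND/OR example above can be turned into exactly such a fixture test. A sketch using stdlib `sqlite3` with an illustrative `users` table: both queries run cleanly, and only the fixture tells them apart.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, status TEXT, deleted_at TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?, ?)", [
    (1, "active", None),          # active, not deleted: should be returned
    (2, "active", "2024-01-01"),  # active but deleted: should not
    (3, "inactive", None),        # not active: should not
])

expected = [(1,)]
wrong = "SELECT id FROM users WHERE status = 'active' OR deleted_at IS NULL"
right = "SELECT id FROM users WHERE status = 'active' AND deleted_at IS NULL"
print(conn.execute(wrong).fetchall() == expected)  # False: OR leaks rows 2 and 3
print(conn.execute(right).fetchall() == expected)  # True
```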

Agent-generated migration scripts deserve extra scrutiny. A migration that the agent wrote correctly against the dev warehouse may fail against production if the production schema has diverged. Always dry-run migrations in a staging clone before promoting. Data Workers ships a migration rehearsal workflow that runs every agent-generated migration against a production clone and reports differences before allowing promotion. This pattern has caught dozens of potential production incidents in our own deployments.

Catalog grounding kills fabricated columns. Dialect pinning kills drift. Dry-run validation kills broken queries. Approval gates kill unsafe DDL.

See Data Workers in action

15 autonomous AI agents working across your entire data stack. MCP-native, open-source, deployed in minutes.

Book a Demo
