Guide · Last updated Apr 24, 2026 · 5 min read

Insights Agent Data Exploration

Data Workers' Insights Agent enables natural language data exploration across your entire data warehouse, translating business questions into SQL queries, visualizing results, and surfacing related datasets that analysts might not know exist. Data exploration is the bridge between having data and getting value from it. The Insights Agent lowers the barrier by letting analysts explore data through questions rather than SQL, while preserving the precision that SQL provides.

This guide covers the Insights Agent's natural language query capabilities, schema-aware SQL generation, result visualization, data discovery features, and strategies for making data exploration accessible to non-technical stakeholders.

The Data Exploration Gap

Most organizations have far more data than they use. The data warehouse contains thousands of tables, but analysts query the same 50 because those are the ones they know. New datasets are loaded but never explored because nobody has time to profile them. Business stakeholders have questions but cannot express them in SQL, so they wait for an analyst to become available. The exploration gap — the distance between available data and utilized data — grows wider every quarter.

The Insights Agent closes this gap in three ways. First, it translates natural language questions into SQL so non-SQL users can explore directly. Second, it surfaces related datasets that analysts might not know about based on schema similarity, naming patterns, and lineage connections. Third, it generates data profiles for new datasets automatically, making them discoverable before anyone manually explores them.

Exploration Barrier	Traditional Solution	Insights Agent Solution
SQL skill requirement	Training programs, BI tools	Natural language to SQL with follow-up refinement
Schema discovery	Browse catalog, ask colleagues	Automatic schema matching and dataset recommendation
Data profiling	Manual EDA notebooks	Auto-generated profiles with statistical summaries
Cross-dataset discovery	Tribal knowledge	Automated join discovery and cross-table analysis
Result interpretation	Analyst explanation	Auto-generated summaries with context and caveats
Exploration sharing	Copy-paste screenshots	Shareable exploration sessions with reproducible queries

Natural Language to SQL

The Insights Agent translates business questions into SQL by leveraging the catalog metadata, business glossary, and schema information maintained by the Catalog Agent. When a user asks 'What was our revenue by region last quarter?', the agent resolves 'revenue' to the correct column using the business glossary, identifies the appropriate table, applies the correct date filter for 'last quarter', and groups by the region column. The generated SQL is shown to the user for verification before execution.
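The translation steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the Insights Agent's implementation: the `GLOSSARY` mapping, the `sales.orders` table, the `net_amount` and `order_date` columns, and the function names are all hypothetical.

```python
from __future__ import annotations

from datetime import date, timedelta

# Hypothetical glossary: business term -> (table, SQL expression).
GLOSSARY = {
    "revenue": ("sales.orders", "SUM(net_amount)"),
}


def last_quarter_range(today: date) -> tuple[date, date]:
    """Return the first and last day of the calendar quarter before `today`."""
    current_q_start_month = 3 * ((today.month - 1) // 3) + 1
    current_q_start = date(today.year, current_q_start_month, 1)
    # The day before the current quarter starts is the last day of the previous one.
    prev_q_end = current_q_start - timedelta(days=1)
    prev_q_start_month = 3 * ((prev_q_end.month - 1) // 3) + 1
    prev_q_start = date(prev_q_end.year, prev_q_start_month, 1)
    return prev_q_start, prev_q_end


def build_query(term: str, group_by: str, today: date) -> str:
    """Resolve a glossary term and build a grouped query for last quarter."""
    table, expression = GLOSSARY[term]
    start, end = last_quarter_range(today)
    # `order_date` is an assumed column name for the date filter.
    return (
        f"SELECT {group_by}, {expression} AS {term}\n"
        f"FROM {table}\n"
        f"WHERE order_date BETWEEN DATE '{start}' AND DATE '{end}'\n"
        f"GROUP BY {group_by}"
    )
```

Calling `build_query("revenue", "region", date(2026, 4, 24))` yields a query filtered to 2026-01-01 through 2026-03-31. Returning the SQL as text rather than executing it mirrors the verification step described above, where the user sees the query before it runs.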

The key differentiator from generic text-to-SQL tools is context awareness. The agent knows your specific schema, your business terminology, your data quality issues, and your access controls. It generates SQL that uses your organization's canonical revenue definition (not a guess), joins through the correct intermediate tables, and respects row-level security policies. Because each query is grounded in this context, it is more likely to be correct than the output of a tool that sees only the question text.

• Business glossary integration: resolves business terms to canonical table and column references
• Schema-aware generation: generates SQL that matches your specific warehouse dialect and schema structure
• Access control respect: generates queries that only access tables the user has permission to view
• Ambiguity resolution: asks clarifying questions when a business term maps to multiple possible interpretations
• Follow-up refinement: supports iterative exploration with 'drill into region X' or 'add last year for comparison'
• SQL explanation: provides plain-language explanation of the generated SQL for learning and verification
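Two of the behaviors listed above, access control and ambiguity resolution, interact: permissions narrow the candidate definitions before the agent decides whether to ask a question. The sketch below shows one way that could work. The glossary contents, table names, and the `resolve_term` function are hypothetical and used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    table: str
    column: str


# Hypothetical glossary in which one business term has several meanings.
GLOSSARY = {
    "revenue": [
        Candidate("finance.invoices", "net_revenue"),
        Candidate("sales.orders", "gross_amount"),
    ],
    "region": [Candidate("sales.orders", "region")],
}


def resolve_term(term, allowed_tables):
    """Return ("resolved", candidate), ("clarify", question) or ("unavailable", message)."""
    # Drop any definition that lives in a table the user cannot read.
    candidates = [c for c in GLOSSARY.get(term, []) if c.table in allowed_tables]
    if not candidates:
        return "unavailable", f"No accessible definition of '{term}' was found."
    if len(candidates) == 1:
        return "resolved", candidates[0]
    # More than one accessible meaning remains, so ask instead of guessing.
    options = ", ".join(f"{c.table}.{c.column}" for c in candidates)
    return "clarify", f"'{term}' could mean: {options}. Which one do you mean?"
```

A user with access to both tables gets a clarifying question for 'revenue', while a user who can only read `sales.orders` gets a direct resolution, because the permission filter has already removed the ambiguity.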

Dataset Discovery and Recommendation

The Insights Agent recommends related datasets during exploration sessions. When an analyst queries the orders table, the agent suggests the customer_segments table (joinable on customer_id), the product_catalog table (joinable on product_id), and the marketing_campaigns table (correlatable by date). These recommendations are based on foreign key relationships, naming conventions, query co-occurrence patterns, and lineage connections.
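Of the signals named above, the naming-convention one is the simplest to illustrate. The following sketch ranks tables by the `*_id` columns they share with the table being queried. It assumes a hypothetical in-memory `SCHEMA` snapshot and deliberately ignores foreign keys, query co-occurrence, and lineage, which a real recommender would also weigh.

```python
# Hypothetical schema snapshot: table name -> set of column names.
SCHEMA = {
    "orders": {"order_id", "customer_id", "product_id", "order_date"},
    "customer_segments": {"customer_id", "segment"},
    "product_catalog": {"product_id", "category"},
    "web_sessions": {"session_id", "started_at"},
}


def recommend_joinable(table, schema):
    """Rank other tables by how many *_id columns they share with `table`."""
    keys = {c for c in schema[table] if c.endswith("_id")}
    matches = []
    for other, columns in schema.items():
        if other == table:
            continue
        shared = sorted(keys & columns)
        if shared:
            matches.append((other, shared))
    # Most shared keys first, then alphabetical for a stable order.
    return sorted(matches, key=lambda m: (-len(m[1]), m[0]))
```

For the `orders` table this returns `customer_segments` (via `customer_id`) and `product_catalog` (via `product_id`), and skips `web_sessions`, which shares no key column.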

Discovery is especially valuable for new team members and cross-functional analysts. A marketing analyst exploring campaign performance might not know that the data warehouse contains a customer_ltv table that would enrich their analysis. The Insights Agent surfaces this connection automatically, bridging the tribal knowledge gap that slows down new hires and cross-functional work.

Automated Data Profiling

For new or unfamiliar datasets, the Insights Agent generates profiles that include row counts, column types, null rates, cardinality, distribution summaries, sample values, and detected patterns (dates, emails, currencies, etc.). These profiles are generated on demand when a user first explores a table and are cached for subsequent access.
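To make the profile contents concrete, here is a minimal sketch that computes a few of those statistics for a single column held in memory. The `profile_column` function, the field names, and the simplified email pattern are assumptions for illustration. A real profiler would push this work down to the warehouse rather than pull rows into Python.

```python
import re
from collections import Counter

# Deliberately loose pattern: something@something.something, no whitespace.
EMAIL = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def profile_column(values):
    """Summarize one column given as a list of values (None means NULL)."""
    non_null = [v for v in values if v is not None]
    total = len(values)
    return {
        "row_count": total,
        "null_rate": (total - len(non_null)) / total if total else 0.0,
        "cardinality": len(set(non_null)),
        # The most frequent values double as representative samples.
        "sample_values": [v for v, _ in Counter(non_null).most_common(3)],
        "looks_like_email": bool(non_null) and all(
            isinstance(v, str) and EMAIL.match(v) is not None for v in non_null
        ),
    }
```

Caching, as described above, could then be as simple as storing the returned dictionary keyed by table and column name.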

Profiles also include data quality indicators: columns with high null rates are flagged, columns with suspicious distributions are highlighted, and tables with no recent updates are marked as potentially stale. These indicators help analysts assess data reliability before building analysis on top of unfamiliar datasets.
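Two of these indicators reduce to simple threshold checks. The sketch below flags high null rates and stale tables. The 50% null-rate cutoff, the 30-day staleness window, and the flag names are arbitrary assumptions chosen for the example, not documented defaults.

```python
from datetime import datetime, timedelta


def quality_flags(null_rates, last_updated, now, max_null_rate=0.5, max_age_days=30):
    """Return a list of warning flags for one table.

    null_rates: mapping of column name -> fraction of NULL values.
    last_updated / now: datetimes used for the staleness check.
    """
    flags = []
    for column, rate in sorted(null_rates.items()):
        if rate > max_null_rate:
            flags.append(f"high_null_rate:{column}")
    if now - last_updated > timedelta(days=max_age_days):
        flags.append("potentially_stale")
    return flags
```

Passing `now` explicitly rather than reading the clock inside the function keeps the check deterministic and easy to test.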

Exploration Session Management

Exploration is iterative. The Insights Agent maintains exploration sessions that track the sequence of questions, generated SQL, results, and insights discovered. Sessions can be saved, shared with colleagues, and resumed later. This capability transforms ad-hoc exploration from ephemeral query-running into documented analysis that builds organizational knowledge.
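One way to picture a saved session is as an ordered list of steps that serializes to JSON and can be reloaded later. The `Step` and `ExplorationSession` classes below are hypothetical and only sketch the idea. Query results are left out to keep the example short.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Step:
    question: str
    sql: str
    note: str = ""


@dataclass
class ExplorationSession:
    title: str
    steps: list = field(default_factory=list)

    def record(self, question, sql, note=""):
        """Append one question/SQL pair, with an optional insight note."""
        self.steps.append(Step(question, sql, note))

    def to_json(self):
        """Serialize the whole session so it can be saved or shared."""
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, payload):
        """Rebuild a session from JSON so a colleague can resume it."""
        data = json.loads(payload)
        session = cls(title=data["title"])
        for step in data["steps"]:
            session.record(**step)
        return session
```

Because every step keeps the generated SQL next to the question that produced it, a reloaded session is reproducible: the recipient can rerun each query instead of relying on a screenshot.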

Shared exploration sessions also support collaborative analysis. An analyst can start an exploration, discover an interesting pattern, and share the session with a colleague who picks up where they left off. The session history provides full context, eliminating the 'what query did you run to get this number?' conversations that waste time in collaborative analytics.

Enabling Self-Service Analytics

The Insights Agent's exploration capabilities are a stepping stone to self-service analytics. As analysts explore data through natural language, they learn the schema, discover relationships, and build intuition about the data platform. The generated SQL serves as an educational tool: analysts can see the SQL produced from their questions and gradually learn to write their own optimized queries.

For teams building comprehensive insights capabilities, data exploration works alongside developer productivity and query optimization to provide full-spectrum platform intelligence.

Data exploration should not require SQL expertise or tribal knowledge. The Insights Agent translates business questions into warehouse queries, recommends related datasets, profiles unfamiliar tables, and manages exploration sessions — making the full breadth of the data warehouse accessible to everyone in the organization.
