Agent Reference
Data Workers includes 15 specialized agents. Each is an independent MCP server that can be enabled, disabled, and configured individually.
Integrations listed below as built-in have MCP connectors maintained by the Data Workers team. Many additional tools are supported through community-maintained or custom MCP connectors.
Incident Debugging Agent
claude mcp add data-workers-incident
Monitors data infrastructure for failures and anomalies, performs automated root cause analysis, and provides actionable remediation steps. Operates read-only by default.
- Real-time failure detection across pipelines, queries, and data loads
- Automated root cause analysis with cross-system correlation
- Historical pattern matching against previously resolved incidents
- Remediation recommendations with confidence scoring
- Escalation to human operators when confidence is low
Integrations: Airflow, Dagster, Prefect, Snowflake, BigQuery, Databricks, Redshift, Grafana, PagerDuty, Opsgenie, New Relic, ServiceNow, Jira SM
Autonomy levels: Read-only diagnostics (default: autonomous), remediation execution (default: semi-autonomous)
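The combination of confidence scoring and low-confidence escalation can be pictured as a simple gate. The following is a minimal sketch, not the agent's actual logic: the `Recommendation` type, the `route` function, and the 0.8 threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical cutoff; the product's real threshold is not documented here.
CONFIDENCE_THRESHOLD = 0.8


@dataclass
class Recommendation:
    action: str
    confidence: float  # expected range: 0.0 to 1.0


def route(rec: Recommendation, threshold: float = CONFIDENCE_THRESHOLD) -> str:
    """Return 'propose' for confident recommendations, otherwise 'escalate'."""
    if not 0.0 <= rec.confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return "propose" if rec.confidence >= threshold else "escalate"
```

Under these assumptions, a recommendation scored 0.93 would be proposed directly, while one scored 0.40 would be handed to a human operator.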
Pipeline Building Agent
claude mcp add data-workers-pipeline
Generates, modifies, and deploys data pipelines based on natural language descriptions or detected requirements. Handles DAG construction, dependency management, and scheduling.
- Generate new pipelines from natural language specifications
- Modify existing DAGs to add sources, transformations, or destinations
- Automated dependency resolution and scheduling optimization
- Dry-run validation before deployment
- Rollback support for failed deployments
Integrations: Airflow, Dagster, Prefect, dbt, Fivetran, Airbyte
Autonomy levels: Pipeline generation (default: autonomous), deployment to production (default: semi-autonomous)
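Dependency resolution for a DAG usually comes down to a topological sort. As an illustration only (the agent's own resolver is not described in this reference), the sketch below uses Python's standard `graphlib` module; the task names and the `execution_order` helper are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def execution_order(deps: dict) -> list:
    """Order tasks so that each one runs after all of its upstream tasks.

    `deps` maps a task name to the set of tasks it depends on.
    Raises graphlib.CycleError if the dependencies contain a cycle.
    """
    return list(TopologicalSorter(deps).static_order())


# Hypothetical pipeline: two extracts feed a join, which feeds a load.
deps = {
    "extract_orders": set(),
    "extract_customers": set(),
    "join_orders_customers": {"extract_orders", "extract_customers"},
    "load_mart": {"join_orders_customers"},
}
order = execution_order(deps)
```

A dry-run validation step could call `execution_order` and treat a `CycleError` as a reason to reject the pipeline before deployment.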
Quality Monitoring Agent
claude mcp add data-workers-quality
Continuously monitors data quality across your warehouse, detects anomalies and drift, and generates quality rules based on observed data patterns. Operates read-only by default.
- Automated anomaly detection on freshness, volume, and distribution
- Schema drift detection and alerting
- Quality rule generation from historical data patterns
- Impact analysis for quality issues (downstream dependencies)
- Trend reporting and quality scorecards
Integrations: Snowflake, BigQuery, Databricks, Redshift, Monte Carlo, dbt, Great Expectations, Soda Cloud
Autonomy levels: Monitoring and alerting (default: autonomous), quality rule enforcement (default: semi-autonomous)
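To make volume anomaly detection concrete, here is a minimal sketch that flags a daily row count whose z-score against recent history exceeds a threshold. This is one common approach, not necessarily the one the agent uses; the function name and the default threshold of 3.0 are assumptions.

```python
from statistics import mean, stdev


def is_volume_anomaly(history: list, today: int, z_threshold: float = 3.0) -> bool:
    """Flag `today` if it deviates strongly from the historical row counts."""
    if len(history) < 2:
        return False  # not enough data to estimate spread
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # Perfectly constant history: any change at all is unusual.
        return today != mu
    return abs(today - mu) / sigma > z_threshold
```

With a history of roughly 1,000 rows per day, a load of 400 rows would be flagged, while 1,005 rows would not.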
Schema Evolution Agent
claude mcp add data-workers-schema
Manages schema changes across your data warehouse: detects breaking changes, generates migration scripts, and coordinates schema evolution across dependent systems.
- Breaking change detection before deployment
- Automated migration script generation
- Impact analysis across downstream consumers
- Version-controlled schema history
- Cross-system schema synchronization
Integrations: Snowflake, BigQuery, Databricks, Redshift, dbt, Alation, DataHub
Autonomy levels: Impact analysis (default: autonomous), schema migration execution (default: semi-autonomous)
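Breaking change detection can be approximated by diffing two versions of a table schema. The sketch below is a simplified illustration under stated assumptions: schemas are plain column-to-type mappings, removed columns and any type change count as breaking, and added columns do not. Real systems may treat some type widenings as safe; the `breaking_changes` helper is hypothetical.

```python
def breaking_changes(old: dict, new: dict) -> list:
    """List changes between two column->type mappings that could break consumers."""
    issues = []
    for column, old_type in old.items():
        if column not in new:
            issues.append(f"column removed: {column}")
        elif new[column] != old_type:
            issues.append(f"type changed: {column} {old_type} -> {new[column]}")
    # Columns present only in `new` are additive and ignored here.
    return issues
```

An empty result suggests the change is additive only; a non-empty result is the kind of finding that would feed downstream impact analysis.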
Data Context & Catalog Agent
claude mcp add data-workers-context
Maintains a living data catalog by automatically documenting tables, columns, lineage, and usage patterns. Provides context to other agents and human users. Includes trust scoring, impact analysis, intent classification, and lineage visualization.
- Automated documentation generation from data profiling
- Column-level lineage tracking and lineage visualization
- Trust scoring for data assets based on quality, freshness, and usage
- Impact analysis for understanding downstream effects of changes
- Intent classification for natural language catalog queries
- Usage pattern analysis (who queries what, how often)
- Business glossary management
- Natural language search across your data catalog
Integrations: DataHub, Atlan, Alation, Snowflake, BigQuery, Databricks, Looker, Tableau
Autonomy levels: Documentation and cataloging (default: autonomous), metadata corrections (default: semi-autonomous)
Governance & Security Agent
claude mcp add data-workers-governance
Enforces data governance policies, manages access controls, detects PII, and ensures compliance with organizational and regulatory requirements.
- Automated PII detection and classification
- Access control policy enforcement
- Compliance monitoring and reporting
- Data retention policy management
- Audit trail generation for regulatory requirements
Integrations: Snowflake, BigQuery, Databricks, DataHub, Atlan, Alation, Collibra
Autonomy levels: Policy monitoring (default: autonomous), access revocation (default: semi-autonomous)
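As a rough illustration of PII classification, the sketch below samples column values and labels the column when most values match a known pattern. It is a deliberately naive, regex-only example; the patterns, the 0.8 match ratio, and the `classify_column` helper are assumptions, and production detectors typically combine many more signals.

```python
import re
from typing import Optional

# Hypothetical, intentionally simple patterns for illustration only.
PII_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "us_ssn": re.compile(r"^\d{3}-\d{2}-\d{4}$"),
}


def classify_column(samples: list, min_match_ratio: float = 0.8) -> Optional[str]:
    """Return a PII label if enough non-empty samples match one pattern."""
    values = [s for s in samples if s]
    if not values:
        return None
    for label, pattern in PII_PATTERNS.items():
        hits = sum(1 for v in values if pattern.match(v))
        if hits / len(values) >= min_match_ratio:
            return label
    return None
```

Requiring a match ratio rather than a single hit keeps one stray email address in a free-text column from tagging the whole column as PII.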
Real-Time Streaming Agent
claude mcp add data-workers-streaming
Monitors and manages real-time data streams, detects processing delays and data loss, and optimizes streaming pipeline performance.
- Stream health monitoring (lag, throughput, error rates)
- Consumer group management and rebalancing
- Dead letter queue analysis and replay
- Stream processing optimization recommendations
- Automated alerting on stream degradation
Integrations: Kafka, Kinesis, Pulsar, Flink, Spark Streaming
Autonomy levels: Monitoring and alerting (default: autonomous), stream reconfiguration (default: semi-autonomous)
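Consumer lag, one of the health signals listed above, is conventionally the gap between the newest offset in a partition and the offset a consumer group has committed. The sketch below computes it from two plain dictionaries so it stays independent of any client library; the function name and the choice to treat a missing commit as offset 0 are assumptions.

```python
def consumer_lag(end_offsets: dict, committed: dict) -> dict:
    """Per-partition lag: latest offset minus the group's committed offset.

    Partitions with no committed offset are treated as starting from 0
    (an assumption made for this sketch).
    """
    return {
        partition: max(end - committed.get(partition, 0), 0)
        for partition, end in end_offsets.items()
    }
```

Summing the values gives total lag for the group, which is the figure an alert on stream degradation would typically watch.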
Swarm Orchestration Agent
claude mcp add data-workers-swarm
Coordinates multi-agent workflows, manages task delegation between agents, and ensures coherent end-to-end issue resolution across the swarm.
- Cross-agent workflow orchestration
- Task prioritization and delegation
- Conflict resolution when agents have competing recommendations
- End-to-end progress tracking for multi-agent operations
- Agent health monitoring and failover
Integrations: All other Data Workers agents
Autonomy levels: Coordination and routing (default: autonomous), workflow execution (default: semi-autonomous per workflow)
Cost Savings & Cleanup Agent
claude mcp add data-workers-cost
Analyzes warehouse usage and spending, identifies unused tables, optimizes expensive queries, and recommends cost reduction strategies.
- Unused table and view detection
- Expensive query identification and optimization
- Storage optimization recommendations
- Warehouse sizing and scheduling analysis
- Cost attribution by team, project, or pipeline
Integrations: Snowflake, BigQuery, Databricks, Redshift
Autonomy levels: Analysis and recommendations (default: autonomous), resource cleanup (default: semi-autonomous)
Data Migration Agent
claude mcp add data-workers-migration
Plans and executes data migrations between warehouses, databases, or environments. Handles schema translation, data validation, and cutover coordination.
- Cross-platform schema translation
- Automated data validation and reconciliation
- Incremental migration with progress tracking
- Rollback planning and execution
- Cutover coordination with minimal downtime
Integrations: Snowflake, BigQuery, Databricks, Redshift, PostgreSQL, MySQL
Autonomy levels: Migration planning (default: autonomous), migration execution (default: semi-autonomous)
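Post-migration reconciliation is often done by comparing row counts plus an order-independent checksum of the rows on each side. The sketch below shows that idea on in-memory rows; it is an illustration rather than the agent's method, and `table_fingerprint` and `reconcile` are hypothetical helpers. On real warehouses the hashing would normally be pushed down into SQL.

```python
import hashlib


def table_fingerprint(rows) -> tuple:
    """Return (row_count, checksum) where the checksum ignores row order."""
    digests = sorted(
        hashlib.sha256(repr(tuple(row)).encode("utf-8")).hexdigest()
        for row in rows
    )
    combined = hashlib.sha256("".join(digests).encode("ascii")).hexdigest()
    return len(digests), combined


def reconcile(source_rows, target_rows) -> bool:
    """True when both sides hold the same rows, regardless of order."""
    return table_fingerprint(source_rows) == table_fingerprint(target_rows)
```

Sorting the per-row digests before combining them is what makes the comparison insensitive to the order in which each platform returns rows.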
Data Science & Insights Agent
claude mcp add data-workers-insights
Performs exploratory data analysis, generates statistical summaries, identifies trends, and provides data-driven insights to support decision making.
- Automated exploratory data analysis
- Statistical anomaly detection and trend identification
- Natural language query interface for ad-hoc analysis
- Automated report generation
- Feature engineering suggestions for ML workflows
Integrations: Snowflake, BigQuery, Databricks, Jupyter, Looker, Tableau
Autonomy levels: Analysis and reporting (default: autonomous), data modifications (default: semi-autonomous)
Usage Intelligence Agent
claude mcp add data-workers-usage-intelligence
See how your data team actually works. Tracks practitioner usage patterns across every MCP tool, detects workflow patterns, measures adoption, and provides full agent observability with audit trails and drift detection.
- Tool usage metrics: volume, unique users, and trends per tool and agent
- Workflow pattern detection: common multi-agent tool sequences
- Adoption dashboards: which agents are fully adopted vs. shelfware
- Usage anomaly detection: drops, spikes, and behavior shifts
- Session analytics: engagement depth and power user identification
- Usage heatmaps: activity by hour, day, and agent
- Agent health monitoring: SHA-256 audit trails, drift detection, health checks
Integrations: All other Data Workers agents, Grafana, Datadog
Autonomy levels: Monitoring and reporting (default: autonomous), agent reconfiguration (default: semi-autonomous)
Data Connectors Agent
claude mcp add data-workers-connectors
Unified access to 40+ data platforms and enterprise tools. Catalog discovery across Snowflake, BigQuery, Databricks, AWS Glue, Hive Metastore, OpenMetadata, DataHub, Purview, Dataplex, Nessie, and more. Cross-catalog search with capability negotiation.
- Connect to 40+ data platforms through a single interface
- Cross-catalog search and discovery across multiple data sources
- Capability negotiation: automatically adapts to each platform's features
- Unified metadata access regardless of underlying platform
- Automatic credential management and connection pooling
Integrations: Snowflake, BigQuery, Databricks, AWS Glue, Hive Metastore, OpenMetadata, DataHub, Azure Purview, Google Dataplex, Apache Nessie, and 30+ more
Autonomy levels: Discovery and search (default: autonomous), connection management (default: semi-autonomous)
Observability Agent
claude mcp add data-workers-observability
Agent health monitoring, audit trails, drift detection, and performance metrics. Full observability into agent behavior for enterprise compliance.
- Agent health monitoring with real-time status dashboards
- Immutable audit trails for every agent action and decision
- Drift detection: alerts when agent behavior deviates from baselines
- Performance metrics and latency tracking across all agents
- Compliance reporting for enterprise governance requirements
Integrations: All other Data Workers agents, Grafana, Datadog, New Relic
Autonomy levels: Monitoring and alerting (default: autonomous), agent reconfiguration (default: semi-autonomous)
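One common way to make an audit trail tamper-evident is to chain entries with SHA-256 so that each entry's hash covers the previous entry's hash. The sketch below illustrates that general technique; it is an assumption about how such a trail could work, not a description of the Observability Agent's storage format, and `append_entry`, `verify`, and the entry fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log: list, agent: str, action: str) -> dict:
    """Append an entry whose hash covers its content and the previous hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    body = {"agent": agent, "action": action, "prev_hash": prev_hash}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify(log: list) -> bool:
    """Recompute every hash; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        body = {k: entry[k] for k in ("agent", "action", "prev_hash")}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash depends on the one before it, changing an old entry invalidates every later entry, which is what lets a verifier detect after-the-fact edits.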
ML & Data Science Agent
claude mcp add data-workers-ml
Assists with machine learning workflows: feature engineering, model training pipelines, experiment tracking, and model deployment. Bridges the gap between data engineering and data science.
- Automated feature engineering from warehouse tables
- ML pipeline generation and orchestration
- Experiment tracking and model versioning
- Model performance monitoring and drift detection
- Integration with feature stores and model registries
Integrations: Snowflake, BigQuery, Databricks, MLflow, Weights & Biases, SageMaker, Vertex AI
Autonomy levels: Analysis and recommendations (default: autonomous), model deployment (default: semi-autonomous)