Configuration
This document covers how to configure Data Workers agents, autonomy levels, approval workflows, and operational settings.
Where Configuration Lives
Agent configuration can be managed in two ways:
- •Config file: Located at ~/.dataworkers/config.yaml (or set via the DATAWORKERS_CONFIG environment variable). Recommended for local development and CI/CD.
- •Admin API: Use the POST /api/v1/config endpoint to read and update configuration programmatically. Recommended for production deployments managed through infrastructure-as-code.
Changes made via the admin API take effect within 30 seconds. Config file changes are picked up on agent restart.
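The lookup order for the config file can be sketched in a few lines of Python. This is only an illustration of the rule described above (environment variable first, default path otherwise); the function name resolve_config_path is hypothetical and is not part of Data Workers.

```python
import os
from pathlib import Path

# Default location named in this document.
DEFAULT_CONFIG_PATH = Path("~/.dataworkers/config.yaml")


def resolve_config_path(env=None):
    """Return DATAWORKERS_CONFIG if it is set, otherwise the default path.

    `env` is any mapping of environment variables; it defaults to os.environ.
    This helper is a hypothetical illustration, not a Data Workers API.
    """
    env = os.environ if env is None else env
    override = env.get("DATAWORKERS_CONFIG")
    chosen = Path(override) if override else DEFAULT_CONFIG_PATH
    # Expand "~" so the result can be opened directly.
    return chosen.expanduser()
```

Passing the environment as a parameter keeps the helper easy to check in CI without touching the real shell environment.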
Autonomy Levels
Every agent supports three autonomy levels, configurable per agent and per operation:
Fully Autonomous — The agent executes without human intervention. Use for high-confidence, low-risk operations like monitoring, alerting, and documentation updates.
Semi-Autonomous — The agent proposes an action and waits for human approval before executing. Use for operations that change production state — schema migrations, pipeline deployments, access control changes.
Advisory Only — The agent analyzes and suggests, but does not execute. The human decides and performs the action. Use for initial deployments or highly sensitive environments.
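As a rough mental model, the three levels differ only in what happens after the agent has decided on an action. The sketch below is an assumption-laden illustration: the enum, the next_step function, and the "advisory" value are hypothetical names chosen for this example, not identifiers taken from Data Workers.

```python
from enum import Enum


class AutonomyLevel(Enum):
    AUTONOMOUS = "autonomous"
    SEMI_AUTONOMOUS = "semi-autonomous"
    # Assumed spelling; this document does not show the advisory config value.
    ADVISORY = "advisory"


def next_step(level, approved=False):
    """Return what the agent does with a proposed action at a given level."""
    if level is AutonomyLevel.AUTONOMOUS:
        return "execute"
    if level is AutonomyLevel.SEMI_AUTONOMOUS:
        # Proposes first, executes only once a human has approved.
        return "execute" if approved else "await_approval"
    # Advisory only: the human decides and performs the action.
    return "suggest_only"
```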
Example configuration in config.yaml:
agents.incident-debugging.diagnostics: autonomous
agents.incident-debugging.remediation: semi-autonomous
agents.schema-evolution.impact-analysis: autonomous
agents.schema-evolution.migration-execution: semi-autonomous
Approval Workflows
Define which operations require human approval and how approvals are routed:
- •Approval channels: Slack, email, or the Data Workers dashboard
- •Approval roles: Define who can approve which operation types
- •Timeout behavior: Configure what happens if approval is not received within a time window (default: operation is cancelled)
- •Confidence thresholds: Set a confidence score above which operations auto-execute and below which they require approval
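The interaction between confidence thresholds, approvals, and timeouts might look like the following sketch. The class, field names, and return values are hypothetical; the only behavior taken from this document is that high-confidence operations auto-execute and that an unanswered request is cancelled by default. Treating a score exactly at the threshold as auto-executing, and a rejection as a cancellation, are assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ApprovalPolicy:
    confidence_threshold: float   # hypothetical field name
    timeout_seconds: float        # hypothetical field name
    on_timeout: str = "cancel"    # documented default: operation is cancelled


def decide(policy, confidence, approved: Optional[bool], waited_seconds):
    """Return "execute", "cancel", or "wait" for one pending operation.

    approved is None while no reviewer has responded yet.
    """
    if confidence >= policy.confidence_threshold:
        return "execute"              # confident enough to skip approval
    if approved is True:
        return "execute"
    if approved is False:
        return "cancel"               # assumption: a rejection cancels
    if waited_seconds >= policy.timeout_seconds:
        return policy.on_timeout      # no answer within the time window
    return "wait"
```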
Quality & Safety
Confidence thresholds: Each agent reports a confidence score with its recommendations. Configure minimum thresholds for autonomous execution.
Cost guards: Set limits on expensive operations. For example, prevent warehouse queries estimated to cost above a configurable threshold without approval.
Dry-run mode: Test agent behavior without executing any changes. Agents report what they would do without doing it. Useful for initial setup and validation.
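Cost guards and dry-run mode combine naturally: the guard decides which action is appropriate, and dry-run decides whether that action is actually carried out. The function below is a minimal sketch under that reading; plan_query, its parameters, and the action names are hypothetical and are not Data Workers settings.

```python
def plan_query(estimated_cost, cost_limit, approved=False, dry_run=False):
    """Decide what to do with a warehouse query given its estimated cost.

    Returns the chosen action and whether it was actually performed.
    In dry-run mode the action is only reported, never performed.
    """
    # The guard trips only when the estimate exceeds the limit
    # and nobody has approved the query yet.
    needs_approval = estimated_cost > cost_limit and not approved
    action = "request_approval" if needs_approval else "run_query"
    return {"action": action, "performed": not dry_run}
```

Running the same inputs with dry_run=True during initial setup shows which queries would have been held for approval without sending any requests.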
Notifications
Configure how agents communicate with your team:
- •Slack: Real-time alerts, approval requests, incident summaries
- •Email: Periodic reports, incident notifications, compliance summaries
- •PagerDuty: Critical alerts that require immediate human attention
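The channel list above amounts to a routing table from message type to channel. A sketch of that table follows; the event-type keys and the channels_for helper are hypothetical labels derived from the bullet text, not configuration keys from Data Workers.

```python
# Message types (hypothetical labels) mapped to the channels listed above.
ROUTES = {
    "alert": ["slack"],
    "approval_request": ["slack"],
    "incident_summary": ["slack"],
    "periodic_report": ["email"],
    "incident_notification": ["email"],
    "compliance_summary": ["email"],
    "critical_alert": ["pagerduty"],
}


def channels_for(event_type):
    """Return the channels for a message type, or an empty list if unknown."""
    return ROUTES.get(event_type, [])
```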
Environment Variables
Key environment variables used for agent configuration and data infrastructure connections:
- •SNOWFLAKE_ACCOUNT: Snowflake account identifier
- •SNOWFLAKE_USER: Snowflake username
- •SNOWFLAKE_PASSWORD: Snowflake password
- •DATAWORKERS_CONFIG: Path to config file (default: ~/.dataworkers/config.yaml)
- •DATAWORKERS_TOKEN: OAuth bearer token for API access
Set these in your shell or .env file before starting Claude Code. See Getting Started for initial setup instructions.
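A quick pre-flight check can confirm the variables are visible to the process before any agent starts. The sketch below assumes all four credential variables are needed, which may not hold for every deployment; missing_variables is a hypothetical helper, and DATAWORKERS_CONFIG is left out because it has a default.

```python
import os

# Assumption: these are required in your setup; adjust the tuple as needed.
REQUIRED = (
    "SNOWFLAKE_ACCOUNT",
    "SNOWFLAKE_USER",
    "SNOWFLAKE_PASSWORD",
    "DATAWORKERS_TOKEN",
)


def missing_variables(env=None, required=REQUIRED):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    missing = missing_variables()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```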
Troubleshooting
"Agent not found" when adding via Claude Code — The agent name must match exactly (e.g., data-workers-incident, not dataworkers-incident). Run claude mcp list to see currently connected agents.
"Authentication failed" on agent startup — Ensure your credentials are set correctly. For environment variable-based auth, verify the variables are exported in the shell where Claude Code is running. For OAuth, re-run the authorization flow — tokens may have expired.
"Config validation error" on API update — The admin API validates config on write. Common causes: missing required fields, invalid confidence-threshold values (must be between 0 and 1), or YAML syntax errors in the config file.
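Threshold values can be checked locally before sending an update. The sketch below only illustrates the 0 to 1 rule mentioned above; threshold_errors and the key names in the usage example are hypothetical, and treating 0 and 1 themselves as valid is an assumption.

```python
def threshold_errors(thresholds):
    """Return a list of problems found in a mapping of name -> threshold."""
    errors = []
    for key, value in thresholds.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{key}: expected a number, got {value!r}")
        elif not 0 <= value <= 1:
            errors.append(f"{key}: {value} is outside the range 0 to 1")
    return errors
```

For example, threshold_errors({"remediation": 1.5}) returns a single message naming the offending key, while a mapping of valid values returns an empty list.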