Getting Started

Name: Dataworkers
Availability: OnlineOnly
Author: Dataworkers

Estimated time: 5–10 minutes to connect your first agent.

This guide walks you through connecting your first Data Workers agent and running a sample diagnostic.

Prerequisites

•A data warehouse: Snowflake, BigQuery, Databricks, or Redshift
•An orchestrator (optional but recommended): Airflow, Dagster, or Prefect
•Claude Code installed (latest version recommended)
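
One way to confirm the local tooling is in place before continuing is to check that the command-line tools used in this guide are on your PATH. This is a minimal sketch: it assumes `npx` is supplied by a Node.js installation (the guide's commands rely on it), and it checks neither versions nor warehouse access. The `check_tool` helper is a hypothetical name used only for illustration.

```shell
#!/bin/sh
# Report whether a command-line tool is available on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Count how many of the tools used in this guide are absent.
missing=0
for tool in claude npx; do
  check_tool "$tool" || missing=$((missing + 1))
done
echo "missing tools: $missing"
```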

No additional infrastructure is required for SaaS deployments. Agents connect to your existing tools via MCP. See the Enterprise Guide for VPC and on-premise deployment options.

Community Edition (free): All 15 agents operate in read-only mode — diagnostics, analysis, discovery, and recommendations. Write operations (deploying pipelines, applying schema changes, triggering workflows) require the Pro tier. See the Licensing & Tiers page for details. The full source code is available at github.com/DataWorkersProject/dataworkers-claw-community.

Quick Start (Claude Code)

Option 1: Run with npx

The fastest way to get started is a single command:

npx dw-claw

This launches the full Data Workers MCP server with all 15 agents and 155+ tools. It works with Claude Code, Cursor, and any other MCP client.

Option 2: Register with Claude Code

To register Data Workers as a persistent MCP server in Claude Code:

claude mcp add data-workers -- npx dw-claw

Once registered, Claude Code auto-discovers the tools each agent exposes. Agents work independently or as a coordinated swarm — you choose which to enable.

You do not need to configure inter-agent communication. The agents exchange state through a shared context layer and coordinate automatically when more than one agent is active.

Fine-grained control

If you only need specific agents rather than the full swarm, you can add them individually:

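claude mcp add data-workers-incident -- npx data-workers-incident

claude mcp add data-workers-quality -- npx data-workers-quality

claude mcp add data-workers-schema -- npx data-workers-schema

claude mcp add data-workers-context -- npx data-workers-context

claude mcp add data-workers-pipeline -- npx data-workers-pipeline

claude mcp add data-workers-governance -- npx data-workers-governance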

See the Agent Reference for the complete list of 15 specialized agents.
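
If you are adding several agents, a small loop avoids repeating the command. The sketch below assumes each agent is launched with `npx <agent-name>`, where the name matches the MCP server name (as in `npx data-workers-incident`). The `register_agents` function and the `DRY_RUN` switch are hypothetical helpers for illustration, not part of Data Workers. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Agent suffixes to register; edit this list to suit your setup.
AGENTS="incident quality schema"

# DRY_RUN=1 (the default here) only prints the commands; set DRY_RUN=0 to run them.
DRY_RUN="${DRY_RUN:-1}"

register_agents() {
  for agent in $AGENTS; do
    name="data-workers-$agent"
    if [ "$DRY_RUN" = "1" ]; then
      echo "claude mcp add $name -- npx $name"
    else
      # Assumption: the npx package name equals the MCP server name.
      claude mcp add "$name" -- npx "$name"
    fi
  done
}

register_agents
```

Review the printed commands first, then rerun with DRY_RUN=0 once they match what you intend to register.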

Quick Start (Cursor)

To use Data Workers agents in Cursor, add them to your MCP configuration.

Create or edit .cursor/mcp.json in your project root. Add one entry for each agent you want to use (data-workers-incident, data-workers-quality, data-workers-schema, etc.), with npx as the command.

Alternatively, add servers through Cursor Settings > MCP Servers. Restart Cursor after adding the configuration.
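
As an illustration, the following sketch writes a .cursor/mcp.json containing two agents. The `mcpServers` layout follows Cursor's MCP configuration format. The assumption here is that each agent starts with `npx <agent-name>` and needs no further arguments, so check the Agent Reference for anything agent-specific. The script overwrites any existing file at that path, so merge by hand if you already have one.

```shell
#!/bin/sh
# Create the project-level Cursor MCP config (overwrites an existing file).
mkdir -p .cursor
cat > .cursor/mcp.json <<'EOF'
{
  "mcpServers": {
    "data-workers-incident": {
      "command": "npx",
      "args": ["data-workers-incident"]
    },
    "data-workers-quality": {
      "command": "npx",
      "args": ["data-workers-quality"]
    }
  }
}
EOF
echo "wrote .cursor/mcp.json"
```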

Quick Start (OpenClaw / Other MCP Clients)

Data Workers agents are standard MCP servers and work with any MCP-compatible client including OpenClaw, Cline, Continue, and custom implementations.

Install and run any agent server with npx data-workers-incident (or any agent name). Connect using your client's MCP configuration via stdio transport. Each agent runs as a standalone process.

For detailed setup instructions across all supported MCP clients — including Claude Code, Cursor, GitHub Copilot, Microsoft Copilot, OpenClaw, Cline, and Continue — see the MCP Client Setup guide. That page covers the universal npx dw-claw command, per-client configuration differences, individual agent selection, and troubleshooting.

Configure Connections

After adding agents, configure credentials so they can reach your data infrastructure:

•Environment variables: Set connection credentials (e.g., SNOWFLAKE_ACCOUNT, SNOWFLAKE_USER, SNOWFLAKE_PASSWORD) in your shell or .env file before starting Claude Code.
•OAuth: For tools that support OAuth (e.g., BigQuery, Databricks), the agent will prompt you through a browser-based authorization flow on first connection.
•API keys: Some integrations (e.g., PagerDuty, Grafana) require an API key, which you can set via environment variable or pass during agent setup.

Refer to each agent's integration list in the Agent Reference for tool-specific credential requirements.
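
For the environment-variable route, a minimal sketch for a Snowflake connection might look like the following. The variable names are the examples given above. The values are placeholders, and the exact set each agent expects is an assumption to confirm against the Agent Reference.

```shell
#!/bin/sh
# Placeholder values: replace with your own and keep real secrets out of version control.
export SNOWFLAKE_ACCOUNT="your_account_identifier"
export SNOWFLAKE_USER="your_user"
export SNOWFLAKE_PASSWORD="your_password"

# Count the SNOWFLAKE_* variables visible to child processes (prints a count, not the values).
env | grep -c '^SNOWFLAKE_'
```

Run this in the same shell session you use to start Claude Code so that the agent processes inherit the variables.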

Quick Start (API / Standalone)

For environments outside Claude Code:

•REST API: Each agent exposes a REST API for programmatic access. See the Developer Guide for endpoint documentation.
•Kubernetes: Deploy agents as containerized services in your cluster. Helm charts are planned for standard deployments.

First Steps

1. Connect Your First Agent

We recommend starting with Incident Debugging. It operates in read-only mode by default — it observes your infrastructure, diagnoses issues, and reports findings without making changes. Low risk, high signal.

claude mcp add data-workers-incident -- npx data-workers-incident

2. Verify the Connection

After adding the agent, verify it is connected and can reach your data infrastructure. Type the following in the Claude Code chat (this is a natural language prompt, not a shell command):

List available data-workers-incident tools

You should see tools for log analysis, query diagnostics, pipeline health checks, and root cause analysis.

3. Run a Sample Diagnostic

Ask the agent to analyze a recent issue by typing in the Claude Code chat:

Check the health of my data pipelines and report any recent failures

The agent will connect to your orchestrator (Airflow, Dagster, or Prefect), pull recent run history, and provide a diagnostic summary.

Troubleshooting

"Agent not found" when running `claude mcp add` — Ensure you are running the latest version of Claude Code. Older versions may not support the Data Workers MCP servers. Update Claude Code and try again.

"Connection refused" when agent tries to reach your data infrastructure — Verify that your credentials are correct and that your network allows outbound connections to the relevant services. Check environment variables, firewall rules, and VPN connectivity.

No tools discovered after adding an agent — This usually means the MCP server is not running or not reachable. Restart Claude Code, re-add the agent with claude mcp add, and verify the MCP server process is active. Run claude mcp list to check connection status.
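
When working through the items above, it can help to rule out missing credentials first. The sketch below prints any listed environment variable that is unset or empty, then shows the registered MCP servers if the claude CLI is available. The `missing_vars` helper is a hypothetical name for illustration, and the variable list uses the Snowflake examples from this guide, so substitute the ones your warehouse needs.

```shell
#!/bin/sh
# Print the name of every listed variable that is unset or empty.
missing_vars() {
  for var in "$@"; do
    eval "value=\${$var:-}"
    [ -n "$value" ] || echo "$var"
  done
}

# Variable names taken from the Snowflake example in this guide.
missing_vars SNOWFLAKE_ACCOUNT SNOWFLAKE_USER SNOWFLAKE_PASSWORD

# Show registered MCP servers and their status, if the CLI is installed.
if command -v claude >/dev/null 2>&1; then
  claude mcp list || echo "claude mcp list failed"
fi
```

No output from the first step means every listed variable has a value. It does not confirm that the values are correct.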

Architecture Overview