Cursor Setup

This is the complete onboarding guide for Cursor. For a quick reference across all platforms, see the MCP Client Setup page.

Prerequisites

  • Cursor — Latest version. Cursor has well-supported MCP integration.
  • Node.js 20+ — Required to run MCP servers. Check with node --version.
  • npm — Bundled with Node.js. Used by npx to fetch and run the Data Workers server.
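The prerequisite checks above can be scripted. A minimal sketch (the fallback value is only there so the check degrades gracefully when node is not on your PATH):

```shell
# Quick preflight check; falls back to v0.0.0 if node is not installed.
v=$(node --version 2>/dev/null || echo "v0.0.0")
major=${v#v}; major=${major%%.*}
if [ "$major" -ge 20 ]; then
  echo "Node.js $v: OK"
else
  echo "Node.js $v: install or upgrade to Node.js 20+" >&2
fi
```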

Option 1: Configure via .cursor/mcp.json

Create or edit .cursor/mcp.json in your project root:

{
  "mcpServers": {
    "data-workers": {
      "command": "npx",
      "args": ["dw-claw"]
    }
  }
}

This is the project-scoped configuration. It applies only to this project and can be committed to git so your entire team gets the same setup automatically.
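If you prefer the command line, the file can be created in one step. This is a convenience sketch; the heredoc body is the same JSON shown above:

```shell
# Create the project-scoped MCP config from the project root.
mkdir -p .cursor
cat > .cursor/mcp.json <<'EOF'
{
  "mcpServers": {
    "data-workers": {
      "command": "npx",
      "args": ["dw-claw"]
    }
  }
}
EOF
```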

Option 2: Configure via Cursor Settings UI

Step 1: Open Cursor and go to Settings (gear icon or Cmd+, on Mac / Ctrl+, on Windows).

Step 2: Navigate to MCP Servers in the left sidebar.

Step 3: Click Add Server.

Step 4: Enter the following:

  • Name: data-workers
  • Command: npx
  • Args: dw-claw

Step 5: Save and restart Cursor.

This is the global configuration — Data Workers will be available across all projects opened in Cursor.

Project-Scoped vs Global Configuration

Project-scoped (`.cursor/mcp.json`): Best for teams. Commit the file to git so everyone on the team gets Data Workers configured automatically when they clone the repo. Each project can have different agent configurations.

Global (Cursor Settings): Best for individual use. Applies to every project you open in Cursor without needing a config file in each repo.

If both are configured, the project-scoped configuration takes precedence.

Restart Cursor

Important: After adding or modifying the MCP configuration, you must restart Cursor for the changes to take effect. Cursor loads MCP server configurations at startup.

Verify Connection

After restarting Cursor, open the chat panel and type:

List all available Data Workers tools and show me the health of my data pipelines.

You should see tools organized by agent domain — incident, quality, schema, pipeline, catalog, governance, streaming, cost, migration, insights, usage intelligence, connectors, observability, and orchestration.

If tools do not appear, check the MCP Servers panel in Cursor Settings to confirm the server shows a connected status.

InMemory Stubs on First Run

When you first run without infrastructure credentials, the server starts with InMemory stub data. This is expected, not an error. You will see responses referencing sample datasets like analytics.orders, staging.customers, and raw.events. This confirms the server is working.

To connect to real infrastructure, set environment variables before launching Cursor. The server auto-detects real credentials and switches from stubs to live connections.

First 5 Workflows to Try

Once connected, try these prompts in the Cursor chat:

  • Incident diagnosis: "Why did my nightly ETL pipeline fail? Check logs and provide root cause analysis."
  • Data quality check: "Run a quality assessment on analytics.orders and flag anomalies."
  • Catalog search: "Search the catalog for tables related to customer revenue and show lineage."
  • Schema analysis: "Analyze the schema of staging.events and suggest performance improvements."
  • Pipeline health: "Show pipeline health across all orchestrators and highlight issues."

Individual Agent Configuration

By default, npx dw-claw starts all 15 agents. To run specific agents only, add separate entries to .cursor/mcp.json:

{
  "mcpServers": {
    "dw-incident": {
      "command": "npx",
      "args": ["dw-claw", "--agent", "incident"]
    },
    "dw-quality": {
      "command": "npx", 
      "args": ["dw-claw", "--agent", "quality"]
    }
  }
}

Available agent names: incident, quality, schema, pipeline, context, governance, streaming, cost, migration, insights, usage-intelligence, connectors, observability, swarm.

Environment Variables

Set these in your shell profile or .env file before launching Cursor:

  • SNOWFLAKE_ACCOUNT, SNOWFLAKE_USER, SNOWFLAKE_PASSWORD — Snowflake
  • GOOGLE_CLOUD_PROJECT, GOOGLE_APPLICATION_CREDENTIALS — BigQuery / Dataplex
  • DATABRICKS_HOST, DATABRICKS_TOKEN — Databricks
  • DBT_API_TOKEN, DBT_ACCOUNT_ID — dbt Cloud
  • AIRFLOW_HOST, AIRFLOW_USER, AIRFLOW_PASSWORD — Apache Airflow
  • DAGSTER_HOST, DAGSTER_TOKEN — Dagster Cloud
  • PREFECT_API_KEY, PREFECT_API_URL — Prefect Cloud

The server auto-detects which credentials are present and activates real adapters. Services without credentials fall back to InMemory stubs.
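For example, exporting Snowflake and Databricks variables in the shell you launch Cursor from would activate those two adapters while everything else stays stubbed. All values below are placeholders, not real credentials:

```shell
# Placeholder values shown; substitute your real credentials (never commit them).
export SNOWFLAKE_ACCOUNT="my-account"
export SNOWFLAKE_USER="my-user"
export SNOWFLAKE_PASSWORD="change-me"
export DATABRICKS_HOST="https://example.cloud.databricks.com"
export DATABRICKS_TOKEN="change-me"
```

Launch Cursor from this same shell (for example with `cursor .`) so the process inherits the variables; services whose variables are unset remain on InMemory stubs.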

Team Setup

To share Data Workers across your team:

  • Add .cursor/mcp.json to your project root with the configuration above.
  • Commit it to git so team members get the setup automatically on clone.
  • Store credentials in environment variables or a .env file — never commit credentials to git.
  • Add .env to your .gitignore.
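The gitignore step above can be done safely from the shell. A small idempotent sketch, run from the project root:

```shell
# Ensure .env is ignored; only appends if the entry is not already present.
touch .gitignore
grep -qx '.env' .gitignore || echo '.env' >> .gitignore
```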

Each team member can also add personal agents via Cursor Settings (global) without affecting the shared project config.

Troubleshooting

Tools not appearing after configuration — Restart Cursor. MCP servers are loaded at startup, so any configuration change requires a full restart.

Config key is `mcpServers` not `mcp.servers` — Cursor uses the mcpServers key in .cursor/mcp.json. This is different from GitHub Copilot which uses mcp.servers. Using the wrong key results in silent misconfiguration.
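A quick way to catch this from the project root is to grep for the expected key. A minimal sketch:

```shell
# Prints a warning if the file is missing or does not use the mcpServers key.
if grep -q '"mcpServers"' .cursor/mcp.json 2>/dev/null; then
  echo "config key OK"
else
  echo "check .cursor/mcp.json: expected a top-level \"mcpServers\" key" >&2
fi
```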

npx cache issues — If the server does not start or runs an outdated version, run npx dw-claw@latest to bypass the cached copy and fetch the latest published release.

Server crashes or hangs — Run npx dw-claw manually in a terminal to see the full error output. Common causes: Node.js version too old (need 20+), network issues preventing package download.

Multiple MCP servers — Cursor handles multiple MCP servers independently. Data Workers runs alongside other MCP servers without conflict.

Environment variables not picked up — Cursor may not inherit shell environment variables depending on how it was launched. Try launching Cursor from the terminal (cursor .) so it inherits your shell environment, or set variables in the Cursor MCP server configuration directly.
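Setting variables directly in the server entry looks like the following sketch (the env keys match the Environment Variables list above; values are placeholders):

```json
{
  "mcpServers": {
    "data-workers": {
      "command": "npx",
      "args": ["dw-claw"],
      "env": {
        "SNOWFLAKE_ACCOUNT": "my-account",
        "SNOWFLAKE_USER": "my-user",
        "SNOWFLAKE_PASSWORD": "change-me"
      }
    }
  }
}
```

Variables set here apply only to this server process, which also keeps credentials scoped more tightly than a shell-wide export.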

Still stuck? Join our Discord at discord.com/invite/b8DR5J53 or open an issue on GitHub.