Engineering · 8 min read

What Joe Reis's Data Engineering Lifecycle Taught Our Connectors Agent

How a deceptively simple framework — lifecycle stages plus undercurrents — became the engineering backbone of how our connectors agent handles every source system interaction.

By The Data Workers Team

Most data engineering conversations start with tools. Which orchestrator? Which ingestion framework? Which catalog? Joe Reis has spent the better part of a decade arguing that this is the wrong starting point. The question is not which tool — it is which stage of the lifecycle you are in, and which undercurrents apply to it.

Reis is the co-author of Fundamentals of Data Engineering (O'Reilly, 2022, with Matt Housley), a book that has become required reading on many data engineering teams. He also writes at joereis.substack.com and joereis.spicytakes.org, where he applies the same first-principles discipline to the industry's recurring hype cycles.

The Framework: Lifecycle and Undercurrents

Reis's central framework has two parts. First, the data engineering lifecycle: source systems generate data, which is ingested, stored, transformed, and served. These five stages are, as he puts it, 'the orbit everything falls back into.' Second, the undercurrents — security, data management, DataOps, data architecture, orchestration, and software engineering — apply across every stage.
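
One way to picture the two-part framework is as a grid: every lifecycle stage crossed with every undercurrent. The sketch below is illustrative only. The names `Stage`, `Undercurrent`, and `coverage_matrix` are our own assumptions for this post, not terminology or code from the book.

```python
from enum import Enum


class Stage(Enum):
    """The five lifecycle stages, in the order listed above."""
    GENERATION = "generation"
    INGESTION = "ingestion"
    STORAGE = "storage"
    TRANSFORMATION = "transformation"
    SERVING = "serving"


class Undercurrent(Enum):
    """The six undercurrents that apply across every stage."""
    SECURITY = "security"
    DATA_MANAGEMENT = "data management"
    DATAOPS = "DataOps"
    DATA_ARCHITECTURE = "data architecture"
    ORCHESTRATION = "orchestration"
    SOFTWARE_ENGINEERING = "software engineering"


def coverage_matrix() -> dict[tuple[Stage, Undercurrent], bool]:
    """Hypothetical helper: one unchecked cell per (stage, undercurrent) pair."""
    return {(stage, current): False for stage in Stage for current in Undercurrent}


matrix = coverage_matrix()
print(len(matrix))  # 5 stages x 6 undercurrents = 30 cells
```

Treating the framework as a grid makes the "apply across every stage" claim concrete: a stage is only covered when all six cells in its row have been looked at.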

His clearest articulation of why this matters comes from a conversation with his mentor Bill Inmon: 'Fundamentals are gravity... And in the end, gravity wins.' The analogy is precise: you can try to escape the fundamentals with new tools and hype cycles, but the lifecycle and its undercurrents do not care about your stack preferences.

What Is Actually Worth Learning

  • Know the lifecycle stage before touching a tool. Is this a Generation problem or an Ingestion problem? They have different failure modes.
  • Apply the undercurrents at each stage, not at the end. Security, lineage, and orchestration checks deferred until after ingestion compound into governance debt that rarely gets paid back.
  • Choose the ingestion pattern from source characteristics, not from defaults. A high-change OLTP table calls for CDC; a slow-moving reference table calls for snapshot.
  • A connector that moved bytes but skipped the undercurrents has completed the plumbing task, not the engineering task.
  • The lifecycle itself is Lindy: 'The title and work of a data engineer might change, but the data engineering lifecycle will take far longer to evolve.'
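
The third point above, choosing the ingestion pattern from source characteristics, can be sketched as a small decision function. This is a minimal illustration under stated assumptions: `SourceProfile`, `daily_change_ratio`, and the 5% threshold are hypothetical names and values chosen for the example, not part of Reis's framework or of any real connector.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceProfile:
    """Hypothetical description of a source table."""
    name: str
    is_oltp: bool
    daily_change_ratio: float  # assumed metric: fraction of rows changed per day


# Assumed cutoff for "high-change"; a real value would come from measuring the source.
HIGH_CHANGE_THRESHOLD = 0.05


def choose_pattern(source: SourceProfile) -> str:
    """Pick an ingestion pattern from the source's characteristics."""
    high_change = source.daily_change_ratio >= HIGH_CHANGE_THRESHOLD
    if high_change and source.is_oltp:
        return "cdc"  # high-change OLTP table: capture changes as they happen
    if not high_change:
        return "snapshot"  # slow-moving table: a periodic full copy is enough
    # The text does not cover this case, so flag it rather than fall back to a default.
    return "needs_review"


orders = SourceProfile("orders", is_oltp=True, daily_change_ratio=0.40)
countries = SourceProfile("country_codes", is_oltp=False, daily_change_ratio=0.0)
print(choose_pattern(orders), choose_pattern(countries))
```

Returning `needs_review` for the uncovered case is a deliberate choice in this sketch: it keeps the function from quietly picking a default, which is the habit the bullet warns against.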

How a Method Becomes a Skill

The lifecycle-undercurrents-ingestion skill runs in six steps. Before any pull begins, the agent names the lifecycle stage and checks the connector's health. It then inspects the source schema from the live system. Only then does it apply the undercurrents: credential scope for security, catalog registration check for data management, SLA window check for DataOps, and upstream DAG status for orchestration. After ingestion, it emits a lineage event before marking the task complete. Finally, it returns a lifecycle coverage report.
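
The flow described above can be sketched roughly as follows. This is one reading of the six steps, not the production skill: every function name and parameter here is a hypothetical stand-in, the checks are passed in as plain callables, and stopping early on a failed check is our assumption for the example.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class CoverageReport:
    """Hypothetical shape of the lifecycle coverage report."""
    stage: str
    checks: dict[str, bool] = field(default_factory=dict)
    lineage_emitted: bool = False
    completed: bool = False


def run_ingestion_skill(
    stage: str,
    connector_healthy: Callable[[], bool],
    inspect_schema: Callable[[], dict[str, str]],
    undercurrent_checks: dict[str, Callable[[], bool]],
    ingest: Callable[[dict[str, str]], int],
    emit_lineage: Callable[[str, int], None],
) -> CoverageReport:
    report = CoverageReport(stage=stage)  # 1. name the lifecycle stage
    if not connector_healthy():  # 2. check connector health
        return report
    schema = inspect_schema()  # 3. inspect the schema on the live source
    for name, check in undercurrent_checks.items():  # 4. apply the undercurrents
        report.checks[name] = check()
    if not all(report.checks.values()):
        return report  # assumption: a failed check blocks the pull
    rows = ingest(schema)
    emit_lineage(stage, rows)  # 5. emit lineage before marking complete
    report.lineage_emitted = True
    report.completed = True
    return report  # 6. return the coverage report


events: list[tuple[str, int]] = []
passing = {
    "security": lambda: True,  # credential scope
    "data_management": lambda: True,  # catalog registration
    "dataops": lambda: True,  # SLA window
    "orchestration": lambda: True,  # upstream DAG status
}
report = run_ingestion_skill(
    stage="ingestion",
    connector_healthy=lambda: True,
    inspect_schema=lambda: {"id": "int", "updated_at": "timestamp"},
    undercurrent_checks=passing,
    ingest=lambda schema: 0,  # stub: pretend zero rows were pulled
    emit_lineage=lambda stage, rows: events.append((stage, rows)),
)
print(report.completed, report.lineage_emitted)
```

The ordering is the point of the sketch: the lineage event is emitted before `completed` is set, so a task can never be marked done without lineage.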

One of More Than 400

Data Workers runs more than 400 method-named skills across 19 specialized agents. The lifecycle-undercurrents-ingestion skill joins skills derived from work on dimensional modeling, streaming architecture, and blameless postmortems, all part of a growing library of engineering methods that have proven durable across hype cycles.

A note on this post: This is independent commentary and homage. It distills publicly available writing and talks by Joe Reis to illustrate a working method, and every quote is drawn from and verified against the primary sources linked above. The skill it describes is named for the method, not the person, and contains no marketing claims attributed to them. Data Workers is not affiliated with, sponsored by, or endorsed by Joe Reis. If you are Joe Reis and would like anything adjusted or removed, email hello@dataworkers.io and we will respond promptly.
