What Joe Reis's Data Engineering Lifecycle Taught Our Connectors Agent
How a deceptively simple framework — lifecycle stages plus undercurrents — became the engineering backbone of how our connectors agent handles every source system interaction.
By The Data Workers Team
Most data engineering conversations start with tools. Which orchestrator? Which ingestion framework? Which catalog? Joe Reis has spent the better part of a decade arguing that this is the wrong starting point. The question is not which tool — it is which stage of the lifecycle you are in, and which undercurrents apply to it.
Reis is the co-author of Fundamentals of Data Engineering (O'Reilly, 2022, with Matt Housley), a book that has become required reading on many data engineering teams. He also writes at joereis.substack.com and joereis.spicytakes.org, where he applies the same first-principles discipline to the industry's recurring hype cycles.
The Framework: Lifecycle and Undercurrents
Reis's central framework has two parts. First, the data engineering lifecycle, which has five stages: generation (source systems produce data), storage, ingestion, transformation, and serving. These five stages are, as he puts it, 'the orbit everything falls back into.' Second, the undercurrents (security, data management, DataOps, data architecture, orchestration, and software engineering) apply across every stage.
His clearest articulation of why this matters comes from a conversation with his mentor Bill Inmon: 'Fundamentals are gravity... And in the end, gravity wins.' The analogy is precise: you can try to escape the fundamentals with new tools and hype cycles, but the lifecycle and its undercurrents do not care about your stack preferences.
What Is Actually Worth Learning
- Know the lifecycle stage before touching a tool. Is this a Generation problem or an Ingestion problem? They have different failure modes.
- Apply the undercurrents at each stage, not at the end. Security, lineage, and orchestration checks deferred until after ingestion compound into governance debt that rarely gets paid back.
- Choose the ingestion pattern from source characteristics, not from defaults. A high-change OLTP table calls for CDC; a slow-moving reference table calls for snapshot.
- A connector that moved bytes but skipped the undercurrents has completed the plumbing task, not the engineering task.
- The lifecycle itself is Lindy: 'The title and work of a data engineer might change, but the data engineering lifecycle will take far longer to evolve.'
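The third point, choosing the ingestion pattern from source characteristics, can be pictured as a small decision function. The sketch below is illustrative only: `SourceProfile`, `choose_ingestion_pattern`, and the 5% change threshold are hypothetical names and values invented for this example, not part of any Data Workers API or of Reis's writing.

```python
from dataclasses import dataclass
from enum import Enum


class IngestionPattern(Enum):
    CDC = "cdc"
    SNAPSHOT = "snapshot"


@dataclass(frozen=True)
class SourceProfile:
    """Hypothetical description of a source table; fields are illustrative."""
    name: str
    is_oltp: bool
    daily_change_ratio: float  # fraction of rows changed per day, 0.0 to 1.0


def choose_ingestion_pattern(
    profile: SourceProfile, cdc_threshold: float = 0.05
) -> IngestionPattern:
    """Pick a pattern from source characteristics rather than a default.

    The 5% threshold is an arbitrary placeholder, not a recommendation.
    """
    if profile.is_oltp and profile.daily_change_ratio >= cdc_threshold:
        return IngestionPattern.CDC
    return IngestionPattern.SNAPSHOT


orders = SourceProfile("orders", is_oltp=True, daily_change_ratio=0.30)
countries = SourceProfile("country_codes", is_oltp=False, daily_change_ratio=0.001)

print(choose_ingestion_pattern(orders).value)     # cdc
print(choose_ingestion_pattern(countries).value)  # snapshot
```

A real connector would likely weigh more signals, such as table size, availability of a change log, and load on the source, but the shape is the same: the decision takes the source profile as input instead of falling back to a fixed default.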
How a Method Becomes a Skill
The lifecycle-undercurrents-ingestion skill runs in six steps. Before any pull begins, the agent names the lifecycle stage and checks the connector's health. It then inspects the source schema from the live system. Only then does it apply the undercurrents: credential scope for security, catalog registration check for data management, SLA window check for DataOps, and upstream DAG status for orchestration. After ingestion, it emits a lineage event before marking the task complete. Finally, it returns a lifecycle coverage report.
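The ordering described above can be sketched as a single function that refuses to ingest until the health, schema, and undercurrent checks have run, and that marks the task complete only after the lineage event is emitted. This is a minimal sketch under stated assumptions: `CoverageReport`, `run_ingestion`, and every callable passed in are hypothetical stand-ins, not the skill's actual implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class CoverageReport:
    """Hypothetical lifecycle coverage report; field names are illustrative."""
    stage: str
    connector_healthy: bool = False
    schema: Dict[str, str] = field(default_factory=dict)
    undercurrents: Dict[str, bool] = field(default_factory=dict)
    rows_ingested: int = 0
    lineage_emitted: bool = False
    complete: bool = False


def run_ingestion(
    stage: str,
    check_health: Callable[[], bool],
    inspect_schema: Callable[[], Dict[str, str]],
    undercurrent_checks: Dict[str, Callable[[], bool]],
    ingest: Callable[[], int],
    emit_lineage: Callable[[Dict[str, str], int], None],
) -> CoverageReport:
    # Name the lifecycle stage, then check connector health before any pull.
    report = CoverageReport(stage=stage)
    report.connector_healthy = check_health()
    if not report.connector_healthy:
        return report

    # Inspect the schema from the live source, not from a cached copy.
    report.schema = inspect_schema()

    # Apply the undercurrent checks; any failure blocks ingestion.
    report.undercurrents = {
        name: check() for name, check in undercurrent_checks.items()
    }
    if not all(report.undercurrents.values()):
        return report

    # Ingest, then emit lineage before marking the task complete.
    report.rows_ingested = ingest()
    emit_lineage(report.schema, report.rows_ingested)
    report.lineage_emitted = True
    report.complete = True
    return report


events = []
report = run_ingestion(
    stage="ingestion",
    check_health=lambda: True,
    inspect_schema=lambda: {"id": "int", "email": "text"},
    undercurrent_checks={
        "security": lambda: True,         # credential scope
        "data_management": lambda: True,  # catalog registration
        "dataops": lambda: True,          # SLA window
        "orchestration": lambda: True,    # upstream DAG status
    },
    ingest=lambda: 3,
    emit_lineage=lambda schema, rows: events.append((sorted(schema), rows)),
)
print(report.complete, report.rows_ingested, len(events))  # True 3 1
```

Returning the partially filled report on an early exit, rather than raising, means a blocked run still tells the caller which check stopped it.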
One of More Than 400
Data Workers runs more than 400 method-named skills across 19 specialized agents. The lifecycle-undercurrents-ingestion skill joins skills derived from work on dimensional modeling, streaming architecture, blameless postmortems, and a growing library of engineering methods that have proven durable across hype cycles.
A note on this post: This is independent commentary and homage. It distills publicly available writing and talks by Joe Reis to illustrate a working method, and every quote is drawn from and verified against the primary sources linked above. The skill it describes is named for the method, not the person, and contains no marketing claims attributed to them. Data Workers is not affiliated with, sponsored by, or endorsed by Joe Reis. If you are Joe Reis and would like anything adjusted or removed, email hello@dataworkers.io and we will respond promptly.