Hamza Tahir, co-founder of ZenML, joins the show to cut through the hype around long-running agents — arguing that at the end of the day, an agent is just a while loop that talks to a model, calls a tool, and writes to a file system. He covers the architecture of agent harnesses (inner and outer), what durable execution actually guarantees (and what it doesn't), and why the ML pipeline paradigm is a cleaner mental model than transactions for most agent workloads.
Hamza also announces Kitaru — ZenML's new open-source execution runtime for async Python agents — built on five years of running ML workloads in enterprise environments.
What we get into:
Agents are while loops: The surprising simplicity under all the tooling: a brain (LLM), hands (tool calls), and a file system, stacked recursively
Inner harness vs outer harness: Why Pydantic AI owns the inner loop while production deployment needs a separate runtime layer
What "long-running" actually means: Why the infrastructure we need to build is about extrapolating the future, not defining a time window today
Durable execution demystified: What checkpointing actually guarantees (infra failures, pod death, network drops) vs. what it never will (external state, bad LLM outputs, Snowflake rollbacks)
ML pipelines vs transactions: Why bursty containers in Kubernetes map more naturally to agent workloads than microsecond-latency queue workers — and why Hamza argues against the complexity tax
Anthropic opening the harness: Why letting other models run Claude Cowork is a "boss move," and what it means for the one-harness vs one-model debate
Human-in-the-loop, done right: The pod-kill-and-resume pattern, and why warm pools matter less when your agent runs for days
Kitaru: ZenML's new open source durable execution runtime: zero-config local, Kubernetes/SageMaker/Vertex in production, built on Pydantic AI integration
Arguing with Claude about Temporal: Hamza's story of spending hours getting an LLM to admit ZenML and Temporal solves the same problem
If you're architecting agents for production, picking between Pydantic AI, LangGraph, and Temporal, or just want to understand what "durable execution" actually means — this is the episode.
// LINKS & RESOURCES
Kitaru on GitHub: https://github.com/zenml-io/kitaru
Kitaru launch blog post: https://www.zenml.io/blog/kitaru-launch
Kitaru on Hacker News: https://news.ycombinator.com/item?id=47520115
Hamza Tahir on LinkedIn: https://www.linkedin.com/in/hamzatahirofficial/
ZenML: https://www.zenml.io/
Timestamps
[00:00] While Loop Checkpointing
[00:24] Long-Running Agents Explained
[01:28] Agent Harness Model Definitions
[06:30] Durability and State Recovery
[11:03] Agent Systems Layers
[18:45] Durability in Agent Systems
[22:07] ML Pipeline vs Transactions
[29:23] Durability vs Guarantees
[33:13] Durability vs Chaos Engineering
[39:50] Kitaru Naming and Purpose
[40:38] Wrap up
#AIAgents #DurableExecution #OpenSource