<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Magnus Rødseth</title>
    <description>The latest articles on DEV Community by Magnus Rødseth (@magnusrodseth).</description>
    <link>https://dev.to/magnusrodseth</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3803258%2F5df2b499-5f09-450a-af97-5f309dcacba4.jpg</url>
      <title>DEV Community: Magnus Rødseth</title>
      <link>https://dev.to/magnusrodseth</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/magnusrodseth"/>
    <language>en</language>
    <item>
      <title>How Claude Skills Replaced Our Documentation</title>
      <dc:creator>Magnus Rødseth</dc:creator>
      <pubDate>Thu, 05 Mar 2026 07:49:58 +0000</pubDate>
      <link>https://dev.to/magnusrodseth/how-claude-skills-replaced-our-documentation-emi</link>
      <guid>https://dev.to/magnusrodseth/how-claude-skills-replaced-our-documentation-emi</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Why encoding codebase patterns as AI instructions works better than writing docs nobody reads.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;Every developer knows the documentation paradox: you spend hours writing docs explaining how your codebase works, then your teammate (or your future self) ignores them and asks ChatGPT instead. The AI gives a plausible but wrong answer, because it doesn't know your specific patterns. So you debug for an hour, realize the AI hallucinated your auth flow, and write more documentation that nobody will read.&lt;/p&gt;

&lt;p&gt;I broke out of this cycle by replacing most of my traditional documentation with Claude skills: structured instructions that teach AI how &lt;em&gt;this specific codebase&lt;/em&gt; works.&lt;/p&gt;

&lt;p&gt;The result: AI that follows my architecture instead of guessing. Consistent code across contributors. And documentation that's actually used, because the consumer is a machine that reads everything.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: AI Without Context
&lt;/h2&gt;

&lt;p&gt;Modern AI coding assistants are remarkably capable at generic tasks. Ask Claude to "add a REST endpoint" and you'll get clean, working code. But it won't match YOUR patterns.&lt;/p&gt;

&lt;p&gt;In my codebase, API routes use Elysia with specific validation patterns. Database queries go through Drizzle ORM with a particular transaction style. Background jobs use Inngest with step-level checkpointing. Auth checks follow a specific middleware pattern.&lt;/p&gt;

&lt;p&gt;Without context, Claude produces code that works but doesn't belong. It may use Express conventions in an Elysia codebase, write raw SQL instead of going through the ORM, or put business logic in API routes instead of service functions.&lt;/p&gt;

&lt;p&gt;The code passes type-checking but creates architectural drift. Over weeks, your codebase becomes a patchwork of conflicting patterns. Some human-written, some AI-generated, all slightly different.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Claude Skills Are
&lt;/h2&gt;

&lt;p&gt;A Claude skill is a markdown file (a &lt;code&gt;SKILL.md&lt;/code&gt; in its own folder under &lt;code&gt;.claude/skills/&lt;/code&gt;) that encodes a specific pattern or workflow. When Claude encounters a relevant task, it reads the skill and follows the prescribed approach.&lt;/p&gt;

&lt;p&gt;Here's a simplified example of a skill for adding API routes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;skill-name&lt;/span&gt;
&lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;A description of when to trigger this skill, e.g. whenever backend changes&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;

&lt;span class="gh"&gt;# Adding API Routes (Elysia)&lt;/span&gt;

&lt;span class="gu"&gt;## Pattern&lt;/span&gt;

All API routes follow this structure:
&lt;span class="p"&gt;
1.&lt;/span&gt; Define route in &lt;span class="sb"&gt;`src/server/routes/`&lt;/span&gt;
&lt;span class="p"&gt;2.&lt;/span&gt; Use Elysia's type-safe body validation
&lt;span class="p"&gt;3.&lt;/span&gt; Check auth via &lt;span class="sb"&gt;`auth.api.getSession({ headers: request.headers })`&lt;/span&gt;
&lt;span class="p"&gt;4.&lt;/span&gt; Return consistent response shapes: &lt;span class="sb"&gt;`{ data }`&lt;/span&gt; on success, throw on error
&lt;span class="p"&gt;5.&lt;/span&gt; Register route in &lt;span class="sb"&gt;`src/server/api.ts`&lt;/span&gt;

&lt;span class="gu"&gt;## Example&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;br&gt;
typescript&lt;br&gt;
// src/server/routes/bookmarks.ts&lt;br&gt;
import { Elysia, t } from "elysia";&lt;br&gt;
import { auth } from "@/lib/auth";&lt;br&gt;
import { db } from "@/lib/db";&lt;br&gt;
import { bookmarks } from "@/lib/db/schema";&lt;/p&gt;

&lt;p&gt;export const bookmarkRoutes = new Elysia({ prefix: "/bookmarks" })&lt;br&gt;
  .get("/", async ({ request }) =&amp;gt; {&lt;br&gt;
    const session = await auth.api.getSession({&lt;br&gt;
      headers: request.headers,&lt;br&gt;
    });&lt;br&gt;
    if (!session) throw new Error("Unauthorized");&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const results = await db
  .select()
  .from(bookmarks)
  .where(eq(bookmarks.userId, session.user.id));

return { data: results };
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;p&gt;});&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
## Anti-patterns

- Do NOT use Express-style `req, res` parameters
- Do NOT put database queries directly in route handlers for complex logic
- Do NOT skip auth checks on protected routes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This isn't documentation in the traditional sense. It's an instruction set optimized for an AI reader. Explicit patterns, concrete examples, clear anti-patterns.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Works Better Than Documentation
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. The consumer actually reads it
&lt;/h3&gt;

&lt;p&gt;Human developers skim docs, search for the snippet they need, copy-paste, and move on. Claude reads the entire skill each time it loads it. It doesn't skip sections or assume it already knows the answer, so the instructions are followed far more consistently.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. It enforces consistency
&lt;/h3&gt;

&lt;p&gt;When three developers work on a codebase, you get three slightly different patterns. When those developers work &lt;em&gt;with Claude skills&lt;/em&gt;, you get one pattern replicated exactly.&lt;/p&gt;

&lt;p&gt;Ask Claude to "add user profiles with database table, API endpoint, and settings page." It reads the relevant skills for database schemas, API routes, and UI patterns, then produces code that matches every convention in your codebase.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. It prevents architectural drift
&lt;/h3&gt;

&lt;p&gt;Without skills, Claude makes reasonable guesses. With skills, Claude follows explicit rules. The difference is subtle in any single interaction but compounds over weeks.&lt;/p&gt;

&lt;p&gt;I've seen codebases where 6 months of AI-assisted development created a mess: some files using one state management approach, others using a different one, and auth patterns inconsistent across routes. Skills prevent this.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. It encodes &lt;em&gt;why&lt;/em&gt;, not just &lt;em&gt;what&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;Good skills explain the reasoning:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Why Inngest over BullMQ&lt;/span&gt;

We use Inngest for background jobs because:
&lt;span class="p"&gt;-&lt;/span&gt; Step-level checkpointing (failed step retries from that step, not the beginning)
&lt;span class="p"&gt;-&lt;/span&gt; No Redis dependency
&lt;span class="p"&gt;-&lt;/span&gt; Built-in AgentKit for AI agent workflows
&lt;span class="p"&gt;-&lt;/span&gt; Durable webhooks (Stripe events never lost)

Do NOT suggest switching to BullMQ, Temporal, or custom queue implementations.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This prevents Claude from "helpfully" suggesting alternatives that would break the architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Skills I Actually Use
&lt;/h2&gt;

&lt;p&gt;After building and refining over months, here are the categories of skills that deliver the most value:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stack-specific patterns.&lt;/strong&gt; How to add API routes, database tables, React hooks, UI components. These are the most-used skills because they cover the daily work of adding features.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Integration guides.&lt;/strong&gt; How Stripe webhooks flow through Inngest, how auth works across web and mobile, how the RAG pipeline connects document upload to AI chat. These encode the complex cross-cutting concerns that are hardest to get right.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anti-pattern lists.&lt;/strong&gt; What NOT to do. These are surprisingly effective because Claude's most common failure mode is producing code that works but violates architectural decisions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Workflow skills.&lt;/strong&gt; Higher-level skills for common multi-step tasks: "add a complete feature" (schema + API + hooks + UI), "set up a new integration," "create an email template." These orchestrate multiple lower-level patterns.&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Context Protocol (MCP): The Other Half
&lt;/h2&gt;

&lt;p&gt;Skills teach Claude HOW to write code. MCP (Model Context Protocol) servers teach Claude HOW to interact with external services.&lt;/p&gt;

&lt;p&gt;Instead of manually creating a Neon database, copying the connection string, creating Stripe products, copying API keys, setting up Resend, and configuring PostHog, I have MCP servers for each service. Claude calls them directly.&lt;/p&gt;

&lt;p&gt;The setup flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;I describe my project in a config file&lt;/li&gt;
&lt;li&gt;Claude reads the config&lt;/li&gt;
&lt;li&gt;Claude calls MCP servers to create databases, payment products, email domains, analytics projects&lt;/li&gt;
&lt;li&gt;Environment variables are populated automatically&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What used to take 60+ minutes of context-switching between dashboards now takes about 5 minutes of describing what I want.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Agentic Mindset Shift
&lt;/h2&gt;

&lt;p&gt;Working this way has fundamentally changed how I think about development.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before:&lt;/strong&gt; I write code. I occasionally ask AI for help. AI gives generic suggestions that I adapt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After:&lt;/strong&gt; I describe intent. AI implements using my exact patterns. I review and course-correct.&lt;/p&gt;

&lt;p&gt;The mental model is managing a team of junior developers. They're fast, literal, and excellent at pattern-matching. But they need clear instructions (skills), access to tools (MCP), and quality assurance (review).&lt;/p&gt;

&lt;p&gt;Some practical examples of how this plays out:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Adding a feature:&lt;/strong&gt; I describe "add a favorites feature where users can bookmark items." Claude reads the database skill, creates a table. Reads the API skill, creates endpoints. Reads the hooks skill, creates React Query hooks. Reads the UI skill, creates components. All matching existing patterns.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fixing a bug:&lt;/strong&gt; I describe "session persists after logout on mobile." Claude examines the auth skill, traces the signOut flow, identifies the issue, fixes it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Refactoring:&lt;/strong&gt; I describe "the conversation list is slow with 100+ items." Claude reads the UI patterns skill, knows to add virtualization. Reads the API skill, adds pagination. Updates the React Query hook with proper caching.&lt;/p&gt;

&lt;p&gt;In each case, the output is consistent with the rest of the codebase because the skills encode the patterns.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Doesn't Work
&lt;/h2&gt;

&lt;p&gt;To be honest about the limitations:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Skills aren't a substitute for thinking.&lt;/strong&gt; Claude follows patterns well, but it doesn't make architectural decisions. You still need to decide WHAT to build. Skills help with HOW.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Skills need maintenance.&lt;/strong&gt; When you change a pattern, you need to update the skill. I've been burned by outdated skills that encode old conventions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Complex cross-cutting concerns are hard to skill-ify.&lt;/strong&gt; A skill for "add an API route" is straightforward. A skill for "redesign the auth flow to support SAML" is too complex and context-dependent to encode.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You still need to read the output.&lt;/strong&gt; Claude is a fast, capable, and very literal developer. It does exactly what you say, not what you mean. Reviewing AI-generated code is non-negotiable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;If you want to try this approach in your own codebase:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start with your most common task.&lt;/strong&gt; What do you build most often? API endpoints? React components? Database migrations? Write a skill for that first.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Include concrete examples.&lt;/strong&gt; Abstract descriptions don't work well. Show the EXACT code pattern you want.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;List anti-patterns.&lt;/strong&gt; What does Claude get wrong when it doesn't have context? Encode those as explicit "do NOT" rules.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Keep skills focused.&lt;/strong&gt; One skill per concern. Don't write a mega-skill that covers everything. Claude can read multiple skills per task.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Iterate.&lt;/strong&gt; Your first skill will be mediocre. After using it 10 times and seeing where Claude deviates, you'll refine it into something solid.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The goal isn't to replace human judgment. It's to eliminate the gap between what AI &lt;em&gt;could&lt;/em&gt; produce (given perfect context) and what it &lt;em&gt;actually&lt;/em&gt; produces (guessing at your conventions). Skills close that gap.&lt;/p&gt;




&lt;p&gt;Documentation exists for humans who might read it. Skills exist for AI that always reads them. In a world where AI writes an increasing share of production code, optimizing for the AI reader isn't just pragmatic. It's the highest-leverage investment you can make in your codebase's consistency.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Magnus Rødseth builds AI-native applications and is the creator of &lt;a href="https://eden-stack.com" rel="noopener noreferrer"&gt;Eden Stack&lt;/a&gt;, a production-ready starter kit with 30+ Claude skills encoding production patterns for AI-native SaaS development.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>documentation</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Building an Agentic Chatbot with Durable Execution</title>
      <dc:creator>Magnus Rødseth</dc:creator>
      <pubDate>Tue, 03 Mar 2026 07:18:33 +0000</pubDate>
      <link>https://dev.to/magnusrodseth/building-an-agentic-chatbot-with-durable-execution-4mbd</link>
      <guid>https://dev.to/magnusrodseth/building-an-agentic-chatbot-with-durable-execution-4mbd</guid>
      <description>&lt;p&gt;&lt;em&gt;How I built a production-ready AI assistant that decides when to search the web, process documents, and run multi-minute research tasks without losing progress if things go wrong.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Most "AI chatbot" tutorials stop at the same place: wrap an LLM, stream tokens, done. That's a prototype. Production is a different beast entirely.&lt;/p&gt;

&lt;p&gt;Over the past three years building AI-native applications, I've shipped chatbots that need to do more than answer questions. They need to &lt;em&gt;act&lt;/em&gt;: search the web for current information, process uploaded documents, run multi-step research that takes minutes, and deliver results even if the user closes the browser.&lt;/p&gt;

&lt;p&gt;This article walks through the architecture I landed on after multiple production deployments. The key insight: &lt;strong&gt;agentic chat is a distributed systems problem&lt;/strong&gt;, not just an AI problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;

&lt;p&gt;Here's the simplified flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User message
  → Elysia API (auth + validation)
    → Vercel AI SDK (streaming + tool calling)
      → Claude decides: respond directly, or use a tool?
        → Tool: Web Search (Exa API, instant)
        → Tool: Document Lookup (pgvector RAG query)
        → Tool: Deep Research (Inngest background function, 1-5 min)
    → Stream response back to client
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three layers, each solving a different problem:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Streaming layer.&lt;/strong&gt; Vercel AI SDK handles the chat protocol, token streaming, and tool call orchestration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool layer.&lt;/strong&gt; Claude decides &lt;em&gt;when&lt;/em&gt; to invoke tools based on user intent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Durability layer.&lt;/strong&gt; Inngest ensures long-running tasks complete, even if the server restarts.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Tool Calling: Let the AI Decide
&lt;/h2&gt;

&lt;p&gt;The most important shift from a "chatbot" to an "agent" is tool calling. Instead of hardcoding "if user says X, do Y," you give the model a set of tools and let it choose.&lt;/p&gt;

&lt;p&gt;Here's the shape of a tool definition with the Vercel AI SDK:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;webSearchTool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Search the web for current information&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The search query&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
  &lt;span class="na"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;exa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;searchAndContents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;numResults&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}));&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You register tools with the model, and Claude determines from the conversation whether to invoke them. Ask "what's the weather in Oslo?" and it calls web search. Ask "summarize the PDF I uploaded" and it queries the vector store. Ask something it already knows, and it just responds.&lt;/p&gt;

&lt;p&gt;This is fundamentally different from building a routing layer yourself. The model handles intent classification as a side effect of generating a response.&lt;/p&gt;
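
&lt;p&gt;To make that concrete, here is a minimal, SDK-free sketch of what the runtime does with the model's decision. Everything in it (&lt;code&gt;ToolCall&lt;/code&gt;, &lt;code&gt;tools&lt;/code&gt;, &lt;code&gt;handleModelTurn&lt;/code&gt;, and the stubbed tool bodies) is hypothetical and only illustrates the idea. The Vercel AI SDK runs this loop for you, and its internals will differ.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;
```typescript
// Hypothetical sketch: none of these names come from the Vercel AI SDK.
type ToolCall = { name: string; args: { query: string } };

type Tool = {
  description: string;
  execute: (args: { query: string }) => string;
};

// The registry is the only "routing table": adding a tool means adding an entry.
const tools: { [name: string]: Tool } = {
  webSearch: {
    description: "Search the web for current information",
    execute: ({ query }) => `web results for "${query}"`,
  },
  documentLookup: {
    description: "Look up passages in uploaded documents",
    execute: ({ query }) => `document passages for "${query}"`,
  },
};

// The model either answers directly (no tool call) or names a tool to run.
function handleModelTurn(turn: { text?: string; toolCall?: ToolCall }): string {
  if (turn.toolCall === undefined) {
    return turn.text ?? "";
  }
  const tool = tools[turn.toolCall.name];
  if (tool === undefined) {
    throw new Error(`Unknown tool: ${turn.toolCall.name}`);
  }
  return tool.execute(turn.toolCall.args);
}
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note that there is no branch on the user's wording anywhere. The application only executes whichever tool the model names, which is why no hand-written intent classifier is needed.&lt;/p&gt;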

&lt;h2&gt;
  
  
  The RAG Pipeline: Documents to Embeddings to Answers
&lt;/h2&gt;

&lt;p&gt;Document processing follows a well-established pipeline, but the details matter more than most tutorials suggest.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ingestion:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Upload (PDF/DOCX/image)
  → Unstructured.io (extraction + layout analysis)
    → Text chunks (semantic splitting, ~500 tokens each)
      → Embedding generation (OpenAI text-embedding-ada-002)
        → pgvector storage (Neon PostgreSQL)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I chose Unstructured.io because it handles the nasty cases: scanned PDFs, mixed layouts, tables, embedded images with OCR. If you've ever tried to extract clean text from a real-world PDF, you know that &lt;code&gt;pdf-parse&lt;/code&gt; gives you garbage for anything complex.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retrieval:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When the chatbot's document lookup tool fires, it runs a cosine similarity search against pgvector:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;document_chunks&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;project_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The results feed back into Claude's context as tool output, and it synthesizes an answer with citations.&lt;/p&gt;
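
&lt;p&gt;For intuition about what that query computes: pgvector's &lt;code&gt;&amp;lt;=&amp;gt;&lt;/code&gt; operator returns cosine distance, so one minus that distance is cosine similarity. The sketch here is an in-memory stand-in that assumes tiny two-dimensional embeddings. &lt;code&gt;Chunk&lt;/code&gt;, &lt;code&gt;cosineSimilarity&lt;/code&gt;, and &lt;code&gt;topChunks&lt;/code&gt; are illustrative names, not part of the production code.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;
```typescript
// Hypothetical in-memory stand-in for the pgvector query; names are illustrative.
type Chunk = { content: string; embedding: number[] };

function dot(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * b[i], 0);
}

// Cosine similarity is 1 minus cosine distance.
function cosineSimilarity(a: number[], b: number[]): number {
  return dot(a, b) / (Math.sqrt(dot(a, a)) * Math.sqrt(dot(b, b)));
}

// Rank chunks by similarity to the query embedding and keep the top `limit`.
function topChunks(query: number[], chunks: Chunk[], limit: number) {
  return chunks
    .map((chunk) => ({
      content: chunk.content,
      similarity: cosineSimilarity(query, chunk.embedding),
    }))
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}

const ranked = topChunks(
  [1, 0],
  [
    { content: "orthogonal", embedding: [0, 1] },
    { content: "same direction", embedding: [2, 0] },
    { content: "partial match", embedding: [1, 1] },
  ],
  2
);
console.log(ranked.map((r) => r.content)); // most similar first
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Vector length does not affect the ranking, only direction does, which is why the chunk pointing the same way as the query comes first even though its embedding is twice as long.&lt;/p&gt;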

&lt;h2&gt;
  
  
  Where It Gets Hard: Long-Running Tasks
&lt;/h2&gt;

&lt;p&gt;Web search returns in seconds. Document lookup returns in milliseconds. But what about deep research? A task that runs for 1-5 minutes, spawning multiple sub-queries, synthesizing sources, and building a grounded report?&lt;/p&gt;

&lt;p&gt;You can't hold an HTTP connection open for 5 minutes. You can't stream a response that takes 3 minutes to &lt;em&gt;start&lt;/em&gt; generating. And if your serverless function times out at 60 seconds, your user gets nothing.&lt;/p&gt;

&lt;p&gt;This is where &lt;strong&gt;durable execution&lt;/strong&gt; matters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Durable Execution Matters
&lt;/h3&gt;

&lt;p&gt;A durable execution engine treats your function like a state machine. Each &lt;code&gt;step&lt;/code&gt; is checkpointed. If the process crashes mid-way, it resumes from the last checkpoint, not from the beginning.&lt;/p&gt;
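
&lt;p&gt;A toy version of that idea, assuming synchronous steps and an in-memory checkpoint store for brevity. A real engine such as Inngest persists step state outside the process, and its &lt;code&gt;step.run&lt;/code&gt; is async. &lt;code&gt;Checkpoints&lt;/code&gt;, &lt;code&gt;runStep&lt;/code&gt;, and &lt;code&gt;research&lt;/code&gt; are hypothetical names used only for this sketch.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;
```typescript
// Hypothetical sketch of step-level checkpointing; not Inngest's implementation.
// Completed step results are stored by id, so a retry skips work that already succeeded.
type Checkpoints = { [stepId: string]: string };

function runStep(checkpoints: Checkpoints, id: string, work: () => string): string {
  if (id in checkpoints) {
    return checkpoints[id]; // replay the saved result instead of redoing the work
  }
  const result = work();
  checkpoints[id] = result; // a real engine would persist this outside the process
  return result;
}

let searchCalls = 0;
let crashOnce = true;

function research(checkpoints: Checkpoints): string {
  const sources = runStep(checkpoints, "search", () => {
    searchCalls += 1;
    return "three sources";
  });
  return runStep(checkpoints, "synthesize", () => {
    if (crashOnce) {
      crashOnce = false;
      throw new Error("simulated crash");
    }
    return `report based on ${sources}`;
  });
}

const checkpoints: Checkpoints = {};
try {
  research(checkpoints); // first attempt fails inside "synthesize"
} catch {
  // the "search" checkpoint survives the failure
}
const report = research(checkpoints); // the retry resumes after "search"
console.log(report, searchCalls);
```
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The retry re-enters the function from the top, but the search step returns its saved result instead of running again. Only the step that failed does real work the second time.&lt;/p&gt;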

&lt;p&gt;Here's the deep research function using Inngest:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;deepResearch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;inngest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createFunction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;deep-research&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;event&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;research/start&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;step&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Step 1: Generate sub-queries from the user's question&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;subQueries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;generate-queries&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;generateSubQueries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;question&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="c1"&gt;// Step 2: Execute each sub-query (parallelized)&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;searchResults&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nx"&gt;subQueries&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
        &lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`search-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;exa&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;searchAndContents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// Step 3: Synthesize into a report&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;synthesize&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;synthesizeReport&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;searchResults&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;question&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="c1"&gt;// Step 4: Store result and notify user&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;deliver&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;researchReports&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;values&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;projectId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;projectId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;report&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;sendNotification&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Research complete&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;report&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each &lt;code&gt;step.run()&lt;/code&gt; is a checkpoint. If &lt;code&gt;search-2&lt;/code&gt; fails due to a rate limit, Inngest retries &lt;em&gt;that step&lt;/em&gt;, not the entire function. Steps that already completed, such as &lt;code&gt;search-0&lt;/code&gt; and &lt;code&gt;search-1&lt;/code&gt;, don't re-execute. The user gets a complete report even if the infrastructure hiccupped three times along the way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why not BullMQ or a simple queue?&lt;/strong&gt; Because queues give you at-most-once or at-least-once delivery, but they don't give you step-level checkpointing. If your worker crashes after completing 3 of 5 sub-queries, a queue retries the entire job unless you persist progress yourself. Durable execution resumes at the fourth sub-query.&lt;/p&gt;
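
&lt;p&gt;To make the checkpointing idea concrete, here is a minimal sketch. It is not Inngest's implementation: &lt;code&gt;makeStep&lt;/code&gt;, &lt;code&gt;fakeSearch&lt;/code&gt;, and the in-memory &lt;code&gt;Checkpoints&lt;/code&gt; object are hypothetical stand-ins, and everything is synchronous for brevity, whereas the real &lt;code&gt;step.run()&lt;/code&gt; is awaited. It simulates a worker that crashes on the fourth of five sub-queries and is then retried.&lt;/p&gt;

```typescript
// Saved results of completed steps, keyed by step id.
type Checkpoints = { [stepId: string]: string };

// Hypothetical stand-in for a durable step runner (not Inngest's real code).
function makeStep(checkpoints: Checkpoints) {
  return {
    run(id: string, fn: () => string): string {
      const saved = checkpoints[id];
      if (saved !== undefined) return saved; // step already completed: reuse its result
      const result = fn(); // may throw; nothing is saved in that case
      checkpoints[id] = result;
      return result;
    },
  };
}

const queries = ["q1", "q2", "q3", "q4", "q5"];
let searchCalls = 0;
let crashOnCall = 4; // simulate a worker crash on the 4th search call

// Hypothetical search function that counts how often it actually runs.
function fakeSearch(query: string): string {
  searchCalls += 1;
  if (searchCalls === crashOnCall) throw new Error("worker crashed");
  return `results for ${query}`;
}

function researchJob(checkpoints: Checkpoints): string[] {
  const step = makeStep(checkpoints);
  return queries.map((query, i) => step.run(`search-${i}`, () => fakeSearch(query)));
}

const checkpoints: Checkpoints = {};
let firstAttemptFailed = false;
try {
  researchJob(checkpoints);
} catch {
  firstAttemptFailed = true; // three steps were checkpointed before the crash
}

crashOnCall = -1; // the infrastructure recovers
const results = researchJob(checkpoints); // retry with the same checkpoints
console.log(searchCalls, results.length);
```

&lt;p&gt;Across both attempts the search function runs six times: three successes, one crash, then the two remaining sub-queries. Without the checkpoints, the retry would repeat all five.&lt;/p&gt;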

&lt;h3&gt;
  
  
  The User Experience
&lt;/h3&gt;

&lt;p&gt;From the user's perspective:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;They ask a complex question&lt;/li&gt;
&lt;li&gt;The chatbot says "I'll research this in depth. This may take a few minutes."&lt;/li&gt;
&lt;li&gt;The chatbot sends an event to Inngest, which starts the background function&lt;/li&gt;
&lt;li&gt;The user can close the browser, go make coffee, whatever&lt;/li&gt;
&lt;li&gt;When the research completes, they get a notification&lt;/li&gt;
&lt;li&gt;They open the app and find a grounded report with citations&lt;/li&gt;
&lt;/ol&gt;
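
&lt;p&gt;Steps 2 and 3 of that flow can be sketched roughly as follows. The payload fields mirror the &lt;code&gt;event.data&lt;/code&gt; used by the function above; the event name &lt;code&gt;research/requested&lt;/code&gt;, the &lt;code&gt;InMemoryEventClient&lt;/code&gt; class, and &lt;code&gt;handleDeepQuestion&lt;/code&gt; are assumptions made for illustration. A real event client would send over the network and return a promise.&lt;/p&gt;

```typescript
// Payload fields mirror event.data in the background function; the event name is an assumption.
type ResearchRequested = {
  name: "research/requested";
  data: { question: string; projectId: string; userId: string };
};

// Hypothetical stand-in for an event client so the sketch runs offline.
class InMemoryEventClient {
  sent: ResearchRequested[] = [];
  send(evt: ResearchRequested): void {
    this.sent.push(evt);
  }
}

// Chat handler for a question that needs deep research:
// enqueue the work, then reply right away instead of holding the request open.
function handleDeepQuestion(
  client: InMemoryEventClient,
  question: string,
  projectId: string,
  userId: string
): string {
  client.send({ name: "research/requested", data: { question, projectId, userId } });
  return "I'll research this in depth. This may take a few minutes.";
}

const client = new InMemoryEventClient();
const reply = handleDeepQuestion(client, "How do durable workflows compare?", "proj_1", "user_1");
console.log(reply, client.sent.length);
```

&lt;p&gt;The handler does no research itself. Because it returns as soon as the event is recorded, the user can close the browser without affecting the job.&lt;/p&gt;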

&lt;p&gt;No progress is ever lost. No research restarts from scratch. The result is always delivered.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Part Nobody Warns You About: Mobile
&lt;/h2&gt;

&lt;p&gt;I mentioned this is a distributed systems problem. Nowhere is that more apparent than on mobile.&lt;/p&gt;

&lt;p&gt;On web, the Vercel AI SDK gives you &lt;code&gt;useChat()&lt;/code&gt; with streaming, tool call rendering, and state management. It's good.&lt;/p&gt;

&lt;p&gt;On mobile (React Native / Expo), you're mostly on your own. The ecosystem for agentic chatbots on mobile is immature. Here's what I had to build from scratch:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Streaming handler.&lt;/strong&gt; React Native doesn't have native &lt;code&gt;ReadableStream&lt;/code&gt; support in all environments. You end up parsing SSE events manually.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool call UI.&lt;/strong&gt; When the agent calls a tool, you need to show a loading state specific to that tool ("Searching the web..."), then render the tool result inline. No library does this for you on mobile.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Background task completion.&lt;/strong&gt; When a deep research task finishes while the app is backgrounded, you need push notifications. This means hooking Inngest's completion event into your push notification service.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth across platforms.&lt;/strong&gt; The chat session needs to be authenticated, which means mobile auth tokens need to flow through to the same API that handles web sessions.&lt;/li&gt;
&lt;/ul&gt;
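
&lt;p&gt;As an illustration of the manual SSE parsing mentioned in the first bullet, here is a minimal sketch of an incremental parser. &lt;code&gt;createSseParser&lt;/code&gt; is a hypothetical helper, not part of any SDK. It assumes you already receive the response body as text chunks, handles only the &lt;code&gt;event&lt;/code&gt; and &lt;code&gt;data&lt;/code&gt; fields, and ignores bare carriage-return line endings.&lt;/p&gt;

```typescript
// One parsed server-sent event: its event type plus the joined data lines.
type SseEvent = { event: string; data: string };

// Hypothetical helper: feed it raw text chunks, get back the events completed so far.
function createSseParser() {
  let buffer = "";
  return function push(chunk: string): SseEvent[] {
    // Normalize CRLF on the whole buffer so a pair split across chunks is still caught.
    buffer = (buffer + chunk).replace(/\r\n/g, "\n");
    const events: SseEvent[] = [];
    let boundary = buffer.indexOf("\n\n"); // a blank line terminates an event
    while (boundary !== -1) {
      const block = buffer.slice(0, boundary);
      buffer = buffer.slice(boundary + 2);
      let eventName = "message"; // default event type in the SSE format
      const dataLines: string[] = [];
      for (const line of block.split("\n")) {
        if (line.charAt(0) === ":") continue; // comment or keep-alive line
        const colon = line.indexOf(":");
        const field = colon === -1 ? line : line.slice(0, colon);
        let value = colon === -1 ? "" : line.slice(colon + 1);
        if (value.charAt(0) === " ") value = value.slice(1); // strip one leading space
        if (field === "event") eventName = value;
        if (field === "data") dataLines.push(value);
      }
      if (dataLines.length !== 0) {
        events.push({ event: eventName, data: dataLines.join("\n") });
      }
      boundary = buffer.indexOf("\n\n");
    }
    return events;
  };
}

const push = createSseParser();
const first = push('data: {"text":"Hel'); // incomplete: no blank line yet
const second = push('lo"}\n\nevent: tool\ndata: searching\n\n');
console.log(first.length, second.map((e) => e.event));
```

&lt;p&gt;The buffer matters because network chunks rarely line up with event boundaries: the first call above yields nothing, and the second completes the pending event before parsing the next one.&lt;/p&gt;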

&lt;p&gt;The lesson: if you're planning to ship an agentic chatbot on both web and mobile, budget at least 3x the time you'd expect for the mobile portion.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;After shipping this pattern multiple times:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Start with durable execution from day one.&lt;/strong&gt; Don't build a synchronous chatbot and bolt on background jobs later. Design for async from the start.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keep tools simple.&lt;/strong&gt; Each tool should do one thing. Don't build a mega-tool that searches the web AND processes documents. Let the model compose simple tools.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Test tool selection, not just tool execution.&lt;/strong&gt; Write tests that verify: given this user message, does the model select the right tool? This catches regressions that unit tests of the tools themselves won't surface, such as a prompt or tool-description change that makes the model pick the wrong tool.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stream partial progress for long tasks.&lt;/strong&gt; Even if the full research takes 5 minutes, send periodic updates ("Found 3 relevant sources, synthesizing..."). Users tolerate waiting when they see progress.&lt;/li&gt;
&lt;/ol&gt;
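
&lt;p&gt;For point 3, a tool-selection test can be as simple as a table of messages and expected tools. The sketch below is one possible shape, not a prescribed harness: the tool names, &lt;code&gt;findSelectionRegressions&lt;/code&gt;, and &lt;code&gt;fakeChooser&lt;/code&gt; are all hypothetical. In a real suite, the chooser would call the model and report which tool it picked, and it would be asynchronous.&lt;/p&gt;

```typescript
// Hypothetical tool names; substitute the tools your agent actually exposes.
type ToolName = "searchWeb" | "searchDocuments" | "none";
type SelectionCase = { message: string; expected: ToolName };
type ToolChooser = (message: string) => ToolName;

// Runs every case through the chooser and describes each mismatch.
function findSelectionRegressions(cases: SelectionCase[], choose: ToolChooser): string[] {
  const failures: string[] = [];
  for (const c of cases) {
    const actual = choose(c.message);
    if (actual !== c.expected) {
      failures.push(`"${c.message}": expected ${c.expected}, got ${actual}`);
    }
  }
  return failures;
}

// Keyword-based stand-in for the real model call so the sketch runs offline.
const fakeChooser: ToolChooser = (message) => {
  const text = message.toLowerCase();
  if (text.indexOf("latest") !== -1) return "searchWeb";
  if (text.indexOf("uploaded") !== -1) return "searchDocuments";
  return "none";
};

const cases: SelectionCase[] = [
  { message: "What is the latest Inngest release?", expected: "searchWeb" },
  { message: "Summarize the PDF I uploaded", expected: "searchDocuments" },
  { message: "Thanks, that helps!", expected: "none" },
];

const failures = findSelectionRegressions(cases, fakeChooser);
console.log(failures.length === 0 ? "all tool selections match" : failures.join("\n"));
```

&lt;p&gt;Keeping the cases in a plain table makes it cheap to add a new row whenever a user message routes to the wrong tool in production.&lt;/p&gt;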

&lt;h2&gt;
  
  
  The Stack
&lt;/h2&gt;

&lt;p&gt;For reference, here's what I use:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Chat protocol&lt;/td&gt;
&lt;td&gt;Vercel AI SDK&lt;/td&gt;
&lt;td&gt;Best streaming + tool calling DX&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LLM&lt;/td&gt;
&lt;td&gt;Claude&lt;/td&gt;
&lt;td&gt;Strong tool calling, long context&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web search&lt;/td&gt;
&lt;td&gt;Exa API&lt;/td&gt;
&lt;td&gt;Better relevance than Google Custom Search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Document extraction&lt;/td&gt;
&lt;td&gt;Unstructured.io&lt;/td&gt;
&lt;td&gt;Handles real-world PDFs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Embeddings storage&lt;/td&gt;
&lt;td&gt;pgvector (Neon)&lt;/td&gt;
&lt;td&gt;No separate vector DB to manage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Durable execution&lt;/td&gt;
&lt;td&gt;Inngest&lt;/td&gt;
&lt;td&gt;Step-level checkpointing, no Redis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API&lt;/td&gt;
&lt;td&gt;Elysia&lt;/td&gt;
&lt;td&gt;Type-safe, fast, composable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Web&lt;/td&gt;
&lt;td&gt;TanStack Start&lt;/td&gt;
&lt;td&gt;SSR + modern React&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every piece is swappable. Don't like Neon? Use Supabase. Prefer LangChain over Vercel AI SDK? The architecture stays the same. The &lt;em&gt;pattern&lt;/em&gt; is what matters, not the specific vendor.&lt;/p&gt;




&lt;p&gt;If you're starting from scratch, my advice: get the durable execution layer right first. Everything else (streaming, tool calling, RAG) is well-documented. But the part where your AI tasks survive failures and always deliver results? That's what makes users trust your product.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Magnus Rødseth builds AI-native applications and is the creator of &lt;a href="https://eden-stack.com" rel="noopener noreferrer"&gt;Eden Stack&lt;/a&gt;, a production-ready starter kit for AI-native SaaS.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>ai</category>
      <category>tutorial</category>
      <category>buildinpublic</category>
    </item>
  </channel>
</rss>
