<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[The MLnotes Newsletter]]></title><description><![CDATA[MLnotes shares bite-sized insights on AI, ML, GenAI, agents, and RAG—from real-world applications to careers and startups—helping cut through the noise of rapid AI progress.]]></description><link>https://mlnotes.substack.com</link><image><url>https://substackcdn.com/image/fetch/$s_!Kq64!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F88ea2641-8482-4fd4-95b6-f0d56d807c5c_1280x1280.png</url><title>The MLnotes Newsletter</title><link>https://mlnotes.substack.com</link></image><generator>Substack</generator><lastBuildDate>Sun, 28 Jun 2026 18:51:21 GMT</lastBuildDate><atom:link href="https://mlnotes.substack.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[MLnotes]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[mlnotes@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[mlnotes@substack.com]]></itunes:email><itunes:name><![CDATA[Mehdi Allahyari]]></itunes:name></itunes:owner><itunes:author><![CDATA[Mehdi Allahyari]]></itunes:author><googleplay:owner><![CDATA[mlnotes@substack.com]]></googleplay:owner><googleplay:email><![CDATA[mlnotes@substack.com]]></googleplay:email><googleplay:author><![CDATA[Mehdi Allahyari]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Agentic RAG: Make Retrieval a Decision, Not a Step]]></title><description><![CDATA[Here&#8217;s the uncomfortable truth almost every RAG tutorial skips: the standard RAG pipeline doesn&#8217;t actually work beyond the 
demo.]]></description><link>https://mlnotes.substack.com/p/agentic-rag-make-retrieval-a-decision</link><guid isPermaLink="false">https://mlnotes.substack.com/p/agentic-rag-make-retrieval-a-decision</guid><dc:creator><![CDATA[Mehdi Allahyari]]></dc:creator><pubDate>Sat, 27 Jun 2026 14:00:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!sZOr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6cdd16b-67e3-4044-a548-bc0ecff161ed_1672x941.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sZOr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6cdd16b-67e3-4044-a548-bc0ecff161ed_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sZOr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6cdd16b-67e3-4044-a548-bc0ecff161ed_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!sZOr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6cdd16b-67e3-4044-a548-bc0ecff161ed_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!sZOr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6cdd16b-67e3-4044-a548-bc0ecff161ed_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!sZOr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6cdd16b-67e3-4044-a548-bc0ecff161ed_1672x941.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!sZOr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6cdd16b-67e3-4044-a548-bc0ecff161ed_1672x941.png" width="1456" height="819" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6cdd16b-67e3-4044-a548-bc0ecff161ed_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2100690,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/203402634?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6cdd16b-67e3-4044-a548-bc0ecff161ed_1672x941.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sZOr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6cdd16b-67e3-4044-a548-bc0ecff161ed_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!sZOr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6cdd16b-67e3-4044-a548-bc0ecff161ed_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!sZOr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6cdd16b-67e3-4044-a548-bc0ecff161ed_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!sZOr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6cdd16b-67e3-4044-a548-bc0ecff161ed_1672x941.png 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Here&#8217;s the uncomfortable truth almost every RAG tutorial skips: <strong>the standard RAG pipeline doesn&#8217;t actually work</strong> beyond the demo.</p><p>You know the recipe: embed your documents, retrieve the top-k chunks for a query, stuff them into a prompt, and hope for the best. It looks magical on a curated example. Then you put it in front of real users and it falls apart. Someone asks a two-part question, and the single retrieval grabs context for half of it while the model makes up the rest. 
Someone uses an exact term your embedding model doesn&#8217;t understand and gets nothing back. Someone asks something the docs don&#8217;t cover, and the model confidently invents an answer with a citation that doesn&#8217;t exist.</p><p>These aren&#8217;t bugs you can tune away. They&#8217;re <strong>structural</strong>. Naive RAG retrieves <em>once</em>, <em>blindly</em>, <em>before the model has thought about anything</em>, and then it&#8217;s stuck with whatever it got. The system never gets to decide it should search again, search differently, read more, or admit it doesn&#8217;t know.</p><p>The fix isn&#8217;t a better embedding model or a bigger context window. It&#8217;s a different architecture: <strong>agentic RAG</strong>, where retrieval becomes a decision the model makes rather than a fixed step. The model decides when to search, rewrites its own queries, runs as many retrieval rounds as the question needs (up to a cap), pulls more context only where it&#8217;s thin, and (the part I care about most) <strong>refuses to cite a source it didn&#8217;t actually retrieve.</strong></p><p>This post builds one end to end so you can see exactly why the agentic version wins. The project runs on <a href="https://github.com/BerriAI/litellm">LiteLLM</a> (multi-provider, though I&#8217;m looking for a replacement ;) ), LanceDB (vectors), full-text search (BM25), and a local embedding + reranking stack. 
Let&#8217;s build it up piece by piece.</p><div><hr></div><h2>The shape of the thing</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7LIO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa945c438-961c-4b1c-9d1a-7ee678db5a93_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7LIO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa945c438-961c-4b1c-9d1a-7ee678db5a93_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!7LIO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa945c438-961c-4b1c-9d1a-7ee678db5a93_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!7LIO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa945c438-961c-4b1c-9d1a-7ee678db5a93_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!7LIO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa945c438-961c-4b1c-9d1a-7ee678db5a93_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7LIO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa945c438-961c-4b1c-9d1a-7ee678db5a93_1672x941.png" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a945c438-961c-4b1c-9d1a-7ee678db5a93_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:954657,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/203402634?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa945c438-961c-4b1c-9d1a-7ee678db5a93_1672x941.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7LIO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa945c438-961c-4b1c-9d1a-7ee678db5a93_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!7LIO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa945c438-961c-4b1c-9d1a-7ee678db5a93_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!7LIO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa945c438-961c-4b1c-9d1a-7ee678db5a93_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!7LIO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa945c438-961c-4b1c-9d1a-7ee678db5a93_1672x941.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>Before any code, here&#8217;s the mental model. 
There are two pipelines: an <strong>offline </strong>one that gets documents into searchable shape, and an <strong>online</strong> one that answers questions.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!clr3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ac2b3fd-98cc-4b70-9c43-ec59c641f55c_988x2091.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!clr3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ac2b3fd-98cc-4b70-9c43-ec59c641f55c_988x2091.png 424w, https://substackcdn.com/image/fetch/$s_!clr3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ac2b3fd-98cc-4b70-9c43-ec59c641f55c_988x2091.png 848w, https://substackcdn.com/image/fetch/$s_!clr3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ac2b3fd-98cc-4b70-9c43-ec59c641f55c_988x2091.png 1272w, https://substackcdn.com/image/fetch/$s_!clr3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ac2b3fd-98cc-4b70-9c43-ec59c641f55c_988x2091.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!clr3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ac2b3fd-98cc-4b70-9c43-ec59c641f55c_988x2091.png" width="988" height="2091" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8ac2b3fd-98cc-4b70-9c43-ec59c641f55c_988x2091.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2091,&quot;width&quot;:988,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:136186,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/203402634?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ac2b3fd-98cc-4b70-9c43-ec59c641f55c_988x2091.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!clr3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ac2b3fd-98cc-4b70-9c43-ec59c641f55c_988x2091.png 424w, https://substackcdn.com/image/fetch/$s_!clr3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ac2b3fd-98cc-4b70-9c43-ec59c641f55c_988x2091.png 848w, https://substackcdn.com/image/fetch/$s_!clr3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ac2b3fd-98cc-4b70-9c43-ec59c641f55c_988x2091.png 1272w, https://substackcdn.com/image/fetch/$s_!clr3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ac2b3fd-98cc-4b70-9c43-ec59c641f55c_988x2091.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The dashed lines show where the two halves meet: the indexes built offline are what the agent&#8217;s <code>execute tool</code> step searches at question time. And the loop label matters: the model can cycle <code>LLM &#8594; tool &#8594; result &#8594; LLM</code> up to 10 times before it commits to an answer.</p><p>The key difference lives in that loop on the right. 
<strong>Naive RAG is a straight line: retrieve &#8594; generate, once, and you live with it.</strong> Agentic RAG is a cycle the model drives, and that single architectural change is what turns retrieval from a blind guess into a deliberate search.</p><div><hr></div><h2>Part 1, Ingestion: getting documents in</h2><p>Nothing exotic here, but two decisions matter.</p><p><strong>Chunk with overlap, and give every chunk a positional ID.</strong> I split text into 512-character chunks with 64 characters of overlap, and ID them sequentially:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;0df46ef2-4d54-4091-a96e-13e747e061f1&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">def _chunk_id(doc_id: str, index: int) -&gt; str:
    return f"{doc_id}__c{index:04d}"   # paper__c0000, paper__c0001, ...</code></pre></div><p>That <code>c0000</code>, <code>c0001</code> ordering isn&#8217;t cosmetic: it&#8217;s what lets the agent later say &#8220;give me the chunks <em>around</em> this one&#8221; and read more of a document. The ID encodes position, so the neighbors of <code>paper__c0007</code> are simply <code>paper__c0006</code> and <code>paper__c0008</code>.</p><p><strong>Embed locally, and build two indexes.</strong> Each chunk is embedded with <code>BAAI/bge-small-en-v1.5</code> (a small, fast, 384-dim model that runs on your machine, so there&#8217;s no embedding API bill), stored in LanceDB, and <em>also</em> indexed by BM25:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;ef374c6b-a31c-46b4-8983-d50a22dac854&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">vectors = embed(texts, "BAAI/bge-small-en-v1.5")   # normalized, local
insert_chunks(db, chunks)                            # &#8594; LanceDB (semantic)
rebuild_index(collection, all_ids, all_texts)        # &#8594; BM25  (keyword)</code></pre></div><p>Why two indexes? Because semantic search and keyword search fail in opposite ways, and we&#8217;re about to exploit that. <strong>Note:</strong> LanceDB also has its own full-text search, but I decided to implement BM25 separately.</p><div><hr></div><h2>Part 2, Retrieval that doesn&#8217;t suck: hybrid search</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GUuH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e34eeb-bb4f-4b93-a3c5-1d893f064a87_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GUuH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e34eeb-bb4f-4b93-a3c5-1d893f064a87_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!GUuH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e34eeb-bb4f-4b93-a3c5-1d893f064a87_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!GUuH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e34eeb-bb4f-4b93-a3c5-1d893f064a87_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!GUuH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e34eeb-bb4f-4b93-a3c5-1d893f064a87_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GUuH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e34eeb-bb4f-4b93-a3c5-1d893f064a87_1536x1024.png" 
width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/16e34eeb-bb4f-4b93-a3c5-1d893f064a87_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2424564,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/203402634?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e34eeb-bb4f-4b93-a3c5-1d893f064a87_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GUuH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e34eeb-bb4f-4b93-a3c5-1d893f064a87_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!GUuH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e34eeb-bb4f-4b93-a3c5-1d893f064a87_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!GUuH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e34eeb-bb4f-4b93-a3c5-1d893f064a87_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!GUuH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16e34eeb-bb4f-4b93-a3c5-1d893f064a87_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Pure vector search is great at <em>meaning</em> (&#8220;car&#8221; matches &#8220;automobile&#8221;) and terrible at <em>specifics</em>: exact names, error codes, and acronyms like &#8220;GNN-RAG&#8221; that don&#8217;t sit near anything in embedding space. Keyword search (BM25) is the mirror image: it nails the exact term and misses the paraphrase.</p><p>The fix is to run both and merge the rankings. The merge trick is <strong>Reciprocal Rank Fusion (RRF)</strong>, and it&#8217;s beautifully simple. You don&#8217;t try to reconcile the two scoring systems (cosine similarities and BM25 scores aren&#8217;t on comparable scales). 
You throw the scores away and only use <em>rank position</em>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;63764cc2-33bf-4d77-90ec-7ca442e478a6&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">def _rrf(semantic, keyword, k: int = 60):
    """Reciprocal Rank Fusion over two ranked lists, keyed by chunk_id."""
    scores = {}
    for rank, r in enumerate(semantic):
        scores[r.chunk_id] = scores.get(r.chunk_id, 0.0) + 1.0 / (k + rank + 1)
    for rank, r in enumerate(keyword):
        scores[r.chunk_id] = scores.get(r.chunk_id, 0.0) + 1.0 / (k + rank + 1)
    return scores</code></pre></div><p>A chunk that ranks #1 in <em>both</em> lists rises to the top. A chunk that&#8217;s #1 in one and absent from the other still scores respectably: with <code>k=60</code>, that&#8217;s 1/61 &#8776; 0.016 against 2/61 &#8776; 0.033 for a chunk that tops both. The <code>k=60</code> constant flattens the curve, so the gap between neighboring ranks stays small and no single list&#8217;s top hit dominates the fused score. It&#8217;s the value used in the original RRF paper, and you rarely need to touch it.</p><p>Then one more pass: a <strong>cross-encoder reranker</strong> (<code>ms-marco-MiniLM</code>). The bi-encoder we used for retrieval embeds the query and each document <em>separately</em>. That&#8217;s fast, but it can&#8217;t model how the two relate. A cross-encoder reads the query and a candidate chunk <em>together</em> and scores the pair directly. It&#8217;s too slow to run over the whole corpus, but perfect for re-ranking the ~10 candidates RRF handed us down to the best 5.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sUTE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc0704f-eed3-4763-84d3-7df35052e0ba_1861x444.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sUTE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc0704f-eed3-4763-84d3-7df35052e0ba_1861x444.png 424w, https://substackcdn.com/image/fetch/$s_!sUTE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc0704f-eed3-4763-84d3-7df35052e0ba_1861x444.png 848w, https://substackcdn.com/image/fetch/$s_!sUTE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc0704f-eed3-4763-84d3-7df35052e0ba_1861x444.png 1272w, 
https://substackcdn.com/image/fetch/$s_!sUTE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc0704f-eed3-4763-84d3-7df35052e0ba_1861x444.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sUTE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc0704f-eed3-4763-84d3-7df35052e0ba_1861x444.png" width="1456" height="347" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/afc0704f-eed3-4763-84d3-7df35052e0ba_1861x444.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:347,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:54992,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/203402634?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc0704f-eed3-4763-84d3-7df35052e0ba_1861x444.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sUTE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc0704f-eed3-4763-84d3-7df35052e0ba_1861x444.png 424w, https://substackcdn.com/image/fetch/$s_!sUTE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc0704f-eed3-4763-84d3-7df35052e0ba_1861x444.png 848w, 
https://substackcdn.com/image/fetch/$s_!sUTE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc0704f-eed3-4763-84d3-7df35052e0ba_1861x444.png 1272w, https://substackcdn.com/image/fetch/$s_!sUTE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafc0704f-eed3-4763-84d3-7df35052e0ba_1861x444.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>This whole stack (embed, search, fuse, rerank) is one tool call from the agent&#8217;s point of view. Which brings us to the interesting part.</p><div><hr></div><h2>Part 3, The agent loop</h2><p>Here&#8217;s the heart of the system. Instead of retrieving once, the agent runs a <strong>ReAct loop</strong>: reason, act, observe, repeat. Stripped down, it&#8217;s this:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;da1bb428-5260-4763-a017-49cd2bcf4c43&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">for iteration in range(MAX_ITERATIONS):          # cap at 10
    stream = await stream_completion(messages, tools, ...)

    # ... collect the model's response: either text or tool calls ...

    if finish_reason == "stop":                  # model wrote an answer
        correction = validate_citation(answer, retrieved_ids)
        if correction:                           # caught a bad citation
            messages.append(correction)          # tell it to fix, loop again
            continue
        return answer                            # clean &#8212; we're done

    if tool_calls:                               # model wants to use a tool
        for call in tool_calls:
            result = dispatch_tool(call.name, call.args)
            messages.append(result)     </code></pre></div><p>That&#8217;s the entire control flow. Every iteration, the model looks at the conversation so far, including the results of any searches it already ran, and decides the next move. Search again with a better query? Pull more context around a promising hit? Or does it finally have enough to answer?</p><p>This is what &#8220;agentic&#8221; actually means in practice. Not a personality, not magic, just <strong>a loop where the model chooses the next action and sees the consequences before choosing again.</strong> Retrieval stops being a preprocessing step and becomes a <em>decision the model makes, repeatedly, with feedback.</em> That&#8217;s agentic retrieval, and it&#8217;s the whole ballgame.</p><p>A real trace for <em>&#8220;What is GNN-LLM and what is query rewriting?&#8221;</em> looks like:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ypPs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06fd257-52d1-4640-9fc2-5b3bcef193ce_2652x1438.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ypPs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06fd257-52d1-4640-9fc2-5b3bcef193ce_2652x1438.png 424w, https://substackcdn.com/image/fetch/$s_!ypPs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06fd257-52d1-4640-9fc2-5b3bcef193ce_2652x1438.png 848w, https://substackcdn.com/image/fetch/$s_!ypPs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06fd257-52d1-4640-9fc2-5b3bcef193ce_2652x1438.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ypPs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06fd257-52d1-4640-9fc2-5b3bcef193ce_2652x1438.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ypPs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06fd257-52d1-4640-9fc2-5b3bcef193ce_2652x1438.png" width="1456" height="789" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b06fd257-52d1-4640-9fc2-5b3bcef193ce_2652x1438.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:789,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:187079,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/203402634?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06fd257-52d1-4640-9fc2-5b3bcef193ce_2652x1438.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ypPs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06fd257-52d1-4640-9fc2-5b3bcef193ce_2652x1438.png 424w, https://substackcdn.com/image/fetch/$s_!ypPs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06fd257-52d1-4640-9fc2-5b3bcef193ce_2652x1438.png 848w, 
https://substackcdn.com/image/fetch/$s_!ypPs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06fd257-52d1-4640-9fc2-5b3bcef193ce_2652x1438.png 1272w, https://substackcdn.com/image/fetch/$s_!ypPs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb06fd257-52d1-4640-9fc2-5b3bcef193ce_2652x1438.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Two searches, not one, because it&#8217;s a two-part question. 
Then a third call to read <em>around</em> the best hit, because one chunk wasn&#8217;t enough. <strong>A naive pipeline physically cannot do any of this.</strong> It gets one shot at one query and is done. This is the exact failure I opened with, and here&#8217;s the architecture that removes it.</p><div><hr></div><h2>Part 4, The tools: the agent&#8217;s hands</h2><p>The agent can only do what its tools let it. I gave it four, deliberately small:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8FpK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9436a9ad-a1ad-4c5f-96f7-fa36f07988db_764x220.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8FpK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9436a9ad-a1ad-4c5f-96f7-fa36f07988db_764x220.png 424w, https://substackcdn.com/image/fetch/$s_!8FpK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9436a9ad-a1ad-4c5f-96f7-fa36f07988db_764x220.png 848w, https://substackcdn.com/image/fetch/$s_!8FpK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9436a9ad-a1ad-4c5f-96f7-fa36f07988db_764x220.png 1272w, https://substackcdn.com/image/fetch/$s_!8FpK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9436a9ad-a1ad-4c5f-96f7-fa36f07988db_764x220.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!8FpK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9436a9ad-a1ad-4c5f-96f7-fa36f07988db_764x220.png" width="764" height="220" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9436a9ad-a1ad-4c5f-96f7-fa36f07988db_764x220.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:220,&quot;width&quot;:764,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:33332,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/203402634?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9436a9ad-a1ad-4c5f-96f7-fa36f07988db_764x220.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8FpK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9436a9ad-a1ad-4c5f-96f7-fa36f07988db_764x220.png 424w, https://substackcdn.com/image/fetch/$s_!8FpK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9436a9ad-a1ad-4c5f-96f7-fa36f07988db_764x220.png 848w, https://substackcdn.com/image/fetch/$s_!8FpK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9436a9ad-a1ad-4c5f-96f7-fa36f07988db_764x220.png 1272w, https://substackcdn.com/image/fetch/$s_!8FpK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9436a9ad-a1ad-4c5f-96f7-fa36f07988db_764x220.png 1456w" sizes="100vw" 
loading="lazy"></picture><div></div></div></a></figure></div><p><code>get_context</code> is the quiet hero. When a search hit is relevant but too short to fully answer from, the agent fetches the surrounding chunks (using those positional IDs from Part 1) instead of dumping the entire 40-page document into the prompt. It reads deeply, but only where it&#8217;s worth it, which keeps the context window (and your token bill) under control.</p><p>There&#8217;s a hard budget enforced around every retrieval, too: once retrieved content crosses ~8,000 tokens, further searches are blocked and the model is told to answer with what it has. Agents left unsupervised will happily search forever.</p><div><hr></div><h2>Part 5, The part that builds trust: citation validation</h2><p>This is the feature I&#8217;d fight to keep. An agent that retrieves well is useful. An agent that <em>can&#8217;t lie about its sources</em> is something you can put in front of users.</p><p>After the model writes an answer, a post-hook checks every <code>[chunk_id]</code> citation against the set of chunks actually retrieved during this session:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;fdcb0f34-ca92-4973-a855-126087f31f0c&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">def validate_citation(answer, retrieved_ids):
    cited = parse_cited_ids(answer)               # IDs in [brackets]
    unknown = cited - retrieved_ids               # cited but never retrieved
    if unknown:
        return (f"Citation error: {sorted(unknown)} were not in your retrieved results. "
                f"Valid chunk IDs you can cite: {sorted(retrieved_ids)[:6]}.")
    return None</code></pre></div><p>If the model invents a citation, or cites something it merely &#8220;remembered&#8221; from training rather than retrieved, the answer is <strong>rejected</strong>. The correction goes back into the loop and the model has to revise; after a few failed attempts, it gives up gracefully. The result: every bracketed citation in a final answer is guaranteed to point at a chunk that was actually retrieved during the session. No silent hallucinated references.</p><p>(Getting this right took a few tries. My first version&#8217;s regex also matched footnote markers like <code>[1]</code> and <code>[Smith 2023]</code> <em>inside</em> the document text, poisoning the set of &#8220;valid&#8221; IDs and sending the agent into an apology spiral. Validation logic that runs on model output is worth testing carefully; the failure modes are subtle.)</p><div><hr></div><h2>Multi-provider, almost for free</h2><p>One design choice paid off repeatedly: every LLM call goes through LiteLLM, so the provider is just a config string.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;c72ea521-0aab-4241-910e-322b2fcde88b&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">PROVIDERS = {
    "gemini-fast":     ProviderConfig(model="gemini/gemini-2.5-flash"),   # default
    "anthropic-smart": ProviderConfig(model="claude-sonnet-4-6", enable_thinking=True),
    "openai-fast":     ProviderConfig(model="gpt-4o-mini"),
    # ...
}</code></pre></div><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;1e0e0137-dcb3-4186-961e-c90e285d22da&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">python main.py --provider anthropic-smart "What are the key risks?"</code></pre></div><p>Same agent loop, same tools, same validation; only the brain underneath changes. This is also how you A/B a cheap-and-fast model against a smart-and-slow one on your actual workload, which is the only benchmark that matters. (Each provider has its quirks: Gemini, for instance, tacks hidden &#8220;thought signature&#8221; tokens onto tool-call IDs mid-stream, which you have to strip before the conversation history will round-trip. The abstraction smooths the API, not the behavior.)</p><div><hr></div><h2>Seeing it think</h2><p>A loop you can&#8217;t observe is a loop you can&#8217;t debug or trust. The agent streams every step (tool calls, arguments, results, and the final answer) to whatever UI is attached: a Rich-powered terminal view, or a Chainlit browser UI that renders each tool call as an expandable step above the streamed answer.</p><p>It matters more than it sounds. Watching the agent decide to run a <em>second</em> search because the first didn&#8217;t cover the whole question is the moment &#8220;agentic RAG&#8221; stops being a buzzword and starts being an obviously better design.</p><div><hr></div><h2>What I&#8217;d tell you before you build your own</h2><p>A few takeaways from building this:</p>
<ul><li><p><strong>The loop is the whole idea.</strong> Everything else (hybrid search, reranking, budgets) is quality tuning. The thing that makes retrieval <em>agentic</em> is ten lines: decide, act, observe, repeat. That&#8217;s the line between &#8220;doesn&#8217;t work in production&#8221; and &#8220;does.&#8221;</p></li><li><p><strong>Hybrid search is non-negotiable for real documents.</strong> Vector-only retrieval will embarrass you the first time someone searches for an exact term or an acronym.</p></li><li><p><strong>Validate the model&#8217;s output, not just its input.</strong> Citation checking turned a &#8220;usually grounded&#8221; system into a &#8220;provably grounded&#8221; one, and it&#8217;s maybe 30 lines.</p></li><li><p><strong>Make it observable from day one.</strong> Most of my hardest bugs were invisible until I could watch the steps stream by.</p></li><li><p><strong>Keep the tools small and sharp.</strong> Four focused tools beat one giant do-everything tool the model has to reason about.</p></li></ul>
<h2>Stop shipping naive RAG</h2><p>If there&#8217;s one thing to take from this: <strong>the one-shot retrieve-then-generate pipeline is a dead end for anything real.</strong> It demos beautifully and fails quietly: wrong on multi-part questions, blind to exact terms, and happy to hallucinate a citation when the docs come up empty. You can&#8217;t tune your way out of a structural flaw.</p><p>Agentic RAG fixes it at the root by making <strong>retrieval a decision the model makes, not a step that happens to it.</strong> The model searches when it needs to, searches <em>again</em> when the first attempt falls short, reads deeper only where it pays off, and can&#8217;t claim a source it didn&#8217;t pull. None of it is exotic: it&#8217;s a loop, two indexes, and a validator. But that combination is the difference between a demo and a system you&#8217;d actually put in front of users.</p><p>If you&#8217;re still building one-shot RAG, you&#8217;re building the version that breaks. Build the agent instead.</p><div><hr></div><p><em>Built with LiteLLM, LanceDB, BM25 + cross-encoder reranking, and a strict citation validator. 
The full project is on GitHub at <a href="https://github.com/mallahyari/rag-agent-harness">mallahyari/rag-agent-harness</a>. If you want the annotated, step-through version of the example trace above, there&#8217;s an <a href="https://github.com/mallahyari/rag-agent-harness/blob/main/demo.html">interactive walkthrough</a> in the project repo.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RtH0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe583dcd9-8d71-49f2-82cf-6510b9a9a5be_800x450.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RtH0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe583dcd9-8d71-49f2-82cf-6510b9a9a5be_800x450.gif 424w, https://substackcdn.com/image/fetch/$s_!RtH0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe583dcd9-8d71-49f2-82cf-6510b9a9a5be_800x450.gif 848w, https://substackcdn.com/image/fetch/$s_!RtH0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe583dcd9-8d71-49f2-82cf-6510b9a9a5be_800x450.gif 1272w, https://substackcdn.com/image/fetch/$s_!RtH0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe583dcd9-8d71-49f2-82cf-6510b9a9a5be_800x450.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RtH0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe583dcd9-8d71-49f2-82cf-6510b9a9a5be_800x450.gif" width="800" height="450" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e583dcd9-8d71-49f2-82cf-6510b9a9a5be_800x450.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:450,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3887711,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/203402634?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe583dcd9-8d71-49f2-82cf-6510b9a9a5be_800x450.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RtH0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe583dcd9-8d71-49f2-82cf-6510b9a9a5be_800x450.gif 424w, https://substackcdn.com/image/fetch/$s_!RtH0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe583dcd9-8d71-49f2-82cf-6510b9a9a5be_800x450.gif 848w, https://substackcdn.com/image/fetch/$s_!RtH0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe583dcd9-8d71-49f2-82cf-6510b9a9a5be_800x450.gif 1272w, https://substackcdn.com/image/fetch/$s_!RtH0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe583dcd9-8d71-49f2-82cf-6510b9a9a5be_800x450.gif 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p>]]></content:encoded></item>
<item><title><![CDATA[Secure Playgrounds: Sandboxing & Execution Security in Harness Engineering]]></title><description><![CDATA[In Part 2, we built a working agent harness with three real tools, read_file, write_file, and run_bash. The feedback loop worked. Errors came back as structured signals. 
The model self-corrected.]]></description><link>https://mlnotes.substack.com/p/secure-playgrounds-sandboxing-and</link><guid isPermaLink="false">https://mlnotes.substack.com/p/secure-playgrounds-sandboxing-and</guid><dc:creator><![CDATA[Mehdi Allahyari]]></dc:creator><pubDate>Mon, 15 Jun 2026 13:03:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!9hiD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05173156-a99c-4f85-ba28-436711253ea1_1376x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9hiD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05173156-a99c-4f85-ba28-436711253ea1_1376x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!9hiD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05173156-a99c-4f85-ba28-436711253ea1_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9hiD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05173156-a99c-4f85-ba28-436711253ea1_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9hiD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05173156-a99c-4f85-ba28-436711253ea1_1376x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9hiD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05173156-a99c-4f85-ba28-436711253ea1_1376x768.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!9hiD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05173156-a99c-4f85-ba28-436711253ea1_1376x768.jpeg" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05173156-a99c-4f85-ba28-436711253ea1_1376x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:623531,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/202065850?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05173156-a99c-4f85-ba28-436711253ea1_1376x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9hiD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05173156-a99c-4f85-ba28-436711253ea1_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!9hiD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05173156-a99c-4f85-ba28-436711253ea1_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!9hiD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05173156-a99c-4f85-ba28-436711253ea1_1376x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!9hiD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05173156-a99c-4f85-ba28-436711253ea1_1376x768.jpeg 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>In <a href="https://mlnotes.substack.com/p/inside-the-machine-the-anatomy-of">Part 2</a>, we built a working agent harness with three real tools, <code>read_file</code>, <code>write_file</code>, and <code>run_bash</code>. The feedback loop worked. Errors came back as structured signals. The model self-corrected.</p><p>But I left something unaddressed on purpose, and it is time to confront it directly.</p><p>Our <code>run_bash</code> function was a <code>subprocess.run()</code> call on your local machine. No container. No isolation. 
No boundary between the model&#8217;s generated code and your host filesystem, environment variables, and network. If the agent wrote <code>import os; os.environ</code> into a script and executed it, it would see your API keys. If it ran <code>curl</code> to an attacker-controlled server, nothing would stop it.</p><p>This is not a hypothetical. In October 2024, security researcher Johann Rehberger demonstrated what he called the <strong>ZombAIs</strong> attack against Claude&#8217;s Computer Use feature. Using prompt injection embedded in a webpage the agent was browsing, he caused the agent to download and execute the Sliver C2 framework, effectively turning the AI into a remotely controlled zombie on the host machine, with no malicious intent from the user who launched it.</p><p>Giving an AI agent a shell is handing it a loaded weapon. Today we are going to look at how a harness builds a secure playground around that weapon.</p>
<div><hr></div><h3>The Golden Rule: The Harness Belongs Outside the Sandbox</h3><p>Before containers or virtualization, one architectural principle must be non-negotiable: <strong>the agent harness must never run in the same execution context as the agent&#8217;s actions.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iVob!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a948a57-28c4-49aa-a211-8410248db6d1_2053x834.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iVob!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a948a57-28c4-49aa-a211-8410248db6d1_2053x834.png 424w, https://substackcdn.com/image/fetch/$s_!iVob!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a948a57-28c4-49aa-a211-8410248db6d1_2053x834.png 848w, https://substackcdn.com/image/fetch/$s_!iVob!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a948a57-28c4-49aa-a211-8410248db6d1_2053x834.png 1272w, 
https://substackcdn.com/image/fetch/$s_!iVob!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a948a57-28c4-49aa-a211-8410248db6d1_2053x834.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iVob!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a948a57-28c4-49aa-a211-8410248db6d1_2053x834.png" width="1456" height="591" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1a948a57-28c4-49aa-a211-8410248db6d1_2053x834.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:591,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:143558,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/202065850?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a948a57-28c4-49aa-a211-8410248db6d1_2053x834.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iVob!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a948a57-28c4-49aa-a211-8410248db6d1_2053x834.png 424w, https://substackcdn.com/image/fetch/$s_!iVob!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a948a57-28c4-49aa-a211-8410248db6d1_2053x834.png 848w, 
https://substackcdn.com/image/fetch/$s_!iVob!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a948a57-28c4-49aa-a211-8410248db6d1_2053x834.png 1272w, https://substackcdn.com/image/fetch/$s_!iVob!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a948a57-28c4-49aa-a211-8410248db6d1_2053x834.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>If your harness runs <em>inside</em> the sandbox, a compromised model execution can access your environment variables, modify your 
history logic, or exfiltrate your API keys. The harness is the warden; the sandbox is the cell. The warden must command from outside &#8212; passing instructions in, receiving exit codes and stdout back out.</p><div><hr></div><h3>The Sandboxing Spectrum: Four Isolation Models</h3><p>Every sandboxing decision involves the same trade-off: <strong>isolation security vs. startup latency.</strong> Here is where the four primary models land in practice today:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FdR4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4264226d-a491-4d42-acc5-0142d1bbd20c_1960x602.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FdR4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4264226d-a491-4d42-acc5-0142d1bbd20c_1960x602.png 424w, https://substackcdn.com/image/fetch/$s_!FdR4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4264226d-a491-4d42-acc5-0142d1bbd20c_1960x602.png 848w, https://substackcdn.com/image/fetch/$s_!FdR4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4264226d-a491-4d42-acc5-0142d1bbd20c_1960x602.png 1272w, https://substackcdn.com/image/fetch/$s_!FdR4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4264226d-a491-4d42-acc5-0142d1bbd20c_1960x602.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!FdR4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4264226d-a491-4d42-acc5-0142d1bbd20c_1960x602.png" width="1456" height="447" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4264226d-a491-4d42-acc5-0142d1bbd20c_1960x602.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:447,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:171171,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/202065850?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4264226d-a491-4d42-acc5-0142d1bbd20c_1960x602.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FdR4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4264226d-a491-4d42-acc5-0142d1bbd20c_1960x602.png 424w, https://substackcdn.com/image/fetch/$s_!FdR4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4264226d-a491-4d42-acc5-0142d1bbd20c_1960x602.png 848w, https://substackcdn.com/image/fetch/$s_!FdR4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4264226d-a491-4d42-acc5-0142d1bbd20c_1960x602.png 1272w, https://substackcdn.com/image/fetch/$s_!FdR4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4264226d-a491-4d42-acc5-0142d1bbd20c_1960x602.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>A few clarifications worth calling out explicitly:</p><p><strong>WebAssembly &amp; Pyodide</strong>: The common claim that Wasm gives you sub-10ms startup is wrong for Python. Pyodide (Python compiled to Wasm) requires downloading ~6 MB of runtime and takes 1&#8211;3 seconds to initialize. Once running, it is also 3&#8211;5x slower than native CPython. The real advantage is mathematical isolation. Pyodide code cannot escape its Wasm memory boundary by design. 
It is a good fit for lightweight, browser-based execution but not for general-purpose agent tool calls.</p><p><strong>Docker + gVisor</strong>: gVisor intercepts system calls in user space rather than passing them to the host kernel directly. This sharply reduces exposure to the biggest Docker security risk (a shared host kernel) while keeping Docker&#8217;s ergonomics. Google runs gVisor in production for Cloud Run. The tradeoff is runtime overhead (roughly 10&#8211;20% for typical workloads, more for syscall-heavy ones) and some syscall compatibility gaps.</p><p><strong>Firecracker MicroVMs</strong>: Used by AWS Lambda and E2B (a cloud sandbox platform for AI agents processing ~15 million sandboxes/month as of 2025). Each agent gets its own kernel, not just a container namespace. Cold boot of a bare microVM is ~90&#8211;200 ms; with VM snapshotting, a fully pre-warmed sandbox (runtime already loaded) is ready in ~150 ms. This is a common production choice for hosted coding agents.</p><div><hr></div><h3>Filesystem Isolation: Setting Clear Boundaries</h3><p>An agent refactoring a codebase needs file access. The question is <em>which</em> files.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uCMy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe488108b-ed35-4537-9802-f30fecfba5ba_2322x742.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uCMy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe488108b-ed35-4537-9802-f30fecfba5ba_2322x742.png 424w, https://substackcdn.com/image/fetch/$s_!uCMy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe488108b-ed35-4537-9802-f30fecfba5ba_2322x742.png 848w, 
https://substackcdn.com/image/fetch/$s_!uCMy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe488108b-ed35-4537-9802-f30fecfba5ba_2322x742.png 1272w, https://substackcdn.com/image/fetch/$s_!uCMy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe488108b-ed35-4537-9802-f30fecfba5ba_2322x742.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uCMy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe488108b-ed35-4537-9802-f30fecfba5ba_2322x742.png" width="1456" height="465" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e488108b-ed35-4537-9802-f30fecfba5ba_2322x742.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:465,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:117360,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/202065850?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe488108b-ed35-4537-9802-f30fecfba5ba_2322x742.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uCMy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe488108b-ed35-4537-9802-f30fecfba5ba_2322x742.png 424w, 
https://substackcdn.com/image/fetch/$s_!uCMy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe488108b-ed35-4537-9802-f30fecfba5ba_2322x742.png 848w, https://substackcdn.com/image/fetch/$s_!uCMy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe488108b-ed35-4537-9802-f30fecfba5ba_2322x742.png 1272w, https://substackcdn.com/image/fetch/$s_!uCMy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe488108b-ed35-4537-9802-f30fecfba5ba_2322x742.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Three rules make this work in practice:</p><p><strong>1. Mount explicitly, never from root.</strong> Never bind-mount <code>/</code> or <code>~</code>. Only mount the specific directory the agent is assigned to work in.</p><p><strong>2. Use ephemeral copies for high-risk tasks.</strong> Copy the target repository into a temporary path (<code>/tmp/agent-run-xyz</code>) and mount that instead. When the agent finishes, diff the changes, present them to the user, then destroy the container. The original is never touched directly.</p><p><strong>3. Run as a non-root user.</strong> Always run the container with a non-privileged user (<code>--user 1000:1000</code>). This prevents model-generated code from installing kernel modules, modifying network routes, or writing to system directories even if a container escape is attempted.</p><div><hr></div><h3>From Subprocess to Docker: Upgrading Your Part 2 Harness</h3><p>In Part 2, the <code>run_bash</code> function was a bare <code>subprocess.run()</code>. Here is what the upgrade looks like &#8212; a drop-in Docker replacement that applies all the isolation rules above:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;b2228c33-1482-4e01-b2b6-f88fb9022ddf&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import subprocess

def run_in_docker_sandbox(command: str, workspace_path: str) -&gt; tuple[str, bool]:
    """
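    Execute a command inside an ephemeral Docker container.
    Drop-in replacement for the bare subprocess run_bash from Part 2.

    workspace_path must be an absolute host path; Docker treats a bare
    name passed to -v as a named volume rather than a bind mount.
    Returns (output, is_error).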
    """
    docker_cmd = [
        "docker", "run",
        "--rm",                                      # destroy container on exit
        "--network", "none",                         # no network access by default
        "--memory", "512m",                          # hard memory cap
        "--cpus", "1.0",                             # hard CPU cap
        "--read-only",                               # root filesystem is read-only
        "--tmpfs", "/tmp:size=100m",                 # writable scratch space only
        "-v", f"{workspace_path}:/workspace:rw",    # mount only the workspace
        "-w", "/workspace",                          # set working directory
        "--user", "1000:1000",                       # non-root user
        "python:3.11-slim",
        # Inner timeout wraps the whole shell command, so pipelines and
        # `a &amp;&amp; b` chains are bounded too, not just their first word.
        "timeout", "30", "bash", "-c", command,
    ]

    try:
        result = subprocess.run(
            docker_cmd,
            capture_output=True,
            text=True,
            timeout=35,    # outer timeout slightly longer than inner
        )
        output = (result.stdout + result.stderr).strip()
        return output or "(no output)", result.returncode != 0
    except subprocess.TimeoutExpired:
        return "Error: sandbox timed out", True
    except Exception as e:
        return f"Error: {e}", True</code></pre></div><p>Notice what each flag does:</p><ul><li><p><code>--rm</code> ensures the container is destroyed after each tool call, no state leaks between runs</p></li><li><p><code>--network none</code> cuts off all external network access entirely</p></li><li><p><code>--read-only</code> + <code>--tmpfs</code> means the agent can only write to <code>/tmp</code> and the mounted workspace, nothing else on the filesystem</p></li><li><p><code>--user 1000:1000</code> ensures model-generated code runs without root privileges</p></li><li><p>The double timeout (inner <code>timeout 30</code> + outer <code>timeout=35</code>) guarantees the harness is never blocked by a runaway process</p></li></ul><p>To plug this into the <code>AgentHarness</code> from Part 2, replace the <code>run_bash</code> function in <code>TOOL_REGISTRY</code> with a lambda that calls <code>run_in_docker_sandbox</code> with your workspace path.</p><div><hr></div><h3>Ready-Made Sandboxing Libraries</h3><p>If you do not want to manage Docker configuration yourself, several open-source libraries handle the heavy lifting. Here are three worth knowing, each covering a different point on the isolation spectrum:</p><h4>E2B (<code>e2b-code-interpreter</code>)</h4><p><a href="https://github.com/e2b-dev/code-interpreter">E2B</a> is the most production-ready option for AI agent sandboxing. Under the hood it uses Firecracker microVMs &#8212; each sandbox gets its own kernel. The Python SDK makes it a near drop-in replacement for the Docker function above:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;c533eaf6-4387-41ad-97f5-37391b65ae75&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from e2b_code_interpreter import Sandbox

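def run_in_e2b_sandbox(code: str) -&gt; tuple[str, bool]:
    # run_code executes Python source in the sandbox, not a shell command.
    with Sandbox() as sandbox:
        result = sandbox.run_code(code)
        output = "\n".join(str(o) for o in result.logs.stdout)
        error  = "\n".join(str(e) for e in result.logs.stderr)
        if result.error:
            # Surface the exception itself; stderr may be empty when code raises.
            error = f"{error}\n{result.error.name}: {result.error.value}".strip()
        is_error = bool(result.error)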
        return (error if is_error else output) or "(no output)", is_error</code></pre></div><p>Install with <code>pip install e2b-code-interpreter</code>. Requires an E2B API key. Best choice if you are building a cloud-hosted agent and want hardware-level isolation without managing infrastructure.</p><h4>RestrictedPython</h4><p><a href="https://github.com/zopefoundation/RestrictedPython">RestrictedPython</a> takes a different approach &#8212; rather than isolating at the OS level, it restricts what Python code is <em>allowed to do</em> at parse time. You define exactly which builtins, imports, and operations are permitted before the code ever runs.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;da368e1c-79e9-42a9-9ee8-e83e4fbe34a8&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">from RestrictedPython import compile_restricted, safe_globals

def run_restricted_python(code: str) -&gt; tuple[str, bool]:
    try:
        byte_code = compile_restricted(code, "&lt;string&gt;", "exec")
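        local_vars = {}
        # Copy safe_globals so one run cannot leave state behind for the next.
        exec(byte_code, dict(safe_globals), local_vars)
        # Convention: the submitted code assigns its answer to a variable named `result`.
        return str(local_vars.get("result", "(no result)")), False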
    except Exception as e:
        return f"{type(e).__name__}: {e}", True</code></pre></div><p>Install with <code>pip install RestrictedPython</code>. No containers needed &#8212; useful when Docker is overkill and you only need to prevent agents from importing <code>os</code>, <code>subprocess</code>, or <code>sys</code>. Not a replacement for OS-level isolation for untrusted code, but a solid lightweight layer for constrained use cases.</p><h4>Monty (<code>pydantic-monty</code>)</h4><p><a href="https://github.com/pydantic/monty">Monty</a> is the most interesting new entrant in this space &#8212; a minimal, secure Python interpreter written in Rust by the Pydantic team, designed specifically for running LLM-generated code. It starts in under a microsecond, requires no containers, and completely blocks access to the host filesystem, environment variables, and network by default. You control exactly which host functions the agent can call.</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;8e00a11b-679b-41ec-a612-2ed7df881ccb&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import pydantic_monty

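def run_in_monty(code: str) -&gt; tuple[str, bool]:
    # Sketch only: Monty is experimental, so check these names (Monty, run)
    # against the version you have installed before relying on them.
    try:
        result = pydantic_monty.Monty(code).run()  # value of the last expression
        return "(no output)" if result is None else str(result), False
    except Exception as e:  # broad on purpose; exception class names may change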
        return str(e), True</code></pre></div><p>Install with <code>pip install pydantic-monty</code>. The tradeoff is intentional scope &#8212; Monty runs a subset of Python and does not support third-party libraries like NumPy or Pydantic itself. It is designed for agents that express logic in pure Python rather than calling into ecosystem packages. Worth watching closely: Pydantic plans to use it as the foundation for code execution in PydanticAI, and it is still marked experimental at the time of writing.</p><h4>Deno (for JavaScript/TypeScript agents)</h4><p>If your agent executes JavaScript or TypeScript, <a href="https://deno.com/">Deno</a> has a built-in permission model that makes sandboxing a flag, not an architecture decision:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;d7830b51-0f03-408d-93c8-554cdcb34d90&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash"># Agent-generated script runs with explicit, minimal permissions only
deno run --allow-read=/workspace --allow-write=/workspace/output --no-prompt agent_script.ts</code></pre></div><p>Deno denies all filesystem, network, and environment access by default. You opt-in to exactly what the agent needs. Install with <code>curl -fsSL https://deno.land/install.sh | sh</code>. Best fit for TypeScript-first agent stacks. </p><div><hr></div><h3>Network Egress: The Most Overlooked Attack Surface</h3><p>Even with a containerized sandbox, unlimited internet access is a liability. A prompt-injected agent can participate in DDoS attacks, exfiltrate data, or, as ZombAIs demonstrated, phone home to a C2 server.</p><p>Two practical controls:</p><p><strong>Allowlist over blocklist.</strong> Rather than trying to block malicious domains, restrict outbound traffic to a known-good set: <code>pypi.org</code>, <code>npmjs.com</code>, <code>github.com</code>, your internal registry. Everything else is denied by default. This is far more defensible than maintaining a blocklist.</p><p><strong>Block cloud metadata endpoints.</strong> If your sandbox runs in a cloud environment (AWS, GCP, Azure), the instance metadata service at <code>169.254.169.254</code> is a prime target for SSRF attacks &#8212; a compromised agent can use it to steal IAM credentials. Block this at the network level and enforce IMDSv2 (require token-authenticated requests) at the cloud provider level. This was an active exploitation target as recently as early 2025.</p><div><hr></div><h3>Process Guardrails: Bounding Cost and Blast Radius</h3><p>Even inside an isolated container, an agent can write an infinite loop or spawn thousands of subprocesses:</p><ul><li><p><strong>CPU and memory caps:</strong> The <code>--memory</code> and <code>--cpus</code> Docker flags are your first line. 
Set them conservatively for agent tasks: 512 MB of RAM and 1 CPU are sufficient for most code execution workloads.</p></li><li><p><strong>Process limit:</strong> Add <code>--pids-limit 100</code> to cap the number of processes the container can spawn. This stops fork bombs and runaway test runners.</p></li><li><p><strong>Disk quota:</strong> Docker&#8217;s <code>--storage-opt size=1G</code> (with a supported storage driver) caps the container&#8217;s own writable layer. It does not limit a bind-mounted workspace, so cap that separately, for example with a filesystem quota or a size-limited volume on the host.</p></li></ul><div><hr></div><h3>Human-in-the-Loop as a Security Gate</h3><p>Some commands should never run without explicit approval, regardless of sandbox isolation. The harness is the right place to intercept them:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;01b33a57-9dd3-4e06-b496-c2bf2d42aafd&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import re

# Patterns are matched with re.search, so they hit anywhere in the command line.
REQUIRES_APPROVAL = [
    r"curl\s+.*\|\s*(bash|sh)",   # curl pipe to shell
    r"wget\s+.*\|\s*(bash|sh)",   # wget pipe to shell
    r"git\s+push",                # pushing code
    r"pip\s+install\b.*\s(--index-url|--extra-index-url|-i)\b",   # non-standard index
]

def requires_human_approval(command: str) -&gt; bool:
    """Return True if the command matches any pattern that needs sign-off."""
    return any(re.search(pattern, command) for pattern in REQUIRES_APPROVAL)</code></pre></div><p>Wire this into the pre-execution hook from Part 2. If <code>requires_human_approval()</code> returns <code>True</code>, the harness pauses and prints the command for the user to approve or deny before the sandbox sees it. The agent never knows the gate exists &#8212; it just receives a response or a timeout.</p><div><hr></div><h3>What&#8217;s Next?</h3><p>A secure, sandboxed harness is now capable of running code safely over many iterations. But as tasks grow in complexity &#8212; refactoring a large codebase, researching a topic across dozens of sources, debugging a multi-file system &#8212; the agent will start to hit a different wall: <strong>context window bloat.</strong></p><p>After 50 tool calls and thousands of lines of output, the conversation history becomes unwieldy, expensive, and eventually truncated. The agent starts to &#8220;forget&#8221; earlier decisions.</p><p>In <strong>Part 4</strong> of this series, we will tackle <strong>Managing the Long-Running Agent</strong>: context compaction, dynamic history compression, and strategies for keeping an agent coherent across hours of execution without burning through your entire token budget in the first 20 minutes.</p>]]></content:encoded></item><item><title><![CDATA[Inside the Machine: The Anatomy of an Agent Harness]]></title><description><![CDATA[In Part 1, I argued that the harness is more important than the model, that the teams shipping reliable autonomous agents win by obsessing over the infrastructure around the LLM, not just the weights inside it.]]></description><link>https://mlnotes.substack.com/p/inside-the-machine-the-anatomy-of</link><guid isPermaLink="false">https://mlnotes.substack.com/p/inside-the-machine-the-anatomy-of</guid><dc:creator><![CDATA[Mehdi Allahyari]]></dc:creator><pubDate>Sun, 07 Jun 2026 13:01:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!STnE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faee80e9a-77f5-4160-87c4-2d4c54ddae08_1376x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!STnE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faee80e9a-77f5-4160-87c4-2d4c54ddae08_1376x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!STnE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faee80e9a-77f5-4160-87c4-2d4c54ddae08_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!STnE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faee80e9a-77f5-4160-87c4-2d4c54ddae08_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!STnE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faee80e9a-77f5-4160-87c4-2d4c54ddae08_1376x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!STnE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faee80e9a-77f5-4160-87c4-2d4c54ddae08_1376x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!STnE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faee80e9a-77f5-4160-87c4-2d4c54ddae08_1376x768.jpeg" width="1376" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aee80e9a-77f5-4160-87c4-2d4c54ddae08_1376x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1376,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:969896,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/200945841?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faee80e9a-77f5-4160-87c4-2d4c54ddae08_1376x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!STnE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faee80e9a-77f5-4160-87c4-2d4c54ddae08_1376x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!STnE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faee80e9a-77f5-4160-87c4-2d4c54ddae08_1376x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!STnE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faee80e9a-77f5-4160-87c4-2d4c54ddae08_1376x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!STnE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faee80e9a-77f5-4160-87c4-2d4c54ddae08_1376x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>In Part 1, I argued that the harness is more important than the model, that the teams shipping reliable autonomous agents win by obsessing over the infrastructure <em>around</em> the LLM, not just the weights inside it.</p><p>That was the &#8220;why.&#8221; This post is the &#8220;what&#8221; and &#8220;how.&#8221;</p><p>We will deconstruct the anatomy of a production-grade agent harness, map out its execution lifecycle, and walk through a complete, working Python implementation you can run today. By the end, you will have a blueprint you can extend into your own projects.</p><div><hr></div><h3>The 5 Architectural Pillars of a Harness</h3><p>An agent harness is more than an infinite <code>while</code> loop calling an API. 
In a production environment, a resilient harness consists of five distinct components working in unison:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!h4Z1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde0bb708-6b9c-404e-b64d-cebe95b60830_2703x786.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!h4Z1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde0bb708-6b9c-404e-b64d-cebe95b60830_2703x786.png 424w, https://substackcdn.com/image/fetch/$s_!h4Z1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde0bb708-6b9c-404e-b64d-cebe95b60830_2703x786.png 848w, https://substackcdn.com/image/fetch/$s_!h4Z1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde0bb708-6b9c-404e-b64d-cebe95b60830_2703x786.png 1272w, https://substackcdn.com/image/fetch/$s_!h4Z1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde0bb708-6b9c-404e-b64d-cebe95b60830_2703x786.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!h4Z1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde0bb708-6b9c-404e-b64d-cebe95b60830_2703x786.png" width="1456" height="423" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/de0bb708-6b9c-404e-b64d-cebe95b60830_2703x786.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:423,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:167984,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/200945841?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde0bb708-6b9c-404e-b64d-cebe95b60830_2703x786.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!h4Z1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde0bb708-6b9c-404e-b64d-cebe95b60830_2703x786.png 424w, https://substackcdn.com/image/fetch/$s_!h4Z1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde0bb708-6b9c-404e-b64d-cebe95b60830_2703x786.png 848w, https://substackcdn.com/image/fetch/$s_!h4Z1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde0bb708-6b9c-404e-b64d-cebe95b60830_2703x786.png 1272w, https://substackcdn.com/image/fetch/$s_!h4Z1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde0bb708-6b9c-404e-b64d-cebe95b60830_2703x786.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 
20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h4>1. The State Controller &amp; Continuation Loop</h4><p>An agent operating in the wild cannot be ephemeral. If a network request drops or a rate limit is hit, the agent must resume exactly where it left off. The State Controller maintains a durable, append-only conversation history, every prompt sent, every tool result received, so that each new step picks up with full context intact.</p><h4>2. The Tool Registry and Schema Manager</h4><p>Models do not natively understand APIs, file systems, or shells. The Tool Registry translates between model intent and real-world system interfaces by maintaining a catalogue of <strong>declarative tool schemas</strong>, structured definitions the model reads to understand what a tool does and what parameters it expects. 
The registry also validates inputs <em>before</em> execution, rejecting malformed calls at the harness level and saving expensive API roundtrips.</p><h4>3. The Execution Sandbox</h4><p>A raw terminal is a liability. A robust harness isolates every tool call: commands run inside constrained environments (Docker containers, microVMs, or at minimum a subprocess with strict timeouts and blocked patterns) rather than directly on your host. Our implementation uses a subprocess with a blocklist and a timeout; Part 3 covers what a production sandbox actually looks like.</p><h4>4. Deterministic Middleware &amp; Lifecycle Hooks</h4><p>Hooks intercept every tool call before and after execution. This is the same pattern HTTP frameworks use for authentication and logging middleware. <strong>Pre-execution hooks</strong> can block or rewrite a call (for example, a linter catching bad syntax before a compile cycle is wasted). <strong>Post-execution hooks</strong> can run tests and return pass/fail as feedback.</p><h4>5. Boundary Controls &amp; Guardrails</h4><p>The harness is the last line of defense against runaway costs and infinite loops. Hard iteration caps guarantee a termination point regardless of what the model decides internally. 
Token budgets and timeout limits enforce financial boundaries.</p><div><hr></div><h3>The Lifecycle of a Single Step</h3><p>Before reading the code, it helps to see how these five pillars interact during a single execution step:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vCSD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3dbea14-7f4b-442c-9446-cb21f3c9173e_1153x2286.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vCSD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3dbea14-7f4b-442c-9446-cb21f3c9173e_1153x2286.png 424w, https://substackcdn.com/image/fetch/$s_!vCSD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3dbea14-7f4b-442c-9446-cb21f3c9173e_1153x2286.png 848w, https://substackcdn.com/image/fetch/$s_!vCSD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3dbea14-7f4b-442c-9446-cb21f3c9173e_1153x2286.png 1272w, https://substackcdn.com/image/fetch/$s_!vCSD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3dbea14-7f4b-442c-9446-cb21f3c9173e_1153x2286.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vCSD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3dbea14-7f4b-442c-9446-cb21f3c9173e_1153x2286.png" width="1153" height="2286" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c3dbea14-7f4b-442c-9446-cb21f3c9173e_1153x2286.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2286,&quot;width&quot;:1153,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:193870,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/200945841?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3dbea14-7f4b-442c-9446-cb21f3c9173e_1153x2286.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vCSD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3dbea14-7f4b-442c-9446-cb21f3c9173e_1153x2286.png 424w, https://substackcdn.com/image/fetch/$s_!vCSD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3dbea14-7f4b-442c-9446-cb21f3c9173e_1153x2286.png 848w, https://substackcdn.com/image/fetch/$s_!vCSD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3dbea14-7f4b-442c-9446-cb21f3c9173e_1153x2286.png 1272w, https://substackcdn.com/image/fetch/$s_!vCSD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3dbea14-7f4b-442c-9446-cb21f3c9173e_1153x2286.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The critical insight is in the bottom path: errors are <strong>not</strong> crashes. The harness intercepts them, formats them as structured feedback, appends them to the conversation history, and loops, giving the model the information it needs to self-correct on the next step.</p><div><hr></div><h3>The Complete Implementation</h3><p>Here is a fully working, minimal agent harness in Python. It uses the Anthropic API with real tool execution, a proper tool registry, lifecycle hooks, and boundary controls, but you can replace it with any other LLM provider or even your own local model. 
The only dependency is the Anthropic SDK, installed with <code>pip install anthropic</code>.</p><p>Before running, set your API key:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;203a6912-57b6-49c9-b00e-aaced0132b14&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash">export ANTHROPIC_API_KEY=your-key-here</code></pre></div><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;e6d2cb54-b5c4-4990-81ad-2a43d78a48b7&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">#!/usr/bin/env python3
"""
Minimal Agent Harness &#8212; Educational Implementation

Demonstrates the 5 architectural pillars of a production harness:
  1. State Controller &amp; Continuation Loop
  2. Tool Registry &amp; Schema Manager
  3. Execution Sandbox (subprocess with blocklist + timeout)
  4. Middleware &amp; Lifecycle Hooks (pre/post interceptors)
  5. Boundary Controls &amp; Guardrails (iteration cap, timeout)
"""

import json
import subprocess
from pathlib import Path
import anthropic


# &#9472;&#9472;&#9472; PILLAR 3: EXECUTION SANDBOX &#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;
# These are the actual tool implementations. In a production harness these
# would run inside Docker or a microVM. Here we use subprocess with a blocklist
# and a hard timeout as a minimal safety layer.

BLOCKED_PATTERNS = ["rm -rf", "sudo", ":(){ :|:&amp; };:", ":(){:|:&amp;};:", "&gt; /dev/", "mkfs", "dd if="]

def read_file(path: str) -&gt; str:
    try:
        return Path(path).read_text()
    except FileNotFoundError:
        return f"Error: file not found: {path}"
    except Exception as e:
        return f"Error: {e}"

def write_file(path: str, content: str) -&gt; str:
    try:
        Path(path).parent.mkdir(parents=True, exist_ok=True)
        Path(path).write_text(content)
        return f"Wrote {len(content)} characters to {path}"
    except Exception as e:
        return f"Error: {e}"

def run_bash(command: str) -&gt; str:
    for pattern in BLOCKED_PATTERNS:
        if pattern in command:
            return f"Blocked: command contains disallowed pattern '{pattern}'"
    try:
        result = subprocess.run(
            command,
            shell=True,
            capture_output=True,
            text=True,
            timeout=30,   # Hard cap: no command runs forever
        )
        return (result.stdout + result.stderr).strip() or "(no output)"
    except subprocess.TimeoutExpired:
        return "Error: command timed out after 30 seconds"
    except Exception as e:
        return f"Error: {e}"


# &#9472;&#9472;&#9472; PILLAR 2: TOOL REGISTRY &amp; SCHEMA MANAGER &#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;
# Maps tool names to implementations and exposes Anthropic-compatible schemas.
# The model reads these schemas to understand what tools exist and how to call them.

TOOL_REGISTRY = {
    "read_file":  read_file,
    "write_file": write_file,
    "run_bash":   run_bash,
}

TOOL_SCHEMAS = [
    {
        "name": "read_file",
        "description": "Read the full contents of a file at the given path.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path": {"type": "string", "description": "Relative or absolute file path"}
            },
            "required": ["path"],
        },
    },
    {
        "name": "write_file",
        "description": "Write text content to a file, creating it if it does not exist.",
        "input_schema": {
            "type": "object",
            "properties": {
                "path":    {"type": "string", "description": "File path to write to"},
                "content": {"type": "string", "description": "Text content to write"},
            },
            "required": ["path", "content"],
        },
    },
    {
        "name": "run_bash",
        "description": (
            "Execute a shell command and return its output. "
            "Use for running scripts, installing packages, or checking system state. "
            "Destructive commands are blocked."
        ),
        "input_schema": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "description": "The bash command to run"}
            },
            "required": ["command"],
        },
    },
]
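

# Sketch (not part of the original listing): validating tool input before dispatch.
# The registry is described as rejecting malformed calls before execution. One
# minimal way to do that is shown here, assuming only "required" keys and
# "string" types need checking. validate_tool_input is a hypothetical helper
# name, and a real harness would more likely use a full JSON Schema validator.

def validate_tool_input(schema, args):
    """Return a list of problems found in args; an empty list means valid."""
    spec = schema.get("input_schema", {})
    properties = spec.get("properties", {})
    problems = []
    for key in spec.get("required", []):
        if key not in args:
            problems.append(f"missing required parameter '{key}'")
    for key, value in args.items():
        if key not in properties:
            problems.append(f"unexpected parameter '{key}'")
        elif properties[key].get("type") == "string" and not isinstance(value, str):
            problems.append(f"parameter '{key}' must be a string")
    return problems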


# &#9472;&#9472;&#9472; THE AGENT HARNESS &#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;

class AgentHarness:
    """
    A minimal, self-correcting agent harness.

    All five pillars are implemented:
      1. history[] carries conversation state across iterations (in memory only)
      2. TOOL_SCHEMAS + TOOL_REGISTRY handle registration and dispatch
      3. _dispatch_tool() runs tools in the sandbox
      4. _handle_tool_calls() wraps each call with pre/post hooks
      5. max_iterations enforces a hard termination boundary
    """

    def __init__(self, system_prompt: str, max_iterations: int = 10):
        self.client = anthropic.Anthropic()
        self.system_prompt = system_prompt
        self.max_iterations = max_iterations

        # PILLAR 1: conversation state lives here (in memory; persist it for real durability)
        self.history: list[dict] = []
        self.iteration = 0

    # &#9472;&#9472; PUBLIC INTERFACE &#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;

    def run(self, task: str) -&gt; str:
        """Run the agent on a task and return its final text response."""
        print(f"\n[Harness] Task: {task[:80]}...")
        self.history.append({"role": "user", "content": task})

        while self.iteration &lt; self.max_iterations:
            self.iteration += 1
            print(f"\n[Harness] &#9472;&#9472; Step {self.iteration}/{self.max_iterations} &#9472;&#9472;")

            response = self.client.messages.create(
                model="claude-opus-4-7",
                max_tokens=4096,
                system=self.system_prompt,
                tools=TOOL_SCHEMAS,
                messages=self.history,
            )

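            if response.stop_reason == "end_turn":
                print("[Harness] Goal reached.")
                return self._extract_text(response)

            if response.stop_reason == "tool_use":
                # PILLAR 4: middleware intercepts every tool call
                self._handle_tool_calls(response)
                continue

            # Any other stop reason (e.g. "max_tokens") adds nothing to history,
            # so looping would resend the identical request. Stop instead.
            print(f"[Harness] Unexpected stop_reason: {response.stop_reason}")
            return self._extract_text(response)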

        # PILLAR 5: hard boundary &#8212; guaranteed termination
        print("[Harness] Iteration limit reached. Halting.")
        return "Stopped: maximum iteration limit reached."

    # &#9472;&#9472; PRIVATE METHODS &#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;

    def _handle_tool_calls(self, response) -&gt; None:
        """Execute all tool calls in a response and feed results back into history."""

        # Save the full assistant message (including tool_use blocks) to history
        self.history.append({"role": "assistant", "content": response.content})

        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue

            # PRE-EXECUTION HOOK: log, validate, or block before anything runs
            print(f"[Hook:pre]  &#8594; {block.name}({json.dumps(block.input)[:100]})")

            result, is_error = self._dispatch_tool(block.name, block.input)

            # POST-EXECUTION HOOK: log outcome; extend here for test runners, linters, etc.
            status = "error" if is_error else "ok"
            print(f"[Hook:post] &#8592; [{status}] {result[:100]}")

            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": result,
                "is_error": is_error,   # tells the model this is an error to recover from
            })

        # Append results as a user message &#8212; this closes the feedback loop.
        # The model sees tool outputs (or errors) on the next iteration and
        # decides whether to retry, adjust, or proceed.
        self.history.append({"role": "user", "content": tool_results})

    def _dispatch_tool(self, name: str, args: dict) -&gt; tuple[str, bool]:
        """Look up a tool from the registry and call it. Returns (output, is_error)."""
        tool_fn = TOOL_REGISTRY.get(name)
        if tool_fn is None:
            return f"Unknown tool: '{name}'", True
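        try:
            output = str(tool_fn(**args))
        except Exception as e:
            # Exceptions become structured feedback, not application crashes
            return f"{type(e).__name__}: {e}", True
        # The tools above report failures as "Error: ..." or "Blocked: ..."
        # strings instead of raising, so flag those results as errors too.
        return output, output.startswith(("Error:", "Blocked:"))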

    @staticmethod
    def _extract_text(response) -&gt; str:
        for block in response.content:
            if hasattr(block, "text"):
                return block.text
        return ""
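

# Sketch (not part of the original listing): pluggable pre-execution hooks.
# The harness above only logs before each tool call. One possible way to let
# hooks block or rewrite a call is a simple chain, assuming each hook takes
# (name, args) and either returns the (possibly rewritten) args or raises
# ValueError to block. run_pre_hooks and block_env_reads are hypothetical names.

def run_pre_hooks(hooks, name, args):
    """Run hooks in order. Returns (args, None) or (args, reason) if blocked."""
    for hook in hooks:
        try:
            args = hook(name, args)
        except ValueError as e:
            return args, f"Blocked by {hook.__name__}: {e}"
    return args, None

def block_env_reads(name, args):
    """Example hook: refuse shell commands that mention .env files."""
    if name == "run_bash" and ".env" in args.get("command", ""):
        raise ValueError("command touches a .env file")
    return args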


# &#9472;&#9472;&#9472; ENTRYPOINT &#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;&#9472;

SYSTEM_PROMPT = """\
You are a coding assistant with access to a filesystem and a bash shell.
Complete tasks step by step. Always verify your work by running it.
If something fails, read the error carefully and fix it before moving on.\
"""
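
# Sketch (not part of the original listing): a token budget for Pillar 5.
# The harness above caps iterations only. A spending cap could sit beside it,
# assuming the caller passes in the token counts the API reports for each
# call (the Anthropic response exposes them as response.usage.input_tokens
# and response.usage.output_tokens). TokenBudget is a hypothetical name.

class TokenBudget:
    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def charge(self, input_tokens, output_tokens):
        """Record one model call; return the tokens left (0 means stop)."""
        self.used += input_tokens + output_tokens
        return max(self.limit - self.used, 0)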

if __name__ == "__main__":
    harness = AgentHarness(system_prompt=SYSTEM_PROMPT, max_iterations=10)

    result = harness.run(
        "Write a Python function that returns the nth Fibonacci number, "
        "save it to fibonacci.py, then run it with n=10 to verify it works."
    )

    print(f"\n[Final Answer]\n{result}")</code></pre></div><h3>What Makes This a Harness, and Not Just a Wrapper?</h3><p>Looking at the code above, three behaviors stand out that elevate it beyond a simple API call:</p><p><strong>1. The application never crashes on tool failure.</strong></p><p>The tools <code>read_file</code>, <code>write_file</code>, and <code>run_bash</code> return their own failures as strings, and any exception that still escapes is caught inside <code>_dispatch_tool</code> and returned as a structured string with <code>is_error=True</code>. Either way, the model receives feedback, not a Python traceback that kills the process. If the agent writes broken code and <code>run_bash</code> returns a <code>SyntaxError</code>, the harness feeds that error back and the model can fix it on the next step without human intervention.</p><p><strong>2. Errors are formatted as instructions.</strong></p><p>Setting <code>is_error=True</code> in the tool result is not cosmetic. It tells the model, through the Anthropic API, that the tool call failed, so the model can analyze what went wrong and decide how to recover. This is the self-correction loop made concrete: the model is not told <em>how</em> to fix the error, but it is given the error precisely, and its next action should be a correction.</p><p><strong>3. Termination is structurally guaranteed.</strong></p><p>The <code>while</code> loop has a hard ceiling at <code>max_iterations</code>. No matter what the model does internally (retrying the same broken approach five times, generating new subtasks, calling tools in unexpected orders), the harness guarantees a termination point. A guaranteed stopping point is a prerequisite for running agents unsupervised.</p><div><hr></div><h3>Extending the Blueprint</h3><p>The implementation above is intentionally minimal. 
Here is where you would add more production pillars:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HSAD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea5d912-439c-4726-96dd-575c6dc559d8_1408x414.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HSAD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea5d912-439c-4726-96dd-575c6dc559d8_1408x414.png 424w, https://substackcdn.com/image/fetch/$s_!HSAD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea5d912-439c-4726-96dd-575c6dc559d8_1408x414.png 848w, https://substackcdn.com/image/fetch/$s_!HSAD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea5d912-439c-4726-96dd-575c6dc559d8_1408x414.png 1272w, https://substackcdn.com/image/fetch/$s_!HSAD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea5d912-439c-4726-96dd-575c6dc559d8_1408x414.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HSAD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea5d912-439c-4726-96dd-575c6dc559d8_1408x414.png" width="1408" height="414" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1ea5d912-439c-4726-96dd-575c6dc559d8_1408x414.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:414,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:106604,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/200945841?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea5d912-439c-4726-96dd-575c6dc559d8_1408x414.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HSAD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea5d912-439c-4726-96dd-575c6dc559d8_1408x414.png 424w, https://substackcdn.com/image/fetch/$s_!HSAD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea5d912-439c-4726-96dd-575c6dc559d8_1408x414.png 848w, https://substackcdn.com/image/fetch/$s_!HSAD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea5d912-439c-4726-96dd-575c6dc559d8_1408x414.png 1272w, https://substackcdn.com/image/fetch/$s_!HSAD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ea5d912-439c-4726-96dd-575c6dc559d8_1408x414.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><h3>What&#8217;s Next?</h3><p>The sandbox in our implementation, subprocess calls with a blocklist and a timeout, is a good starting point for learning, but it is not what you would ship. A production harness needs genuine isolation: the agent&#8217;s code must not be able to read your SSH keys, saturate your CPU, or exfiltrate environment variables.</p><p>In <strong>Part 3</strong> of this series, we will tackle <strong>Sandboxing &amp; Execution Security in Harness Engineering</strong>. 
We will look at how to wrap agent tool calls in Docker containers, handle filesystem mounts safely, and build a security layer that lets your agent run with real permissions without putting your host at risk.</p>]]></content:encoded></item><item><title><![CDATA[The Harness Beats the Model: Demystifying Agent Harness Engineering]]></title><description><![CDATA[Over the past couple of years, I have spent most of my working hours building agent systems in production, multi-agent pipelines for enterprise clients, agentic platforms for startups, and educational content for a community of over 100,000 people learning to work with LLMs.]]></description><link>https://mlnotes.substack.com/p/the-harness-beats-the-model-demystifying</link><guid isPermaLink="false">https://mlnotes.substack.com/p/the-harness-beats-the-model-demystifying</guid><dc:creator><![CDATA[Mehdi Allahyari]]></dc:creator><pubDate>Mon, 01 Jun 2026 13:03:28 GMT</pubDate><enclosure
url="https://substackcdn.com/image/fetch/$s_!TgTw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb770955-d791-43c0-91c6-24281c3d7bf0_1672x941.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TgTw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb770955-d791-43c0-91c6-24281c3d7bf0_1672x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TgTw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb770955-d791-43c0-91c6-24281c3d7bf0_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!TgTw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb770955-d791-43c0-91c6-24281c3d7bf0_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!TgTw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb770955-d791-43c0-91c6-24281c3d7bf0_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!TgTw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb770955-d791-43c0-91c6-24281c3d7bf0_1672x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TgTw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb770955-d791-43c0-91c6-24281c3d7bf0_1672x941.png" width="1456" height="819" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb770955-d791-43c0-91c6-24281c3d7bf0_1672x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:819,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2562975,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/200045389?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb770955-d791-43c0-91c6-24281c3d7bf0_1672x941.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TgTw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb770955-d791-43c0-91c6-24281c3d7bf0_1672x941.png 424w, https://substackcdn.com/image/fetch/$s_!TgTw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb770955-d791-43c0-91c6-24281c3d7bf0_1672x941.png 848w, https://substackcdn.com/image/fetch/$s_!TgTw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb770955-d791-43c0-91c6-24281c3d7bf0_1672x941.png 1272w, https://substackcdn.com/image/fetch/$s_!TgTw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb770955-d791-43c0-91c6-24281c3d7bf0_1672x941.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Over the past couple of years, I have spent most of my working hours building agent systems in production, multi-agent pipelines for enterprise clients, agentic platforms for startups, and educational content for a community of over 100,000 people learning to work with LLMs. Across all of that work, one pattern kept showing up: the teams that shipped reliable, autonomous agents were not the ones with the biggest models. They were the ones who had thought hardest about the <em>infrastructure wrapped around</em> those models.</p><p>That observation is the starting point for this series.
Over the next few posts, I want to share what I have learned from the ground up about <strong>agent harness engineering</strong>, the discipline of building the execution environment, safety rails, and feedback loops that turn a raw LLM into a system you can actually trust. Each post will move from concepts to concrete code, so whether you are just getting started with agents or already shipping them to production, there will be something here for you.</p><p>This is Part 1. Let&#8217;s start with the most important question: what even <em>is</em> a harness, and why does it matter more than the model inside it?</p><div><hr></div><p>If you have been building with LLMs over the last couple of years, you have likely participated in two distinct waves of optimization:</p><ol><li><p><strong>Prompt Engineering:</strong> Writing carefully crafted instructions, system prompts, and few-shot examples to make a model output the correct format.</p></li><li><p><strong>Context Engineering:</strong> Building retrieval-augmented generation (RAG) pipelines, managing vector databases, and injecting relevant context at the exact right time.</p></li></ol><p>But as we pivot toward fully autonomous AI systems, agents that can write code, manage filesystems, and execute workflows over several hours, we are hitting the limits of prompts and context alone.</p><p>Today, a third wave has taken center stage: <strong>Harness Engineering</strong>.</p><p>The core realization of this era is simple: <strong>An LLM is not an agent.</strong> An LLM is a reasoning engine. To make that engine do useful work safely, repeatably, and autonomously, we must surround it with software.</p><p>In the developer community, a new rule of thumb has emerged: <em><strong>&#8220;If you&#8217;re not the model, you&#8217;re the harness.&#8221;</strong></em></p><p>In this post, we will demystify what an agent harness actually is, why it holds more leverage than the underlying model weights, and how you should think about building one.</p><div><hr></div><h3>The Metaphor: The Engine vs.
The Car</h3><p>To understand why a harness is necessary, consider a high-performance sports car engine.</p><p>If you bolt a 600-horsepower engine to a wooden pallet in your garage and turn it on, you do not have a vehicle. You have a loud, vibrating hazard that will quickly burn through its fuel or damage itself. To make that engine useful, you need a chassis, a steering column, brakes, a dashboard, a fuel pump, and a seatbelt.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1IGV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6433934c-3cc8-4108-98aa-61a474beaefb_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1IGV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6433934c-3cc8-4108-98aa-61a474beaefb_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!1IGV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6433934c-3cc8-4108-98aa-61a474beaefb_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!1IGV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6433934c-3cc8-4108-98aa-61a474beaefb_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!1IGV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6433934c-3cc8-4108-98aa-61a474beaefb_1408x768.jpeg 1456w" sizes="100vw"><img
src="https://substackcdn.com/image/fetch/$s_!1IGV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6433934c-3cc8-4108-98aa-61a474beaefb_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6433934c-3cc8-4108-98aa-61a474beaefb_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:919353,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/200045389?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6433934c-3cc8-4108-98aa-61a474beaefb_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1IGV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6433934c-3cc8-4108-98aa-61a474beaefb_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!1IGV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6433934c-3cc8-4108-98aa-61a474beaefb_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!1IGV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6433934c-3cc8-4108-98aa-61a474beaefb_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!1IGV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6433934c-3cc8-4108-98aa-61a474beaefb_1408x768.jpeg 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>In the AI agent stack:</p><ul><li><p><strong>The LLM</strong> is the engine. It provides raw reasoning, language processing, and planning capabilities.</p></li><li><p><strong>The Harness</strong> is the car. It is every piece of code, configuration, filesystem access, state management, and execution logic wrapped around that model.</p></li></ul><p>Without the harness, a model cannot safely write a file, execute a test, or recover from a compile error.
The harness is what turns static, one-shot token generation into dynamic, continuous action.</p><div><hr></div><h3>What Lives Inside an Agent Harness?</h3><p>By definition, the harness is the entire software ecosystem built to constrain, guide, and support the LLM. If we look under the hood of modern agent runtimes (like Claude Code or LangChain&#8217;s Deep Agents), a standard harness typically provides several core services:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Om-I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af260b5-8064-4e49-a9c7-b162e7a3228d_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Om-I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af260b5-8064-4e49-a9c7-b162e7a3228d_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Om-I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af260b5-8064-4e49-a9c7-b162e7a3228d_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Om-I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af260b5-8064-4e49-a9c7-b162e7a3228d_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Om-I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af260b5-8064-4e49-a9c7-b162e7a3228d_1408x768.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Om-I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af260b5-8064-4e49-a9c7-b162e7a3228d_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3af260b5-8064-4e49-a9c7-b162e7a3228d_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1169270,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/200045389?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af260b5-8064-4e49-a9c7-b162e7a3228d_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Om-I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af260b5-8064-4e49-a9c7-b162e7a3228d_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!Om-I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af260b5-8064-4e49-a9c7-b162e7a3228d_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!Om-I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af260b5-8064-4e49-a9c7-b162e7a3228d_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!Om-I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3af260b5-8064-4e49-a9c7-b162e7a3228d_1408x768.jpeg 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><ol><li><p><strong>The Execution Loop:</strong> The basic state machine (<code>while active: run_step()</code>) that keeps the agent iterating toward its goal.</p></li><li><p><strong>Context Compaction:</strong> Programmatic logic that summarizes historical messages, archives oversized tool outputs, and manages the model&#8217;s memory limits over long-running sessions.</p></li><li><p><strong>The Sandbox:</strong> A secure, isolated runtime (like a Docker container or microVM) where the agent can run bash commands or execute code without compromising your local
machine.</p></li><li><p><strong>Lifecycle Hooks:</strong> Code that intercepts the agent&#8217;s actions before or after they execute (e.g., checking code with a linter before running a compiler).</p></li><li><p><strong>Human-in-the-Loop Gates:</strong> Permission layers that halt execution and ask a human to approve potentially destructive commands (e.g., deleting a folder or executing a database write).</p></li></ol><div><hr></div><h3>The Core Paradigm: Feedforward vs. Feedback</h3><p>A useful way to think about how a harness works is to divide its responsibilities into two major systems: <strong>Feedforward Controls</strong> (Guides) and <strong>Feedback Controls</strong> (Sensors).</p><h4>1. Feedforward: Setting the Rails Before the Model Acts</h4><p>Feedforward mechanisms set the model up for success before it generates a single token. These include:</p><ul><li><p><strong>System Prompt Assembly:</strong> Dynamically constructing the prompt based on the agent&#8217;s current workspace.</p></li><li><p><strong>Project-Level Rules:</strong> Injecting repository-specific guidelines (often read from files like <code>CLAUDE.md</code> or <code>AGENTS.md</code>) directly into the agent&#8217;s memory to prevent common style or architectural mistakes.</p></li></ul><h4>2. Feedback: Reacting After the Model Acts</h4><p>Feedback mechanisms are where the harness truly shines. Instead of forcing a human to read every error, the harness acts as a sensor.</p><ul><li><p><strong>The Self-Correction Loop:</strong> If the agent writes bad code, the harness attempts to compile it, catches the syntax error, and passes that error back to the model as a new message: <em>&#8220;Here is the compiler output; please fix this syntax error.&#8221;</em></p></li><li><p>Because the harness automates this &#8220;sensor loop&#8221;, the agent corrects its own mistakes in the background and surfaces to the human user only once a clean, verified solution is found.</p></li></ul><div><hr></div><h3>Harness vs.
Orchestration Framework: What&#8217;s the Difference?</h3><p>A common point of confusion is how a harness differs from existing orchestration libraries like LangChain or LangGraph.</p><ul><li><p><strong>An Orchestration Framework</strong> is a toolbox. It provides the building blocks (graph runtimes, token counters, schema definitions) needed to connect models to code.</p></li><li><p><strong>An Agent Harness</strong> is an opinionated, batteries-included application wrapper built <em>with</em> those blocks. It dictates exactly how files are read, how sandboxes are managed, how memory is pruned, and how users approve tasks.</p></li></ul><p>You use a framework to build a harness. <em>Your harness is the custom, end-to-end environment that handles your business logic.</em></p><div><hr></div><h3>The Real Leverage: The Harness Beats the Model</h3><p>When an agent fails a complex coding benchmark, developers are often tempted to blame the underlying model. The standard response is to wait for the next, larger model to be released.</p><p>However, the practical evidence points in a clear direction: <strong>tuning the harness is often more effective than upgrading the model.</strong></p><p>The SWE-bench leaderboard, a widely used benchmark for autonomous coding agents, illustrates this well. Teams that reached the top did not do it by swapping in a bigger model. They did it by obsessing over the harness: how errors were formatted before being fed back, how long contexts were pruned, how the agent was given permission to retry. The same model weights, run inside a smarter harness, produced dramatically better results.</p><p>This is the <strong>Harness Engineering Mindset</strong>. Every time an agent makes a mistake, do not just tweak your system prompt.
Instead, ask: <em>What sensor, sandbox rule, or execution hook could I add to my harness so the system prevents or self-corrects this mistake next time?</em></p><div><hr></div><h3>What&#8217;s Next?</h3><p>Now that we have established the &#8220;what&#8221; and &#8220;why&#8221; of the agent harness, it&#8217;s time to look at how these systems are built.</p><p>In <strong>Part 2</strong> of this series, we will zoom in and examine the <strong>Anatomy of a Modern Agent Harness</strong>. We will sketch out a minimal Python implementation of an agent loop, look at how to build a simple tool registry, and write code that captures execution errors to feed them back into the model.</p><p><em>Have you started building custom wrappers around your LLMs? What constraints or feedback loops have made the biggest difference in your projects? Let&#8217;s discuss in the comments below.</em></p>]]></content:encoded></item><item><title><![CDATA[Real-time AI captions for any video in your browser]]></title><description><![CDATA[I was watching a news broadcast being translated live on screen and had one thought: why isn&#8217;t this just built into the browser?]]></description><link>https://mlnotes.substack.com/p/real-time-ai-captions-for-any-video</link><guid isPermaLink="false">https://mlnotes.substack.com/p/real-time-ai-captions-for-any-video</guid><dc:creator><![CDATA[Mehdi Allahyari]]></dc:creator><pubDate>Wed, 20 May 2026 13:00:44 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!slz4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb60ccb3-f6a9-4dc1-a8ba-328e37d5ca51_800x445.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!slz4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb60ccb3-f6a9-4dc1-a8ba-328e37d5ca51_800x445.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp"
srcset="https://substackcdn.com/image/fetch/$s_!slz4!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb60ccb3-f6a9-4dc1-a8ba-328e37d5ca51_800x445.gif 424w, https://substackcdn.com/image/fetch/$s_!slz4!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb60ccb3-f6a9-4dc1-a8ba-328e37d5ca51_800x445.gif 848w, https://substackcdn.com/image/fetch/$s_!slz4!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb60ccb3-f6a9-4dc1-a8ba-328e37d5ca51_800x445.gif 1272w, https://substackcdn.com/image/fetch/$s_!slz4!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb60ccb3-f6a9-4dc1-a8ba-328e37d5ca51_800x445.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!slz4!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb60ccb3-f6a9-4dc1-a8ba-328e37d5ca51_800x445.gif" width="800" height="445" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db60ccb3-f6a9-4dc1-a8ba-328e37d5ca51_800x445.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:445,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:9791551,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/198503835?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb60ccb3-f6a9-4dc1-a8ba-328e37d5ca51_800x445.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!slz4!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb60ccb3-f6a9-4dc1-a8ba-328e37d5ca51_800x445.gif 424w, https://substackcdn.com/image/fetch/$s_!slz4!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb60ccb3-f6a9-4dc1-a8ba-328e37d5ca51_800x445.gif 848w, https://substackcdn.com/image/fetch/$s_!slz4!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb60ccb3-f6a9-4dc1-a8ba-328e37d5ca51_800x445.gif 1272w, https://substackcdn.com/image/fetch/$s_!slz4!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb60ccb3-f6a9-4dc1-a8ba-328e37d5ca51_800x445.gif 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>I was watching a news broadcast being translated live on screen and had one thought: why isn&#8217;t this just built into the browser?</p><p>The technology clearly exists. Transcription, translation, real-time, all of it. But using it meant switching apps, copying text, losing the thread of what you were watching.</p><p>I started thinking about who actually needs this. Language learners trying to follow native content. Students watching university lectures in a second language. Anyone who&#8217;s ever given up on a video because the captions were too broken to follow.</p><p>So I built it.</p><div><hr></div><p><strong>Overline</strong> is a Chrome extension that overlays real-time AI captions and translation directly on any video you&#8217;re watching in your browser. YouTube, news sites, university lecture portals, anything with video.</p><p>You click Start. Captions appear on the video itself, in your language, in real time.</p><p>No switching tabs. No copy-pasting. No lag that breaks your focus.
Just the video, with understanding layered on top.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xsi9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F604ccf61-aaad-4e52-a7c7-3cd415248cef_2527x1580.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Xsi9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F604ccf61-aaad-4e52-a7c7-3cd415248cef_2527x1580.png 424w, https://substackcdn.com/image/fetch/$s_!Xsi9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F604ccf61-aaad-4e52-a7c7-3cd415248cef_2527x1580.png 848w, https://substackcdn.com/image/fetch/$s_!Xsi9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F604ccf61-aaad-4e52-a7c7-3cd415248cef_2527x1580.png 1272w, https://substackcdn.com/image/fetch/$s_!Xsi9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F604ccf61-aaad-4e52-a7c7-3cd415248cef_2527x1580.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Xsi9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F604ccf61-aaad-4e52-a7c7-3cd415248cef_2527x1580.png" width="1456" height="910" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/604ccf61-aaad-4e52-a7c7-3cd415248cef_2527x1580.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:910,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3007658,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/198503835?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F604ccf61-aaad-4e52-a7c7-3cd415248cef_2527x1580.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Xsi9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F604ccf61-aaad-4e52-a7c7-3cd415248cef_2527x1580.png 424w, https://substackcdn.com/image/fetch/$s_!Xsi9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F604ccf61-aaad-4e52-a7c7-3cd415248cef_2527x1580.png 848w, https://substackcdn.com/image/fetch/$s_!Xsi9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F604ccf61-aaad-4e52-a7c7-3cd415248cef_2527x1580.png 1272w, https://substackcdn.com/image/fetch/$s_!Xsi9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F604ccf61-aaad-4e52-a7c7-3cd415248cef_2527x1580.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><div><hr></div><p><strong>Who it&#8217;s for:</strong></p><ul><li><p>Language learners who want to watch native content without losing the thread</p></li><li><p>Students accessing lectures or educational videos in a foreign language</p></li><li><p>Anyone who&#8217;s ever given up on a video because the captions were too bad to follow</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ssdD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe70c62ff-8e59-4f9e-9511-4a8d88d64dcc_1280x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!ssdD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe70c62ff-8e59-4f9e-9511-4a8d88d64dcc_1280x800.png 424w, https://substackcdn.com/image/fetch/$s_!ssdD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe70c62ff-8e59-4f9e-9511-4a8d88d64dcc_1280x800.png 848w, https://substackcdn.com/image/fetch/$s_!ssdD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe70c62ff-8e59-4f9e-9511-4a8d88d64dcc_1280x800.png 1272w, https://substackcdn.com/image/fetch/$s_!ssdD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe70c62ff-8e59-4f9e-9511-4a8d88d64dcc_1280x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ssdD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe70c62ff-8e59-4f9e-9511-4a8d88d64dcc_1280x800.png" width="1280" height="800" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e70c62ff-8e59-4f9e-9511-4a8d88d64dcc_1280x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1014946,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/198503835?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe70c62ff-8e59-4f9e-9511-4a8d88d64dcc_1280x800.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ssdD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe70c62ff-8e59-4f9e-9511-4a8d88d64dcc_1280x800.png 424w, https://substackcdn.com/image/fetch/$s_!ssdD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe70c62ff-8e59-4f9e-9511-4a8d88d64dcc_1280x800.png 848w, https://substackcdn.com/image/fetch/$s_!ssdD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe70c62ff-8e59-4f9e-9511-4a8d88d64dcc_1280x800.png 1272w, https://substackcdn.com/image/fetch/$s_!ssdD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe70c62ff-8e59-4f9e-9511-4a8d88d64dcc_1280x800.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><div><hr></div><p><strong>How to try it:</strong></p><p>Overline is free and available now on the Chrome Web Store. Install it, sign in, pick your target language, and click Start Captions on any video page.</p><p><a href="https://chromewebstore.google.com/detail/overline/aioddiojagipbjjbgndpjkdlnmlgankh">Install Overline &#8594;</a></p><div><hr></div><h2>How it works (for the curious)</h2><p>Building this taught me a few interesting things about what&#8217;s actually possible inside a Chrome extension.</p><p>At a high level, the flow has three stages:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MCAY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3f4c59c-17fe-4efa-acb7-f8cd9b615823_1948x392.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MCAY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3f4c59c-17fe-4efa-acb7-f8cd9b615823_1948x392.png 424w, https://substackcdn.com/image/fetch/$s_!MCAY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3f4c59c-17fe-4efa-acb7-f8cd9b615823_1948x392.png 848w, 
https://substackcdn.com/image/fetch/$s_!MCAY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3f4c59c-17fe-4efa-acb7-f8cd9b615823_1948x392.png 1272w, https://substackcdn.com/image/fetch/$s_!MCAY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3f4c59c-17fe-4efa-acb7-f8cd9b615823_1948x392.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MCAY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3f4c59c-17fe-4efa-acb7-f8cd9b615823_1948x392.png" width="1456" height="293" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3f4c59c-17fe-4efa-acb7-f8cd9b615823_1948x392.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:293,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:59899,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/198503835?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3f4c59c-17fe-4efa-acb7-f8cd9b615823_1948x392.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MCAY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3f4c59c-17fe-4efa-acb7-f8cd9b615823_1948x392.png 424w, 
https://substackcdn.com/image/fetch/$s_!MCAY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3f4c59c-17fe-4efa-acb7-f8cd9b615823_1948x392.png 848w, https://substackcdn.com/image/fetch/$s_!MCAY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3f4c59c-17fe-4efa-acb7-f8cd9b615823_1948x392.png 1272w, https://substackcdn.com/image/fetch/$s_!MCAY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3f4c59c-17fe-4efa-acb7-f8cd9b615823_1948x392.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p><strong>1. Capture</strong></p><p>The extension captures your tab&#8217;s audio directly, meaning the audio of the video playing in Chrome, without touching your microphone. This is done through a Chrome API designed specifically for tab audio; a background process handles the resulting stream.</p><p><strong>2. Transcribe &amp; Translate</strong></p><p>Audio is streamed in real time to a backend that pipes it through a speech-to-text API, then optionally through an LLM for translation. The system distinguishes between partial (in-progress) transcripts and finalized ones, which is why the captions flow live instead of appearing in sudden chunks.</p><p><strong>3. Overlay</strong></p><p>The caption text is injected directly into the tab as an overlay on the video: no separate window, no UI clutter. 
It appears where you&#8217;re already looking.</p><p>Here&#8217;s the sequence of a single caption appearing from the moment audio leaves your browser:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BfVK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaf9f202-814d-4f30-a9d4-8b6a532ef465_1796x1130.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BfVK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaf9f202-814d-4f30-a9d4-8b6a532ef465_1796x1130.png 424w, https://substackcdn.com/image/fetch/$s_!BfVK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaf9f202-814d-4f30-a9d4-8b6a532ef465_1796x1130.png 848w, https://substackcdn.com/image/fetch/$s_!BfVK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaf9f202-814d-4f30-a9d4-8b6a532ef465_1796x1130.png 1272w, https://substackcdn.com/image/fetch/$s_!BfVK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaf9f202-814d-4f30-a9d4-8b6a532ef465_1796x1130.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BfVK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaf9f202-814d-4f30-a9d4-8b6a532ef465_1796x1130.png" width="1456" height="916" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/daf9f202-814d-4f30-a9d4-8b6a532ef465_1796x1130.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:916,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:155314,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/198503835?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaf9f202-814d-4f30-a9d4-8b6a532ef465_1796x1130.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BfVK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaf9f202-814d-4f30-a9d4-8b6a532ef465_1796x1130.png 424w, https://substackcdn.com/image/fetch/$s_!BfVK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaf9f202-814d-4f30-a9d4-8b6a532ef465_1796x1130.png 848w, https://substackcdn.com/image/fetch/$s_!BfVK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaf9f202-814d-4f30-a9d4-8b6a532ef465_1796x1130.png 1272w, https://substackcdn.com/image/fetch/$s_!BfVK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdaf9f202-814d-4f30-a9d4-8b6a532ef465_1796x1130.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>The whole round trip (audio out, caption back) typically lands in under a second.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!AGBW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094ffd4d-fa49-4012-a796-a09dbf47b9d9_1280x800.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!AGBW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094ffd4d-fa49-4012-a796-a09dbf47b9d9_1280x800.png 424w, 
https://substackcdn.com/image/fetch/$s_!AGBW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094ffd4d-fa49-4012-a796-a09dbf47b9d9_1280x800.png 848w, https://substackcdn.com/image/fetch/$s_!AGBW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094ffd4d-fa49-4012-a796-a09dbf47b9d9_1280x800.png 1272w, https://substackcdn.com/image/fetch/$s_!AGBW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094ffd4d-fa49-4012-a796-a09dbf47b9d9_1280x800.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!AGBW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094ffd4d-fa49-4012-a796-a09dbf47b9d9_1280x800.png" width="1280" height="800" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/094ffd4d-fa49-4012-a796-a09dbf47b9d9_1280x800.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:800,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:420327,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/198503835?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094ffd4d-fa49-4012-a796-a09dbf47b9d9_1280x800.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!AGBW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094ffd4d-fa49-4012-a796-a09dbf47b9d9_1280x800.png 424w, https://substackcdn.com/image/fetch/$s_!AGBW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094ffd4d-fa49-4012-a796-a09dbf47b9d9_1280x800.png 848w, https://substackcdn.com/image/fetch/$s_!AGBW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094ffd4d-fa49-4012-a796-a09dbf47b9d9_1280x800.png 1272w, https://substackcdn.com/image/fetch/$s_!AGBW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F094ffd4d-fa49-4012-a796-a09dbf47b9d9_1280x800.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a></figure></div><p>The extension is built with WXT (a TypeScript-first framework for Chrome extensions), React for the popup, and a Python + FastAPI backend. Auth is handled by Clerk.</p><div><hr></div><p>This is v1. I&#8217;m actively working on making it faster, supporting more languages, and reducing setup friction. If you try it, reply here or reach out directly. I&#8217;d genuinely love to know what&#8217;s working and what isn&#8217;t.</p><p><a href="https://chromewebstore.google.com/detail/overline/aioddiojagipbjjbgndpjkdlnmlgankh">Install Overline on the Chrome Web Store</a></p>]]></content:encoded></item><item><title><![CDATA[Slop, Speed, and Discipline: Hard Truths About AI Coding Agents]]></title><description><![CDATA[Originally presented at AI Engineer conference by Mario Zechner, creator of PI, a minimal, extensible coding agent]]></description><link>https://mlnotes.substack.com/p/slop-speed-and-discipline-hard-truths</link><guid isPermaLink="false">https://mlnotes.substack.com/p/slop-speed-and-discipline-hard-truths</guid><dc:creator><![CDATA[Mehdi Allahyari]]></dc:creator><pubDate>Thu, 23 Apr 2026 14:00:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Nlup!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8710e3c5-5662-40ce-bd1c-e6a8b1788c24_3686x1630.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!Nlup!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8710e3c5-5662-40ce-bd1c-e6a8b1788c24_3686x1630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Nlup!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8710e3c5-5662-40ce-bd1c-e6a8b1788c24_3686x1630.png 424w, https://substackcdn.com/image/fetch/$s_!Nlup!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8710e3c5-5662-40ce-bd1c-e6a8b1788c24_3686x1630.png 848w, https://substackcdn.com/image/fetch/$s_!Nlup!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8710e3c5-5662-40ce-bd1c-e6a8b1788c24_3686x1630.png 1272w, https://substackcdn.com/image/fetch/$s_!Nlup!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8710e3c5-5662-40ce-bd1c-e6a8b1788c24_3686x1630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Nlup!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8710e3c5-5662-40ce-bd1c-e6a8b1788c24_3686x1630.png" width="1456" height="644" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8710e3c5-5662-40ce-bd1c-e6a8b1788c24_3686x1630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:644,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1977994,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/195183095?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8710e3c5-5662-40ce-bd1c-e6a8b1788c24_3686x1630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!Nlup!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8710e3c5-5662-40ce-bd1c-e6a8b1788c24_3686x1630.png 424w, https://substackcdn.com/image/fetch/$s_!Nlup!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8710e3c5-5662-40ce-bd1c-e6a8b1788c24_3686x1630.png 848w, https://substackcdn.com/image/fetch/$s_!Nlup!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8710e3c5-5662-40ce-bd1c-e6a8b1788c24_3686x1630.png 1272w, https://substackcdn.com/image/fetch/$s_!Nlup!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8710e3c5-5662-40ce-bd1c-e6a8b1788c24_3686x1630.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p>There&#8217;s a story I keep hearing in the ML community right now, and it goes something like this: a team spins up a fleet of coding agents, ships features at 10x speed, and celebrates. Two months later, nobody on the team understands the codebase anymore. Tests pass but production breaks. The agents are called in to fix the mess, but the codebase is now so large, so tangled, and so far beyond anyone&#8217;s context window that even the agents can&#8217;t get their bearings.</p><p>This isn&#8217;t hypothetical. Mario Zechner, an independent software developer who just shipped PI, his own coding agent harness, watched this pattern emerge and decided to say something about it out loud. 
His talk, <em>&#8220;Building PI in a World of Slop,&#8221;</em> is one of the most honest and technically grounded takes I&#8217;ve seen on the current state of AI coding agents. I agree with nearly everything he says. Here are my takeaways.</p><h2>Why a Veteran Developer Walked Away from Claude Code</h2><p>Mario didn&#8217;t build PI out of spite. He started using Claude Code in April 2025, loved it, and calls the Anthropic team genuinely talented. But over time, something shifted. The tool grew faster than it improved.</p><p>His specific complaints are worth naming:</p><ul><li><p><strong>Context he couldn&#8217;t control.</strong> Claude Code manages the context window on your behalf, and it does things behind your back. The system prompt, including the tool definitions, changes on every release. Tools get removed or modified without notice.</p></li><li><p><strong>System reminders injected at the worst moments.</strong> The tool would insert reminders into the context mid-task, sometimes with language like &#8220;this may or may not be relevant to what you&#8217;re doing,&#8221; which, as Mario notes, is exactly the kind of hedging that confuses a language model trying to maintain coherence.</p></li><li><p><strong>Zero observability.</strong> You can&#8217;t see what your agent is doing or why. For anyone who cares about understanding their systems, this is a serious problem.</p></li><li><p><strong>Shallow extensibility.</strong> Hooks exist, but they&#8217;re process-level: a new shell command is spawned on each trigger. Not deep integration. 
Not programmable.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7tH_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c119971-17bd-40d0-9548-3f0af6c9e0e7_1792x1116.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7tH_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c119971-17bd-40d0-9548-3f0af6c9e0e7_1792x1116.png 424w, https://substackcdn.com/image/fetch/$s_!7tH_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c119971-17bd-40d0-9548-3f0af6c9e0e7_1792x1116.png 848w, https://substackcdn.com/image/fetch/$s_!7tH_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c119971-17bd-40d0-9548-3f0af6c9e0e7_1792x1116.png 1272w, https://substackcdn.com/image/fetch/$s_!7tH_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c119971-17bd-40d0-9548-3f0af6c9e0e7_1792x1116.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7tH_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c119971-17bd-40d0-9548-3f0af6c9e0e7_1792x1116.png" width="1456" height="907" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c119971-17bd-40d0-9548-3f0af6c9e0e7_1792x1116.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:907,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1549077,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/195183095?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c119971-17bd-40d0-9548-3f0af6c9e0e7_1792x1116.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7tH_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c119971-17bd-40d0-9548-3f0af6c9e0e7_1792x1116.png 424w, https://substackcdn.com/image/fetch/$s_!7tH_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c119971-17bd-40d0-9548-3f0af6c9e0e7_1792x1116.png 848w, https://substackcdn.com/image/fetch/$s_!7tH_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c119971-17bd-40d0-9548-3f0af6c9e0e7_1792x1116.png 1272w, https://substackcdn.com/image/fetch/$s_!7tH_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c119971-17bd-40d0-9548-3f0af6c9e0e7_1792x1116.png 1456w" sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>These aren&#8217;t complaints about Anthropic&#8217;s intentions. They&#8217;re complaints about what happens when a tool optimizes for feature velocity at the cost of transparency and stability. Mario&#8217;s analogy is apt: if your hammer breaks every day on a construction site, you get mad. Developer tools are no different.</p><p>He looked at alternatives (with a shoutout to AMP and Factory Droid as the &#8220;Porsche and Lamborghini&#8221; of coding harnesses) and also dug into OpenCode&#8217;s internals, where he found similar issues: context pruning that effectively lobotomizes the model, LSP error injection mid-edit (checking for errors after every line, not after you finish your work), and a server that by default exposes itself to any website open in your browser.</p><p>So he built PI. 
And what he built teaches us something important.</p><div><hr></div><h2>Big Idea #1: Minimal Beats Maximal</h2><p>PI ships with four tools. Its system prompt fits on a single slide. That&#8217;s the whole thing.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bHeU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74f256a6-efa8-4dea-9540-b52de02b8f2d_820x700.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bHeU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74f256a6-efa8-4dea-9540-b52de02b8f2d_820x700.png 424w, https://substackcdn.com/image/fetch/$s_!bHeU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74f256a6-efa8-4dea-9540-b52de02b8f2d_820x700.png 848w, https://substackcdn.com/image/fetch/$s_!bHeU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74f256a6-efa8-4dea-9540-b52de02b8f2d_820x700.png 1272w, https://substackcdn.com/image/fetch/$s_!bHeU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74f256a6-efa8-4dea-9540-b52de02b8f2d_820x700.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bHeU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74f256a6-efa8-4dea-9540-b52de02b8f2d_820x700.png" width="820" height="700" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/74f256a6-efa8-4dea-9540-b52de02b8f2d_820x700.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:700,&quot;width&quot;:820,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:350337,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/195183095?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74f256a6-efa8-4dea-9540-b52de02b8f2d_820x700.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bHeU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74f256a6-efa8-4dea-9540-b52de02b8f2d_820x700.png 424w, https://substackcdn.com/image/fetch/$s_!bHeU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74f256a6-efa8-4dea-9540-b52de02b8f2d_820x700.png 848w, https://substackcdn.com/image/fetch/$s_!bHeU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74f256a6-efa8-4dea-9540-b52de02b8f2d_820x700.png 1272w, https://substackcdn.com/image/fetch/$s_!bHeU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74f256a6-efa8-4dea-9540-b52de02b8f2d_820x700.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Before you dismiss this as toy-project minimalism, consider Terminus, the baseline harness from the team behind the Terminal Bench coding agent benchmark. It gives the model exactly one capability: send keystrokes to a tmux session and read the output. No file tools. No sub-agents. No search. Just a terminal. As of December 2025, Terminus scores higher on the Terminal Bench leaderboard than most full-featured harnesses, including native model harnesses. 
The simplest possible environment outperforms the most feature-rich ones.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kTuH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c614f03-f368-4f4a-a263-d6dbacc1bbfd_1440x1136.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kTuH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c614f03-f368-4f4a-a263-d6dbacc1bbfd_1440x1136.png 424w, https://substackcdn.com/image/fetch/$s_!kTuH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c614f03-f368-4f4a-a263-d6dbacc1bbfd_1440x1136.png 848w, https://substackcdn.com/image/fetch/$s_!kTuH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c614f03-f368-4f4a-a263-d6dbacc1bbfd_1440x1136.png 1272w, https://substackcdn.com/image/fetch/$s_!kTuH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c614f03-f368-4f4a-a263-d6dbacc1bbfd_1440x1136.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kTuH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c614f03-f368-4f4a-a263-d6dbacc1bbfd_1440x1136.png" width="1440" height="1136" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3c614f03-f368-4f4a-a263-d6dbacc1bbfd_1440x1136.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1136,&quot;width&quot;:1440,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:440871,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/195183095?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c614f03-f368-4f4a-a263-d6dbacc1bbfd_1440x1136.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kTuH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c614f03-f368-4f4a-a263-d6dbacc1bbfd_1440x1136.png 424w, https://substackcdn.com/image/fetch/$s_!kTuH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c614f03-f368-4f4a-a263-d6dbacc1bbfd_1440x1136.png 848w, https://substackcdn.com/image/fetch/$s_!kTuH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c614f03-f368-4f4a-a263-d6dbacc1bbfd_1440x1136.png 1272w, https://substackcdn.com/image/fetch/$s_!kTuH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3c614f03-f368-4f4a-a263-d6dbacc1bbfd_1440x1136.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This should give us pause.</p><p>We&#8217;ve been operating under the assumption that more context management, more tools, and more agentic scaffolding equal better performance. Mario&#8217;s argument, backed by benchmark data, is that we&#8217;re in what he calls the &#8220;fuck around and find out&#8221; phase of coding agents. We don&#8217;t actually know what the right harness looks like yet. And in that uncertainty, the minimal approach wins because it introduces fewer confounding variables, gives the model a cleaner signal, and is easier to reason about.</p><p>This maps directly to something ML practitioners already know: simpler models with strong inductive biases often outperform overparameterized ones in low-data regimes. The principle holds in agent design too. 
A model that&#8217;s been RLHF-trained extensively on coding tasks already knows what a coding agent is; you don&#8217;t need 10,000 tokens to remind it. It knows because it was trained to be one.</p><p>PI&#8217;s extensibility philosophy reinforces this: instead of shipping every feature, ship a minimal core and let the agent extend itself. Users describe what they need; PI builds the extension. The agent adapts to your workflow, not the other way around.</p><div><hr></div><h2>Big Idea #2: Agent Velocity Without Discipline Is Technical Debt on Steroids</h2><p>This is the part of Mario&#8217;s talk that I think every ML team lead and practitioner needs to hear.</p><p>Agents compound errors. Mario calls these &#8220;booboos&#8221;: mistakes that accumulate silently, without the friction that would normally alert a human developer. A human feels pain. They get confused and frustrated, they slow down, they refactor. Agents don&#8217;t feel pain. They will happily keep generating code into a broken codebase indefinitely.</p><p>The math here is brutal. With one human developer, you have a natural bottleneck on how many errors can enter the codebase per day. Add ten agents running in parallel, and that bottleneck disappears. The error rate scales. 
The review burden scales with it, but your human capacity to review does not.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bjEI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e76d39-14c0-4e5c-a4d8-d0a27c3c8cbe_1350x866.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bjEI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e76d39-14c0-4e5c-a4d8-d0a27c3c8cbe_1350x866.png 424w, https://substackcdn.com/image/fetch/$s_!bjEI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e76d39-14c0-4e5c-a4d8-d0a27c3c8cbe_1350x866.png 848w, https://substackcdn.com/image/fetch/$s_!bjEI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e76d39-14c0-4e5c-a4d8-d0a27c3c8cbe_1350x866.png 1272w, https://substackcdn.com/image/fetch/$s_!bjEI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e76d39-14c0-4e5c-a4d8-d0a27c3c8cbe_1350x866.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bjEI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e76d39-14c0-4e5c-a4d8-d0a27c3c8cbe_1350x866.png" width="1350" height="866" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/37e76d39-14c0-4e5c-a4d8-d0a27c3c8cbe_1350x866.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:866,&quot;width&quot;:1350,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:526177,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/195183095?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e76d39-14c0-4e5c-a4d8-d0a27c3c8cbe_1350x866.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bjEI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e76d39-14c0-4e5c-a4d8-d0a27c3c8cbe_1350x866.png 424w, https://substackcdn.com/image/fetch/$s_!bjEI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e76d39-14c0-4e5c-a4d8-d0a27c3c8cbe_1350x866.png 848w, https://substackcdn.com/image/fetch/$s_!bjEI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e76d39-14c0-4e5c-a4d8-d0a27c3c8cbe_1350x866.png 1272w, https://substackcdn.com/image/fetch/$s_!bjEI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F37e76d39-14c0-4e5c-a4d8-d0a27c3c8cbe_1350x866.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>And here&#8217;s the structural problem with what agents learn: they were trained on the internet, which is overwhelmingly our old, mediocre code. Not the pearls, the garbage. Every blank left in a spec gets filled in by the agent using patterns learned from that garbage. You get abstraction layers nobody asked for, duplicated logic, backwards-compatibility shims for scenarios that don&#8217;t exist. Enterprise-grade complexity generated in two weeks by two humans and ten agents. Congratulations.</p><p>&#8220;But we have a detailed spec.&#8221; Mario&#8217;s response to this is worth quoting directly: <em>a sufficiently detailed spec is a program.</em> If there are gaps in your spec, the model fills them. You don&#8217;t get to control what it fills them with.</p><p>&#8220;But we have a review agent.&#8221; Also not enough. 
Review agents catch some issues. They miss the architectural ones, the decisions that seem locally reasonable but cause global damage, because those require understanding the whole system, which is exactly what gets lost when no human is reading the code.</p><p>Mario&#8217;s framework for responsible agent use is practical and I endorse it:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k-sK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60903af-11cf-4cc2-8dcd-f6d39994c397_1284x454.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k-sK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60903af-11cf-4cc2-8dcd-f6d39994c397_1284x454.png 424w, https://substackcdn.com/image/fetch/$s_!k-sK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60903af-11cf-4cc2-8dcd-f6d39994c397_1284x454.png 848w, https://substackcdn.com/image/fetch/$s_!k-sK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60903af-11cf-4cc2-8dcd-f6d39994c397_1284x454.png 1272w, https://substackcdn.com/image/fetch/$s_!k-sK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60903af-11cf-4cc2-8dcd-f6d39994c397_1284x454.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k-sK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60903af-11cf-4cc2-8dcd-f6d39994c397_1284x454.png" width="1284" height="454" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b60903af-11cf-4cc2-8dcd-f6d39994c397_1284x454.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:454,&quot;width&quot;:1284,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:327653,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/195183095?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60903af-11cf-4cc2-8dcd-f6d39994c397_1284x454.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k-sK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60903af-11cf-4cc2-8dcd-f6d39994c397_1284x454.png 424w, https://substackcdn.com/image/fetch/$s_!k-sK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60903af-11cf-4cc2-8dcd-f6d39994c397_1284x454.png 848w, https://substackcdn.com/image/fetch/$s_!k-sK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60903af-11cf-4cc2-8dcd-f6d39994c397_1284x454.png 1272w, https://substackcdn.com/image/fetch/$s_!k-sK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60903af-11cf-4cc2-8dcd-f6d39994c397_1284x454.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p><strong>Scope tightly.</strong> Give the agent only the context it needs, and make sure it can find everything relevant to the task. If you can&#8217;t scope it, don&#8217;t delegate it.</p></li><li><p><strong>Modularize your codebase.</strong> Smaller, well-bounded modules make agentic tasks tractable and reviewable.</p></li><li><p><strong>Use agents for the right tasks.</strong> Reproduction cases for bugs, boring boilerplate, non-critical automation, and research are good fits. Core business logic, security-sensitive code, and architectural decisions are not.</p></li><li><p><strong>Read the critical lines yourself.</strong> Every one of them. If you don&#8217;t know what&#8217;s critical, that&#8217;s your answer: read more code.</p></li></ul><p>The last point is uncomfortable, and I think that&#8217;s the point. 
The discipline to read your own code is not a tax on productivity; it&#8217;s what makes you the developer rather than the reviewer of whatever the agent decided.</p><div><hr></div><h2>The OSS Problem: Clankers Are Degrading the Commons</h2><p>Briefly, because it deserves mention: AI-generated GitHub issues and pull requests are burning out open source maintainers. Mario showed his own tracker: half of it is garbage from agent instances running autonomously, submitting PRs without reading the contribution guidelines and flooding issue queues with low-quality reports.</p><p>His response is clever: auto-closing agent PRs, asking contributors to write issues in their own human voice in under a screen&#8217;s worth of text, and building a filter that only lets through accounts that actually read the instructions. But the fact that he has to spend time building abuse filters for his own open source project is a problem the community created.</p><p>If you&#8217;re using agents to contribute to OSS, please configure them with appropriate constraints. Or better yet, read the repo and write the issue yourself.</p><div><hr></div><h2>Slow Down. Build Less. Understand More.</h2><p>Mario ends his talk with something that sounds almost contrarian in 2025: slow down. Think about what you&#8217;re building and why. Learn to say no. Build fewer features, but the ones that matter, and then use your agents to polish them until they&#8217;re excellent.</p><p>I think he&#8217;s right. The race to maximize agent output is a race to maximize technical debt, user confusion, and maintainer burnout. 
The practitioners who will build the most durable, trustworthy AI-powered systems are the ones who treat agents as precision instruments, not autonomous co-developers.</p><p>Mario&#8217;s closing line is the one I keep coming back to:</p><blockquote><p><em>&#8220;Friction is the thing that builds understanding of the system in your head, and it&#8217;s also where you learn new things.&#8221;</em></p></blockquote><p>We&#8217;ve been treating friction as the enemy. It&#8217;s not. It&#8217;s the mechanism by which understanding transfers from code to developer. Remove it entirely and you end up with a codebase you can&#8217;t debug, can&#8217;t extend, and can&#8217;t trust.</p><p>Use your agents. Use them well. But keep your hands in the code.</p><div><hr></div><p><em>Mario Zechner&#8217;s talk &#8220;Building PI in a World of Slop&#8221; was presented at the <a href="https://www.youtube.com/watch?v=RjfbvDXpFls">AI Engineer conference</a>. PI is open source; you can find it on <a href="https://github.com/badlogic/pi-mono">GitHub</a>. If you found this useful, forward it to a colleague who&#8217;s been talking about going &#8220;full agents.&#8221;</em></p>]]></content:encoded></item><item><title><![CDATA[Knowledge Graphs Needed an AI Layer. 
So I Built One]]></title><description><![CDATA[I&#8217;ve been working with semantic web technologies since 2012.]]></description><link>https://mlnotes.substack.com/p/knowledge-graphs-needed-an-ai-layer</link><guid isPermaLink="false">https://mlnotes.substack.com/p/knowledge-graphs-needed-an-ai-layer</guid><dc:creator><![CDATA[Mehdi Allahyari]]></dc:creator><pubDate>Thu, 26 Mar 2026 14:03:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!FXts!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a75bee-0f2a-4272-a99a-b8f9862ee41d_1408x768.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FXts!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a75bee-0f2a-4272-a99a-b8f9862ee41d_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FXts!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a75bee-0f2a-4272-a99a-b8f9862ee41d_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!FXts!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a75bee-0f2a-4272-a99a-b8f9862ee41d_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!FXts!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a75bee-0f2a-4272-a99a-b8f9862ee41d_1408x768.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!FXts!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a75bee-0f2a-4272-a99a-b8f9862ee41d_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FXts!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a75bee-0f2a-4272-a99a-b8f9862ee41d_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/97a75bee-0f2a-4272-a99a-b8f9862ee41d_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:547057,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/192154633?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a75bee-0f2a-4272-a99a-b8f9862ee41d_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FXts!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a75bee-0f2a-4272-a99a-b8f9862ee41d_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!FXts!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a75bee-0f2a-4272-a99a-b8f9862ee41d_1408x768.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!FXts!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a75bee-0f2a-4272-a99a-b8f9862ee41d_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!FXts!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F97a75bee-0f2a-4272-a99a-b8f9862ee41d_1408x768.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Credit: Author via Nano Banana</figcaption></figure></div><p>I&#8217;ve been working with semantic web 
technologies since 2012. RDF, OWL, SPARQL, Protege, Jena, triplestores -- these have been part of my toolkit for over a decade. I&#8217;ve built ontologies for research projects, designed knowledge graphs for NLP pipelines, and written more SPARQL than I&#8217;d care to admit.</p><p>And throughout all of it, one frustration has remained constant: <strong>the gap between what knowledge graphs can do and what people are willing to put up with to use them.</strong></p><p>The technology is sound. The W3C stack is well-designed. Ontological reasoning is genuinely powerful. But the barrier to entry has always been too high. You need to understand RDF serialization, SPARQL syntax, OWL semantics, URI design, and named graph management before you can do anything useful. For most teams, the learning curve kills adoption before the value becomes visible.</p><p>LLMs have finally given us the tools to close that gap. 
So I built KeplAI -- an open-source platform that puts an AI layer on top of the standards-based semantic web stack, making knowledge graphs accessible without sacrificing the rigor that makes them worth using in the first place.</p><div><hr></div><h2>The Semantic Web&#8217;s Adoption Problem</h2><p>If you&#8217;ve worked with knowledge graphs, you know this story. You show someone a beautifully reasoned inference chain -- how <code>(Mehdi, founded, BrandPulse)</code> plus the ontology&#8217;s domain/range constraints lets the system automatically infer that Mehdi is a Person and BrandPulse is a Company. They&#8217;re impressed. Then you show them the SPARQL they&#8217;d need to write to query it, and their eyes glaze over.</p><p>The semantic web has always had a marketing problem disguised as a usability problem. The underlying ideas -- triples, ontologies, inference, linked data -- are elegant. But the tooling has historically assumed that everyone using it has a PhD in knowledge representation.</p><p>I don&#8217;t think the answer is to dumb it down. Stripping away the ontology layer to make things &#8220;simpler&#8221; defeats the purpose. Without formal semantics, you just have a labeled property graph -- fine for some use cases, but you lose inference, auto-typing, and schema validation. 
You lose the reason knowledge graphs are more powerful than a relational database in the first place.</p><p>The answer is to <strong>keep the semantic rigor underneath and put a natural-language interface on top.</strong></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wB4x!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c375a22-083e-4d81-8398-c441b41d1668_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wB4x!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c375a22-083e-4d81-8398-c441b41d1668_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!wB4x!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c375a22-083e-4d81-8398-c441b41d1668_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!wB4x!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c375a22-083e-4d81-8398-c441b41d1668_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!wB4x!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c375a22-083e-4d81-8398-c441b41d1668_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wB4x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c375a22-083e-4d81-8398-c441b41d1668_1408x768.jpeg" width="1408" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c375a22-083e-4d81-8398-c441b41d1668_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:758478,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/192154633?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c375a22-083e-4d81-8398-c441b41d1668_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wB4x!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c375a22-083e-4d81-8398-c441b41d1668_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!wB4x!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c375a22-083e-4d81-8398-c441b41d1668_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!wB4x!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c375a22-083e-4d81-8398-c441b41d1668_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!wB4x!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c375a22-083e-4d81-8398-c441b41d1668_1408x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Credit: Author via Nano Banana</figcaption></figure></div><h2>What Knowledge Graphs Actually Give You</h2><p>For readers who haven&#8217;t worked with these technologies before, let me explain why anyone would bother with all this machinery instead of just using a regular database.</p><p>A knowledge graph stores facts as <strong>triples</strong>: subject, predicate, object.</p><pre><code><code>(Mehdi, founded, BrandPulse)
(BrandPulse, industry, AI)
(Alice, worksAt, BrandPulse)</code></code></pre><p>Each triple is a fact. Together, they form a graph you can traverse. Want to know who works at companies Mehdi founded? Follow the edges. No JOINs, no pre-designed queries, no schema migration when you add a new relationship type.</p><p>But the real power comes from <strong>ontologies</strong> -- formal descriptions of what types of things exist and how they can relate.</p><h3>Why Ontologies Matter</h3><p>An ontology doesn&#8217;t just describe data. It enables reasoning.</p><p>Define this:</p><pre><code><code>Class: Person
Class: Company
Property: founded (domain: Person, range: Company)</code></code></pre><p>Now when someone adds <code>(Mehdi, founded, BrandPulse)</code>, the system <em>automatically infers</em> that Mehdi is a Person and BrandPulse is a Company. This is called auto-typing, and after working with it for over a decade, I can tell you it&#8217;s one of those capabilities that seems minor until you realize how much manual annotation it eliminates at scale.</p><p>Ontologies also give you <strong>constraint validation</strong> (is this triple even valid given the schema?), <strong>cross-domain reasoning</strong> (if a Person can found a Company, and a Company has Employees, what can we infer about the relationship between founders and employees?), and <strong>interoperability</strong> (your ontology and mine can be linked through shared upper ontologies).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iaFF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e788bd0-300e-445b-a392-7df33870db7c_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iaFF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e788bd0-300e-445b-a392-7df33870db7c_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iaFF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e788bd0-300e-445b-a392-7df33870db7c_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!iaFF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e788bd0-300e-445b-a392-7df33870db7c_1408x768.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!iaFF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e788bd0-300e-445b-a392-7df33870db7c_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iaFF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e788bd0-300e-445b-a392-7df33870db7c_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3e788bd0-300e-445b-a392-7df33870db7c_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:509264,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/192154633?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e788bd0-300e-445b-a392-7df33870db7c_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iaFF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e788bd0-300e-445b-a392-7df33870db7c_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!iaFF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e788bd0-300e-445b-a392-7df33870db7c_1408x768.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!iaFF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e788bd0-300e-445b-a392-7df33870db7c_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!iaFF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3e788bd0-300e-445b-a392-7df33870db7c_1408x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Credit: Author via Nano Banana</figcaption></figure></div><h3>The Multi-Ontology Problem</h3><p>Real-world knowledge 
doesn&#8217;t fit into one neat schema. This is something I&#8217;ve dealt with repeatedly across projects. A hospital needs a medical ontology, an organizational one, and a geographic one. A research lab might combine domain-specific ontologies with FOAF, Dublin Core, and Schema.org.</p><p>These ontologies overlap. Both FOAF and Schema.org define &#8220;Person.&#8221; When your data says &#8220;Person,&#8221; which one do you mean? When two ontologies define different properties with the same label, which takes precedence?</p><p>In KeplAI, each imported ontology lives in its own named graph inside Apache Jena Fuseki. URI resolution searches across all loaded ontologies. If there&#8217;s a unique match, it resolves. If two ontologies claim the same label, the system raises a conflict rather than silently picking one. This is the kind of design decision that comes from having been burned by silent ontology collisions more than once.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5Ilc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F768ac0cb-3df3-4f5d-986e-e2cbc11b3ca5_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5Ilc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F768ac0cb-3df3-4f5d-986e-e2cbc11b3ca5_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!5Ilc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F768ac0cb-3df3-4f5d-986e-e2cbc11b3ca5_1408x768.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!5Ilc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F768ac0cb-3df3-4f5d-986e-e2cbc11b3ca5_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!5Ilc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F768ac0cb-3df3-4f5d-986e-e2cbc11b3ca5_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5Ilc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F768ac0cb-3df3-4f5d-986e-e2cbc11b3ca5_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/768ac0cb-3df3-4f5d-986e-e2cbc11b3ca5_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:649347,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/192154633?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F768ac0cb-3df3-4f5d-986e-e2cbc11b3ca5_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5Ilc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F768ac0cb-3df3-4f5d-986e-e2cbc11b3ca5_1408x768.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!5Ilc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F768ac0cb-3df3-4f5d-986e-e2cbc11b3ca5_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!5Ilc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F768ac0cb-3df3-4f5d-986e-e2cbc11b3ca5_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!5Ilc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F768ac0cb-3df3-4f5d-986e-e2cbc11b3ca5_1408x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Credit: Author via Nano Banana</figcaption></figure></div><h2>What LLMs Finally Make Possible</h2><p>I&#8217;ve watched the semantic web community struggle with two problems for years. LLMs don&#8217;t solve them perfectly, but they solve them well enough to fundamentally change the equation.</p><h3>Populating the Graph</h3><p>Knowledge graph construction has always been the bottleneck. I&#8217;ve used every approach:</p><ul><li><p><strong>Manual curation</strong> -- accurate but impossibly slow. I&#8217;ve spent weeks annotating corpora for research projects that produced graphs with a few thousand triples.</p></li><li><p><strong>Rule-based extraction</strong> -- fragile. Change the sentence structure and your regex pipeline breaks.</p></li><li><p><strong>Supervised NER/RE models</strong> -- these require labeled training data, which requires... manual curation. A chicken-and-egg problem.</p></li></ul><p>LLMs break this cycle. Give GPT-4 a paragraph and an ontology schema, and it extracts structured triples that actually conform to the schema:</p><pre><code><code>triples = await kg.extract_and_store(
    "Mehdi founded BrandPulse in 2024. The company focuses on AI.",
    mode="strict",  # Constrained to ontology schema
)</code></code></pre><p>In &#8220;strict&#8221; mode, extraction is constrained to properties defined in your ontology -- the LLM can only produce triples that match your schema. In &#8220;open&#8221; mode, it discovers relationships freely, which is useful for exploratory graph building.</p><p>But extraction alone isn&#8217;t enough. Text says &#8220;M. Allahyari&#8221; and the graph already has &#8220;Mehdi Allahyari&#8221; -- are they the same entity? This <strong>entity disambiguation</strong> problem is something I&#8217;ve worked on extensively. KeplAI uses vector embeddings (OpenAI&#8217;s embedding model + Qdrant vector store) to match extracted entities against existing ones. The user sees candidates ranked by similarity score and can confirm or reject. It&#8217;s not perfect -- &#8220;Apple&#8221; the company vs. &#8220;apple&#8221; the fruit still requires contextual reasoning that embeddings alone can&#8217;t fully capture -- but it handles the 80% case that used to require manual reconciliation.</p><h3>Querying the Graph</h3><p>SPARQL is a powerful query language. I write it fluently after years of practice. But I also know that asking a domain expert to write this is unreasonable:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;sql&quot;,&quot;nodeId&quot;:&quot;5f07f9f7-0bdf-4345-a329-ebadc8ba1f5b&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-sql">PREFIX entity: &lt;http://keplai.io/entity/&gt;
PREFIX ontology: &lt;http://keplai.io/ontology/&gt;
SELECT ?company WHERE {
  GRAPH ?g {
    entity:Mehdi ontology:founded ?company .
    ?company ontology:industry "AI" .
  }
}</code></pre></div><p>They want to ask: <em>&#8220;What AI companies did Mehdi found?&#8221;</em></p><p>Translating natural language to SPARQL is where I&#8217;ve spent the most engineering effort in KeplAI, because this is where naive LLM usage falls apart spectacularly.</p><p>The core problem: <strong>LLMs invent predicates.</strong> Ask GPT-4 to generate SPARQL for &#8220;when was Tom Hanks born?&#8221; and it&#8217;ll confidently produce <code>ontology:birthDate</code> -- even when your graph uses <code>ontology:bornOn</code>. The LLM&#8217;s world knowledge overrides the schema context you provided. It&#8217;s the same hallucination problem, just manifesting in a structured output.</p><p>KeplAI&#8217;s NLQ engine addresses this with a multi-stage pipeline:</p><ol><li><p><strong>Relation mapping</strong> -- Before generating SPARQL, an LLM call maps natural-language phrases (&#8220;works at,&#8221; &#8220;born in&#8221;) to actual predicates in your graph. This produces explicit mappings that get injected into the generation prompt.</p></li><li><p><strong>Entity resolution</strong> -- Entity mentions are extracted from the question and resolved against the graph via vector similarity. &#8220;Tom Hanks&#8221; becomes <code>entity:ThomasJeffreyHanks</code> (or whatever your graph uses).</p></li><li><p><strong>Schema-grounded generation</strong> -- The SPARQL generation prompt includes the full property list (with descriptions), the entity mappings, the relation mappings, and explicit instructions to use only listed predicates.</p></li><li><p><strong>Predicate validation</strong> -- After generation, every URI in the output SPARQL is checked against the allowed set. If the LLM invented <code>ontology:birthDate</code> when only <code>ontology:bornOn</code> exists, it&#8217;s flagged.</p></li><li><p><strong>Self-repair</strong> -- Invalid predicates trigger a repair call: &#8220;Your query uses these invalid predicates. Here are the allowed ones. 
Fix it.&#8221; This catches the 10-15% of queries where the LLM ignores the schema despite explicit instructions.</p></li></ol><p>Is it perfect? No. Complex multi-hop queries still trip up sometimes. But for the vast majority of questions a domain expert would ask, it produces correct SPARQL -- and it shows the generated query alongside the results, so power users can verify and edit.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q4RS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff354bcc-19da-4b25-bf18-d8ae269f7bfe_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q4RS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff354bcc-19da-4b25-bf18-d8ae269f7bfe_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!q4RS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff354bcc-19da-4b25-bf18-d8ae269f7bfe_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!q4RS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff354bcc-19da-4b25-bf18-d8ae269f7bfe_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!q4RS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff354bcc-19da-4b25-bf18-d8ae269f7bfe_1408x768.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q4RS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff354bcc-19da-4b25-bf18-d8ae269f7bfe_1408x768.jpeg" width="1408" 
height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ff354bcc-19da-4b25-bf18-d8ae269f7bfe_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:706231,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/192154633?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff354bcc-19da-4b25-bf18-d8ae269f7bfe_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q4RS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff354bcc-19da-4b25-bf18-d8ae269f7bfe_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!q4RS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff354bcc-19da-4b25-bf18-d8ae269f7bfe_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!q4RS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff354bcc-19da-4b25-bf18-d8ae269f7bfe_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!q4RS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fff354bcc-19da-4b25-bf18-d8ae269f7bfe_1408x768.jpeg 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">Credit: Author via Nano Banana</figcaption></figure></div><h2>Use Cases Where This Actually Matters</h2><h3>Research Knowledge Management</h3><p>This is closest to my own experience. A researcher imports domain ontologies (biomedical, legal, environmental), then feeds papers and documents through the extraction pipeline. The knowledge graph grows semi-automatically. They can ask &#8220;Which genes are associated with both diabetes and cardiovascular disease?&#8221; and get answers grounded in extracted data, with provenance tracking showing which paper each fact came from.</p><h3>Enterprise Knowledge Unification</h3><p>Organizations have knowledge trapped in wikis, Confluence pages, Slack threads, and people&#8217;s heads. A knowledge graph can unify this. 
Extract triples from documentation, connect them with organizational ontologies, and you have a queryable map of institutional knowledge. When someone leaves the company, their knowledge doesn&#8217;t leave with them.</p><h3>Compliance and Auditing</h3><p>Every triple in KeplAI can carry provenance: source document, extraction timestamp, method (manual vs. AI-extracted). In regulated industries -- healthcare, finance, legal -- being able to trace every fact back to its source isn&#8217;t a nice-to-have. It&#8217;s a requirement.</p><h3>AI-Powered Applications</h3><p>This is where I think the biggest opportunity lies. RAG (Retrieval-Augmented Generation) typically uses vector search over chunks of text. But knowledge graphs offer something vector search can&#8217;t: <strong>structured reasoning</strong>. Instead of retrieving &#8220;similar text,&#8221; you retrieve <em>facts</em> and their <em>logical connections</em>. An LLM grounded in a knowledge graph doesn&#8217;t just find relevant passages -- it follows relationship chains and provides answers with explicit provenance.</p><div><hr></div><h2>Hard-Won Technical Insights</h2><p>A few things I want to share from building this that aren&#8217;t obvious from the outside:</p><p><strong>Named graphs are essential but the ecosystem barely supports them.</strong> Almost every SPARQL tutorial teaches you to query the default graph. In practice, if you&#8217;re managing multiple ontologies with data isolation, everything lives in named graphs. This means every query needs a <code>GRAPH ?g { ... }</code> wrapper. Forget it, and you get zero results with zero helpful error messages. KeplAI auto-injects GRAPH clauses when the LLM forgets them -- a safety net born from painful debugging sessions.</p><p><strong>The LLM-to-SPARQL &#8220;last mile&#8221; is harder than NL-to-SQL.</strong> SQL has a relatively flat structure -- tables, columns, JOINs. 
SPARQL has PREFIX declarations, URI resolution, named graphs, blank nodes, OPTIONAL patterns, FILTER expressions, and the RDF data model underneath. The LLM needs to get all of this right simultaneously. The relation mapping and predicate validation layers exist because a single wrong URI means the query silently returns nothing.</p><p><strong>Ontologies compound in value over time.</strong> This is something you only appreciate after maintaining a knowledge graph for months or years. Early on, defining an ontology feels like overhead. Six months in, when auto-typing has correctly classified thousands of entities and inference has surfaced relationships you never explicitly stated, the upfront investment pays for itself many times over.</p><p><strong>Entity disambiguation remains the hardest unsolved problem.</strong> After years of working on this, I&#8217;m convinced that pure vector similarity gets you ~80% of the way. The remaining 20% -- contextual ambiguity, cross-language entities, temporal references -- requires hybrid approaches combining embeddings with graph structure and contextual reasoning. This is active work.</p><div><hr></div><h2>The Architecture</h2><p>For those who want to look under the hood:</p><p>KeplAI is SDK-first. The Python SDK (<code>keplai</code>) is the core -- it manages the graph, ontologies, and AI features and can be used standalone. The REST API (FastAPI) and Web UI (React + TypeScript) are layers on top.</p><p>The stack:</p><ul><li><p><strong>Apache Jena Fuseki</strong> -- standards-compliant RDF triplestore with SPARQL endpoint and OWL reasoning. 
The SDK manages Fuseki&#8217;s Docker lifecycle automatically.</p></li><li><p><strong>OpenAI GPT-4</strong> -- powers extraction, NL-to-SPARQL generation, and result explanation.</p></li><li><p><strong>Qdrant</strong> -- vector similarity search for entity disambiguation.</p></li><li><p><strong>OWL/RDFS reasoning</strong> -- built into the triplestore, so inference happens at the data layer, not the application layer.</p></li></ul><p>The web UI provides a dashboard, triple management with batch operations, an ontology editor with multi-ontology support, a text extraction interface, a natural language query page, and an interactive force-directed graph explorer.</p><p>Open source under Apache 2.0: <a href="https://github.com/mallahyari/keplai">github.com/mallahyari/keplai</a>.</p><div><hr></div><h2>The Bigger Picture</h2><p>I&#8217;ve been in the semantic web world long enough to have seen multiple hype cycles come and go. Linked Data. The original Semantic Web vision. Knowledge graphs in enterprise. Each wave brought real progress but never achieved mainstream adoption.</p><p>This time feels different, and the reason is LLMs.</p><p>The fundamental thesis: <strong>LLMs are great at understanding language but hallucinate facts. Knowledge graphs are great at storing facts but are hard to query naturally.</strong> They&#8217;re complementary in a way that&#8217;s almost too clean.</p><p>LLMs as the <em>interface</em> to structured knowledge -- with the knowledge graph providing ground truth, provenance, and reasoning, and the LLM providing the natural-language bridge -- is, I believe, how the next generation of knowledge-intensive AI applications will be built. Not chatbots that make things up, but systems that reason over verified facts and can show their work.</p><p>KeplAI is my attempt to make that vision practical today. 
It&#8217;s not finished -- the NLQ pipeline needs more work on complex queries, the disambiguation could be smarter, and there are a dozen features on the roadmap. But it&#8217;s usable now, and the best way to improve it is for people to use it and push against the edges.</p><p>If you&#8217;ve been curious about knowledge graphs but put off by the tooling, or if you&#8217;re a semantic web veteran wondering how LLMs fit into the picture, give it a try.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yOXJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee532710-6f99-467f-b576-6370930513b2_1408x768.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yOXJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee532710-6f99-467f-b576-6370930513b2_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!yOXJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee532710-6f99-467f-b576-6370930513b2_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!yOXJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee532710-6f99-467f-b576-6370930513b2_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!yOXJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee532710-6f99-467f-b576-6370930513b2_1408x768.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!yOXJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee532710-6f99-467f-b576-6370930513b2_1408x768.jpeg" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ee532710-6f99-467f-b576-6370930513b2_1408x768.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:876297,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/192154633?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee532710-6f99-467f-b576-6370930513b2_1408x768.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yOXJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee532710-6f99-467f-b576-6370930513b2_1408x768.jpeg 424w, https://substackcdn.com/image/fetch/$s_!yOXJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee532710-6f99-467f-b576-6370930513b2_1408x768.jpeg 848w, https://substackcdn.com/image/fetch/$s_!yOXJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee532710-6f99-467f-b576-6370930513b2_1408x768.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!yOXJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fee532710-6f99-467f-b576-6370930513b2_1408x768.jpeg 1456w" 
sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Credit: Author via Nano Banana</figcaption></figure></div><div><hr></div><p><em>KeplAI is open source and available at <a href="https://github.com/mallahyari/keplai">github.com/mallahyari/keplai</a>.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://mlnotes.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">The MLnotes Newsletter is a 
reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Why I Chose a 4,000-Line AI Assistant Over Ones With 430,000]]></title><description><![CDATA[In a world of bloated AI agents, the best personal assistant might be the one you can actually read.]]></description><link>https://mlnotes.substack.com/p/why-i-chose-a-4000-line-ai-assistant</link><guid isPermaLink="false">https://mlnotes.substack.com/p/why-i-chose-a-4000-line-ai-assistant</guid><dc:creator><![CDATA[Mehdi Allahyari]]></dc:creator><pubDate>Mon, 02 Mar 2026 15:27:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!QWkA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29a8451-0196-4e3c-98cf-aecff346b37a_800x524.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QWkA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29a8451-0196-4e3c-98cf-aecff346b37a_800x524.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QWkA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29a8451-0196-4e3c-98cf-aecff346b37a_800x524.gif 424w, 
https://substackcdn.com/image/fetch/$s_!QWkA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29a8451-0196-4e3c-98cf-aecff346b37a_800x524.gif 848w, https://substackcdn.com/image/fetch/$s_!QWkA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29a8451-0196-4e3c-98cf-aecff346b37a_800x524.gif 1272w, https://substackcdn.com/image/fetch/$s_!QWkA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29a8451-0196-4e3c-98cf-aecff346b37a_800x524.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QWkA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29a8451-0196-4e3c-98cf-aecff346b37a_800x524.gif" width="800" height="524" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a29a8451-0196-4e3c-98cf-aecff346b37a_800x524.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:524,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:819780,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/189607886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29a8451-0196-4e3c-98cf-aecff346b37a_800x524.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QWkA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29a8451-0196-4e3c-98cf-aecff346b37a_800x524.gif 424w, 
https://substackcdn.com/image/fetch/$s_!QWkA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29a8451-0196-4e3c-98cf-aecff346b37a_800x524.gif 848w, https://substackcdn.com/image/fetch/$s_!QWkA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29a8451-0196-4e3c-98cf-aecff346b37a_800x524.gif 1272w, https://substackcdn.com/image/fetch/$s_!QWkA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa29a8451-0196-4e3c-98cf-aecff346b37a_800x524.gif 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p>If you&#8217;ve been paying attention to the open-source AI space lately, you&#8217;ve probably noticed the explosion of personal AI assistants. It started with <a href="https://github.com/openclaw/openclaw">OpenClaw</a> &#8212; a powerful, feature-rich agent that took GitHub by storm with 240k+ stars. Then came the inevitable: everyone wanted a lighter version.</p><p>ZeroClaw rewrote it in Rust. PicoClaw rebuilt it in Go for $10 hardware. IronClaw focused on security with WebAssembly sandboxing. NanoClaw stripped it to 500 lines.</p><p>I tried several of them. And then I found <strong><a href="https://github.com/HKUDS/nanobot">Nanobot</a></strong> &#8212; a 4,000-line Python alternative &#8212; and stopped looking.</p><p>Here&#8217;s why.</p><h2>The Problem With OpenClaw</h2><p>Don&#8217;t get me wrong &#8212; OpenClaw is impressive. It supports dozens of messaging platforms, has 100+ skills, and can genuinely automate complex workflows. 
But it comes at a cost:</p><ul><li><p><strong>430,000+ lines</strong> of TypeScript</p></li><li><p><strong>70+ dependencies</strong></p></li><li><p><strong>52+ modules</strong> and 53 config files</p></li><li><p>Individual files that run thousands of lines deep</p></li></ul><p>I wanted a personal assistant I could <em>own</em> &#8212; not just use, but understand end-to-end, modify confidently, and extend without fear of breaking some distant module I&#8217;d never read. OpenClaw wasn&#8217;t that.</p><h2>The Alternatives Didn&#8217;t Quite Fit Either</h2><p>The community responded with lighter alternatives, and they&#8217;re genuinely impressive:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yuRf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924e32ae-351f-4913-b63c-6373a9bfc861_1328x520.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yuRf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924e32ae-351f-4913-b63c-6373a9bfc861_1328x520.png 424w, https://substackcdn.com/image/fetch/$s_!yuRf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924e32ae-351f-4913-b63c-6373a9bfc861_1328x520.png 848w, https://substackcdn.com/image/fetch/$s_!yuRf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924e32ae-351f-4913-b63c-6373a9bfc861_1328x520.png 1272w, https://substackcdn.com/image/fetch/$s_!yuRf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924e32ae-351f-4913-b63c-6373a9bfc861_1328x520.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yuRf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924e32ae-351f-4913-b63c-6373a9bfc861_1328x520.png" width="1328" height="520" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/924e32ae-351f-4913-b63c-6373a9bfc861_1328x520.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:520,&quot;width&quot;:1328,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:95748,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/189607886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924e32ae-351f-4913-b63c-6373a9bfc861_1328x520.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yuRf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924e32ae-351f-4913-b63c-6373a9bfc861_1328x520.png 424w, https://substackcdn.com/image/fetch/$s_!yuRf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924e32ae-351f-4913-b63c-6373a9bfc861_1328x520.png 848w, https://substackcdn.com/image/fetch/$s_!yuRf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924e32ae-351f-4913-b63c-6373a9bfc861_1328x520.png 1272w, 
https://substackcdn.com/image/fetch/$s_!yuRf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F924e32ae-351f-4913-b63c-6373a9bfc861_1328x520.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>But here&#8217;s my issue: I think in Python. My entire ML/AI workflow is Python. When I look at a Rust codebase, I can appreciate the engineering &#8212; but I can&#8217;t casually add a new tool on a Saturday morning. 
And the 500-line options, while elegant, sacrifice too many features to be practical as a daily driver.</p><h2>Enter Nanobot: The Sweet Spot</h2><p>Nanobot delivers the core functionality of OpenClaw in roughly <strong>4,000 lines of Python</strong>. That&#8217;s 99% smaller. But the line count isn&#8217;t the real story &#8212; what matters is that you can read the <em>entire codebase in an afternoon</em> and understand every decision.</p><p>Here&#8217;s what&#8217;s packed into those 4,000 lines:</p><ul><li><p><strong>13 chat channels</strong> &#8212; Telegram, Discord, Slack, WhatsApp, Email, Matrix, and more</p></li><li><p><strong>19 LLM providers</strong> &#8212; Anthropic, OpenAI, DeepSeek, Groq, Gemini, OpenRouter...</p></li><li><p><strong>13 built-in tools</strong> &#8212; file ops, shell, web search, scheduled tasks, background agents</p></li><li><p><strong>Persistent memory</strong> &#8212; the bot remembers you across conversations</p></li><li><p><strong>A skill system</strong> &#8212; drop a markdown file in a folder and your bot gains new abilities</p></li><li><p><strong>MCP support</strong> &#8212; plug into the Model Context Protocol ecosystem</p></li></ul><p>And it&#8217;s not just small &#8212; it&#8217;s <em>clean</em>. 
The architecture follows a simple principle: <strong>channels don&#8217;t know about the AI, and the AI doesn&#8217;t know about channels.</strong> Everything talks through a message bus.</p><h2>The Architecture (in 30 Seconds)</h2><p>The whole system is three layers connected by async queues:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dSgi!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd492d0bd-2127-4334-8ab6-d10fd9b1f606_5057x1830.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dSgi!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd492d0bd-2127-4334-8ab6-d10fd9b1f606_5057x1830.png 424w, https://substackcdn.com/image/fetch/$s_!dSgi!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd492d0bd-2127-4334-8ab6-d10fd9b1f606_5057x1830.png 848w, https://substackcdn.com/image/fetch/$s_!dSgi!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd492d0bd-2127-4334-8ab6-d10fd9b1f606_5057x1830.png 1272w, https://substackcdn.com/image/fetch/$s_!dSgi!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd492d0bd-2127-4334-8ab6-d10fd9b1f606_5057x1830.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!dSgi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd492d0bd-2127-4334-8ab6-d10fd9b1f606_5057x1830.png" width="1456" height="527" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d492d0bd-2127-4334-8ab6-d10fd9b1f606_5057x1830.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:527,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:370838,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/189607886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd492d0bd-2127-4334-8ab6-d10fd9b1f606_5057x1830.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dSgi!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd492d0bd-2127-4334-8ab6-d10fd9b1f606_5057x1830.png 424w, https://substackcdn.com/image/fetch/$s_!dSgi!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd492d0bd-2127-4334-8ab6-d10fd9b1f606_5057x1830.png 848w, https://substackcdn.com/image/fetch/$s_!dSgi!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd492d0bd-2127-4334-8ab6-d10fd9b1f606_5057x1830.png 1272w, https://substackcdn.com/image/fetch/$s_!dSgi!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd492d0bd-2127-4334-8ab6-d10fd9b1f606_5057x1830.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>A message comes in from any platform. The bus routes it to the agent loop. 
The agent builds a prompt (with your conversation history, long-term memory, and any active skills), calls the LLM, executes tools if needed, and sends the response back through the bus to whichever platform you&#8217;re on.</p><p>Here&#8217;s what a single request looks like end-to-end:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NWUl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06af3f3e-6331-41e0-9fd6-22d9dda038a5_3465x1650.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NWUl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06af3f3e-6331-41e0-9fd6-22d9dda038a5_3465x1650.png 424w, https://substackcdn.com/image/fetch/$s_!NWUl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06af3f3e-6331-41e0-9fd6-22d9dda038a5_3465x1650.png 848w, https://substackcdn.com/image/fetch/$s_!NWUl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06af3f3e-6331-41e0-9fd6-22d9dda038a5_3465x1650.png 1272w, https://substackcdn.com/image/fetch/$s_!NWUl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06af3f3e-6331-41e0-9fd6-22d9dda038a5_3465x1650.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NWUl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06af3f3e-6331-41e0-9fd6-22d9dda038a5_3465x1650.png" width="1456" height="693" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/06af3f3e-6331-41e0-9fd6-22d9dda038a5_3465x1650.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:693,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:258127,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/189607886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06af3f3e-6331-41e0-9fd6-22d9dda038a5_3465x1650.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NWUl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06af3f3e-6331-41e0-9fd6-22d9dda038a5_3465x1650.png 424w, https://substackcdn.com/image/fetch/$s_!NWUl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06af3f3e-6331-41e0-9fd6-22d9dda038a5_3465x1650.png 848w, https://substackcdn.com/image/fetch/$s_!NWUl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06af3f3e-6331-41e0-9fd6-22d9dda038a5_3465x1650.png 1272w, https://substackcdn.com/image/fetch/$s_!NWUl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F06af3f3e-6331-41e0-9fd6-22d9dda038a5_3465x1650.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>That&#8217;s it. The entire flow.</p><h2>What Makes It Easy to Customize</h2><h3>Adding a tool is ~30 lines</h3><p>Want your assistant to do something new? Every tool is a small class with one method:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;cc58b9d7-c926-4814-8a97-4d15c0090ec1&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">class MyTool(Tool):
    name = "check_stocks"
    description = "Check current stock prices"
    parameters = {
        "type": "object",
        "properties": {
            "symbol": {"type": "string", "description": "Stock ticker symbol"}
        },
        "required": ["symbol"]
    }

    async def execute(self, params: dict) -&gt; str:
        symbol = params["symbol"]
        # Your logic here &#8212; call an API, query a database, anything
        # fetch_stock_price is a placeholder: define your own async helper
        price = await fetch_stock_price(symbol)
        return f"{symbol}: ${price}"</code></pre></div><p>Register it in one line, and your bot can now check stock prices. No plugin system to learn, no config files to wrestle with, no build step.</p><h3>Adding a skill is just a markdown file</h3><p>Skills inject knowledge into your bot&#8217;s system prompt. Want your assistant to be an expert at managing your home automation?</p><p>Create <code>~/.nanobot/workspace/skills/smart-home/SKILL.md</code>:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;markdown&quot;,&quot;nodeId&quot;:&quot;50bde063-d716-4e48-9fd1-b111961b56dc&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-markdown"># Smart Home Assistant

You can control the user's home devices using the exec tool.
- Lights: `hue-cli set &lt;room&gt; &lt;brightness&gt;`
- Thermostat: `nest-cli temp &lt;degrees&gt;`
- Always confirm before changing settings.</code></pre></div><p>That&#8217;s a skill. Drop the file in the folder, and your bot knows how to manage your smart home on the next message.</p><h3>The memory system is beautifully simple</h3><p>Nanobot keeps two files:</p><ul><li><p><code>MEMORY.md</code> &#8212; long-term facts (&#8220;User prefers Python. Lives in San Francisco. Works on ML projects.&#8221;)</p></li><li><p><code>HISTORY.md</code> &#8212; timestamped event log</p></li></ul><p><code>MEMORY.md</code> gets injected into every prompt, so your assistant always knows who you are. When conversations get long, the agent summarizes old messages and saves the important bits. No vector database, no embeddings, no infrastructure &#8212; just markdown files you can read and edit yourself.</p><h2>The Honest Trade-offs</h2><p>Nanobot isn&#8217;t trying to replace OpenClaw for everyone. Here&#8217;s where it intentionally makes different choices:</p><ul><li><p><strong>No GUI</strong> &#8212; it&#8217;s channels + CLI. If you want a web dashboard, look elsewhere.</p></li><li><p><strong>Fewer built-in skills</strong> &#8212; OpenClaw has 100+; Nanobot has 8 (but adding your own is trivial).</p></li><li><p><strong>Single-process</strong> &#8212; it&#8217;s one Python process, not a distributed system. For a personal assistant, that&#8217;s a feature, not a bug.</p></li><li><p><strong>File-based storage</strong> &#8212; no database. Sessions are JSONL, memory is markdown, config is JSON. Simple and portable, but not built for multi-user scale.</p></li></ul><p>For a personal assistant that <em>you</em> run for <em>yourself</em>, these trade-offs are exactly right.</p><h2>Why This Matters</h2><p>There&#8217;s a broader point here that goes beyond any single project.</p><p>We&#8217;re in an era where AI tools are getting more powerful every month. The models improve, the APIs get cheaper, the capabilities expand. 
In this environment, <strong>the best AI assistant isn&#8217;t the one with the most features &#8212; it&#8217;s the one you can evolve alongside the models.</strong></p><p>When Claude gets better at reasoning, I want to adjust my agent&#8217;s tool execution strategy. When a new MCP server drops for my favorite service, I want to plug it in. When I have a weird personal workflow that no one else would build a plugin for, I want to code it myself in an hour.</p><p>That requires a codebase I can hold in my head. For me, that&#8217;s 4,000 lines of Python &#8212; not 430,000 lines of TypeScript or a Rust binary I&#8217;d need to recompile.</p><h2>Getting Started</h2><p>If this resonates, here&#8217;s the quickest path:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;bash&quot;,&quot;nodeId&quot;:&quot;f7473c3d-ab1b-46e4-a16b-84a6206cb3b3&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-bash"># Install
pip install nanobot-ai

# First-time setup (creates config + workspace)
nanobot onboard

# Chat interactively
nanobot agent

# Or run as a multi-channel gateway
nanobot gateway</code></pre></div><p>Edit <code>~/.nanobot/config.json</code> to add your API keys and enable channels.</p><h2>Final Thought</h2><p>The <em>claw family</em> has given us incredible options. OpenClaw proved the concept. ZeroClaw and IronClaw showed us what&#8217;s possible in Rust. PicoClaw proved AI agents can run on tiny hardware.</p><p>Nanobot showed me that sometimes the most powerful tool is the one simple enough to be truly <em>yours</em>.</p><div><hr></div><p><em>If you&#8217;re building your own personal AI assistant or have thoughts on the claw ecosystem, I&#8217;d love to hear from you. Drop a comment or reach out; I&#8217;m always happy to talk about this stuff.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://mlnotes.substack.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">The MLnotes Newsletter is a reader-supported publication. 
To receive new posts and support my work, consider becoming a free or paid subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Building a Production-Ready SQL Agent with LangGraph]]></title><description><![CDATA[How to turn natural language into safe, accurate SQL queries using a multi-node agentic workflow]]></description><link>https://mlnotes.substack.com/p/building-a-production-ready-sql-agent</link><guid isPermaLink="false">https://mlnotes.substack.com/p/building-a-production-ready-sql-agent</guid><dc:creator><![CDATA[Mehdi Allahyari]]></dc:creator><pubDate>Sat, 21 Feb 2026 14:02:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!_dp6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6dcbfbc-800d-4671-93f4-8d1861541254_800x1285.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_dp6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6dcbfbc-800d-4671-93f4-8d1861541254_800x1285.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_dp6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6dcbfbc-800d-4671-93f4-8d1861541254_800x1285.gif 424w, 
https://substackcdn.com/image/fetch/$s_!_dp6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6dcbfbc-800d-4671-93f4-8d1861541254_800x1285.gif 848w, https://substackcdn.com/image/fetch/$s_!_dp6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6dcbfbc-800d-4671-93f4-8d1861541254_800x1285.gif 1272w, https://substackcdn.com/image/fetch/$s_!_dp6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6dcbfbc-800d-4671-93f4-8d1861541254_800x1285.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_dp6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6dcbfbc-800d-4671-93f4-8d1861541254_800x1285.gif" width="800" height="1285" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b6dcbfbc-800d-4671-93f4-8d1861541254_800x1285.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1285,&quot;width&quot;:800,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1945227,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/188568231?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6dcbfbc-800d-4671-93f4-8d1861541254_800x1285.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!_dp6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6dcbfbc-800d-4671-93f4-8d1861541254_800x1285.gif 424w, https://substackcdn.com/image/fetch/$s_!_dp6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6dcbfbc-800d-4671-93f4-8d1861541254_800x1285.gif 848w, https://substackcdn.com/image/fetch/$s_!_dp6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6dcbfbc-800d-4671-93f4-8d1861541254_800x1285.gif 1272w, https://substackcdn.com/image/fetch/$s_!_dp6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb6dcbfbc-800d-4671-93f4-8d1861541254_800x1285.gif 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>Most tutorials show you a single LLM call that converts a question into SQL. That works on toy examples. It falls apart the moment a real user touches it.</p><p>Real databases have dozens of tables. Real users ask vague, ambiguous questions. And real SQL agents need to handle errors gracefully, retry intelligently, stream responses in real time, and never, ever run a <code>DROP TABLE</code> on your production database.</p><p>In this post, I&#8217;ll walk you through how I designed and built a production-ready SQL agent, covering the architecture, the key design decisions, and the lessons learned along the way. 
We&#8217;ll go light on boilerplate and heavy on the <em>why</em>.</p><p>The full code (backend + frontend with live visualizations) is on GitHub: <strong><a href="https://github.com/mallahyari/langgraph-sql-agent">github.com/mallahyari/langgraph-sql-agent</a></strong></p><div><hr></div><h2>The Stack</h2><ul><li><p><strong>LangGraph</strong>: multi-node agent orchestration with conditional routing and retry loops</p></li><li><p><strong>GPT-4o</strong>: reasoning, SQL generation, and natural language synthesis</p></li><li><p><strong>FastAPI + Server-Sent Events</strong>: async streaming API</p></li><li><p><strong>SQLite</strong>: database backend (the architecture is database-agnostic)</p></li><li><p><strong>React + Vega-Lite</strong>: frontend with auto-generated charts</p></li></ul><div><hr></div><h2>Why a Multi-Agent Approach?</h2><p>The naive approach is one LLM call: <em>&#8220;Here&#8217;s my schema. Here&#8217;s the user&#8217;s question. Write SQL.&#8221;</em></p><p>This breaks in predictable ways:</p><ul><li><p>The LLM picks the wrong tables when your database has many of them</p></li><li><p>Vague questions like &#8220;show me sales&#8221; generate ambiguous queries</p></li><li><p>When a query fails at runtime, there&#8217;s no recovery path</p></li><li><p>You have zero visibility into what went wrong or why</p></li></ul><p>The insight is that text-to-SQL isn&#8217;t one problem; it&#8217;s several smaller ones chained together. 
When you decompose it into <strong>specialized nodes</strong>, each with a single job, you get a system that&#8217;s easier to debug, easier to improve, and resilient to failure.</p><p>Think of it as a pipeline of domain experts:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!PfPu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63513a84-eecf-4017-88f1-5f49194433d6_1494x884.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!PfPu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63513a84-eecf-4017-88f1-5f49194433d6_1494x884.png 424w, https://substackcdn.com/image/fetch/$s_!PfPu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63513a84-eecf-4017-88f1-5f49194433d6_1494x884.png 848w, https://substackcdn.com/image/fetch/$s_!PfPu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63513a84-eecf-4017-88f1-5f49194433d6_1494x884.png 1272w, https://substackcdn.com/image/fetch/$s_!PfPu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63513a84-eecf-4017-88f1-5f49194433d6_1494x884.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!PfPu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63513a84-eecf-4017-88f1-5f49194433d6_1494x884.png" width="1456" height="862" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/63513a84-eecf-4017-88f1-5f49194433d6_1494x884.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:862,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:140306,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/188568231?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63513a84-eecf-4017-88f1-5f49194433d6_1494x884.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!PfPu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63513a84-eecf-4017-88f1-5f49194433d6_1494x884.png 424w, https://substackcdn.com/image/fetch/$s_!PfPu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63513a84-eecf-4017-88f1-5f49194433d6_1494x884.png 848w, https://substackcdn.com/image/fetch/$s_!PfPu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63513a84-eecf-4017-88f1-5f49194433d6_1494x884.png 1272w, https://substackcdn.com/image/fetch/$s_!PfPu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63513a84-eecf-4017-88f1-5f49194433d6_1494x884.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Each of these is a node in a <strong>LangGraph StateGraph</strong>. 
The graph connects them with conditional edges that handle routing, retries, and error recovery automatically.</p><div><hr></div><h2>The Architecture</h2><p>Here&#8217;s the full workflow:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qIfs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc106cbc-4e87-4e18-8c5c-a0360f45c49d_1297x3187.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qIfs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc106cbc-4e87-4e18-8c5c-a0360f45c49d_1297x3187.png 424w, https://substackcdn.com/image/fetch/$s_!qIfs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc106cbc-4e87-4e18-8c5c-a0360f45c49d_1297x3187.png 848w, https://substackcdn.com/image/fetch/$s_!qIfs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc106cbc-4e87-4e18-8c5c-a0360f45c49d_1297x3187.png 1272w, https://substackcdn.com/image/fetch/$s_!qIfs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc106cbc-4e87-4e18-8c5c-a0360f45c49d_1297x3187.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qIfs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc106cbc-4e87-4e18-8c5c-a0360f45c49d_1297x3187.png" width="1297" height="3187" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fc106cbc-4e87-4e18-8c5c-a0360f45c49d_1297x3187.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:3187,&quot;width&quot;:1297,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:225560,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/188568231?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc106cbc-4e87-4e18-8c5c-a0360f45c49d_1297x3187.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qIfs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc106cbc-4e87-4e18-8c5c-a0360f45c49d_1297x3187.png 424w, https://substackcdn.com/image/fetch/$s_!qIfs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc106cbc-4e87-4e18-8c5c-a0360f45c49d_1297x3187.png 848w, https://substackcdn.com/image/fetch/$s_!qIfs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc106cbc-4e87-4e18-8c5c-a0360f45c49d_1297x3187.png 1272w, https://substackcdn.com/image/fetch/$s_!qIfs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffc106cbc-4e87-4e18-8c5c-a0360f45c49d_1297x3187.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>The <strong>retry loop</strong> is the key architectural decision: if the SQL fails validation <em>or</em> fails at runtime, the graph routes back to the SQL generator with the error message injected into the conversation. The LLM sees what went wrong and corrects itself, retrying up to 3 times before giving up gracefully.</p><p>This is what separates a demo from a system you can actually deploy.</p><div><hr></div><h2>How LangGraph Orchestrates It</h2><p>LangGraph is built on the concept of a <code>StateGraph</code>, a directed graph where each node reads from and writes to a shared state object. 
Here&#8217;s the shape of that state:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;d7cc414c-a5eb-4da8-a518-b9cee3415065&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">import operator
from typing import Annotated, Any, Dict, List, Optional, TypedDict


class AgentState(TypedDict):
    user_query: str
    refined_query: Optional[str]
    relevance: str
    selected_tables: List[str]
    generated_sql: str
    query_result: List[Dict[str, Any]]
    query_error: Optional[str]
    is_valid_sql: bool
    retry_count: int
    validation_error: Optional[str]
    natural_response: str
    needs_visualization: bool
    visualization_spec: Optional[Dict[str, Any]]
    logs: Annotated[List[str], operator.add]
    steps: Annotated[List[str], operator.add]
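
# --- Illustrative sketch, not from the original project ---
# Assumption: for a key annotated with operator.add, LangGraph combines the
# value already in state with a node's update by calling the reducer, which
# for lists means concatenation. merge_update is a hypothetical helper that
# mimics that single step; it is not a LangGraph API.
import operator  # repeated so the sketch runs on its own


def merge_update(current, update):
    """Apply the list-concatenating reducer to one state key."""
    return operator.add(current, update)


logs_so_far = ["relevance_checker: relevant"]
logs_after_next_node = merge_update(logs_so_far, ["table_selector: 2 tables"])
# logs_after_next_node holds both entries in order; logs_so_far is unchanged.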
</code></pre></div><p>One thing worth calling out is the <code>Annotated[List[str], operator.add]</code> annotation on <code>logs</code> and <code>steps</code>. It tells LangGraph to <em>append</em> to these lists rather than overwrite them. Every node contributes its own entries, and you end up with a full audit trail of what happened at each step, which is invaluable for debugging.</p><p>Routing between nodes is handled by plain Python functions:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;687c7536-c10c-49e4-8648-5d312e945f8e&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">MAX_RETRIES = 3  # retry cap, matching the "up to 3 times" described above

def route_after_execution(state: AgentState) -&gt; str:
    # Retry only while there is an error and the cap has not been reached.
    if state.get("query_error") and state.get("retry_count", 0) &lt; MAX_RETRIES:
        return "sql_generator"
    return "response_synthesizer"</code></pre></div><p>Simple, readable, and easy to change. The orchestration logic lives in the graph definition, completely separate from the agent implementations.</p><div><hr></div><h2>Key Design Decisions</h2><h3>1. Use Function Calling for Structured Decisions</h3><p>Any time a node needs to produce a categorical output (&#8220;is this relevant?&#8221;, &#8220;which tables?&#8221;, &#8220;does this need a chart?&#8221;), use OpenAI&#8217;s function calling with an <code>enum</code> constraint instead of parsing free text.</p><p>Free-text parsing is fragile. The model might say &#8220;Yes, this is relevant&#8221; or just &#8220;Relevant&#8221; or &#8220;I believe the query relates to...&#8221;, and you end up writing brittle string matching. Function calling returns the decision as a structured argument restricted to a fixed set of values:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;7c5c950c-62d0-4f59-95b9-03b858336d7c&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python">"relevance": {
    "type": "string",
    "enum": ["relevant", "irrelevant"]
}</code></pre></div><p>One call, and the result is one of two known values (enable strict mode on the function schema if you need the API to enforce this). Use this pattern for every decision node in your graph.</p><h3>2. Always Ground-Truth Check LLM Outputs</h3><p>The table selector asks the LLM which tables are needed, then does this:</p><div class="highlighted_code_block" data-attrs="{&quot;language&quot;:&quot;python&quot;,&quot;nodeId&quot;:&quot;3d464f12-7c51-47a4-9cbe-26f37522a01a&quot;}" data-component-name="HighlightedCodeBlockToDOM"><pre class="shiki"><code class="language-python"># Drop any table name the LLM returned that is not in the real schema.
valid_tables = [t for t in args["tables"] if t in all_tables]</code></pre></div><p>That one line prevents a surprisingly common failure: LLMs confidently hallucinate table names that don&#8217;t exist. Always cross-reference any LLM output that refers to real-world entities (table names, column names, file paths) against the actual ground truth before passing it downstream.</p><p>The same principle applies to column names in the SQL generator: feed the LLM the exact <code>CREATE TABLE</code> schema, not a description of it.</p><h3>3. Raise Temperature on Retries</h3><p>At <code>temperature=0</code>, GPT-4o is close to deterministic. If the first SQL query fails and you retry with the same prompt and temperature, you will most likely get the same broken query. Set temperature to <code>0.3</code> on retries: just enough variation that the model explores a different approach, without going off the rails.</p><h3>4. Inject Error Context Into the Retry Conversation</h3><p>On retry, don&#8217;t just run the node again. Pass the failed query <em>and</em> the error message back to the SQL generator as prior conversation turns:</p><pre><code><code>System: [schema + instructions]

User: [original query]

Assistant: [the broken SQL that failed]

User: "That query failed with: no such column 'TotalRevenue'. Please fix it."</code></code></pre><p>The LLM now has the full context of what it tried and why it failed. This gives the retry a much better chance of succeeding than starting fresh.</p><h3>5. Validate SQL Before It Touches Your Database</h3><p>A safety layer that checks for destructive keywords (<code>DROP</code>, <code>DELETE</code>, <code>TRUNCATE</code>, <code>UPDATE</code>, etc.) using regex word boundaries is non-negotiable. This catches both accidental and malicious write attempts before anything reaches the database.</p><pre><code><code>pattern = r'\b' + keyword + r'\b'</code></code></pre><p>Word boundaries matter: without them, a column named <code>dropship_count</code> would trigger a false positive.</p><h3>6. Truncate Large Results Before Synthesis</h3><p>Some queries return thousands of rows. Feeding all of them to the synthesizer is expensive and often exceeds context limits. Truncate to a reasonable character limit before synthesis. The synthesizer&#8217;s job is to <em>summarize</em>, not to enumerate every row, so you rarely lose anything meaningful.</p><h3>7. Thread IDs for User Session Isolation</h3><p>LangGraph&#8217;s <code>MemorySaver</code> checkpointer scopes conversation history to a <code>thread_id</code>. Pass a unique ID per user session, and multiple concurrent users can share one deployment with fully isolated state. No session collisions, no state bleed-through.</p><p>For production, swap <code>MemorySaver</code> for a persistent checkpointer backed by Redis or Postgres so history survives server restarts.</p><div><hr></div><h2>Streaming: Why It Matters More Than You Think</h2><p>A SQL query can take 3-5 seconds end-to-end. 
Without streaming, your user stares at a blank screen and assumes something broke.</p><p>The architecture uses two streaming modes simultaneously via LangGraph&#8217;s <code>astream</code>:</p><ul><li><p><code>updates</code><strong> mode</strong>: fires after each node completes, sending structured data (SQL generated, rows returned, etc.) that the frontend can display progressively</p></li><li><p><code>custom</code><strong> mode</strong>: fires for every LLM token the synthesizer emits, enabling word-by-word text streaming</p></li></ul><p>The result is a UI that feels responsive and alive even during a multi-second pipeline. The user sees the SQL appear, then the row count, then the explanation streaming in, then the chart, all without waiting for the full workflow to finish.</p><p>This is delivered over <strong>Server-Sent Events (SSE)</strong> via a FastAPI <code>StreamingResponse</code>. SSE is simpler than WebSockets for this use case: unidirectional, HTTP-native, no protocol-upgrade handshake.</p><p>One gotcha: if you run behind Nginx, add <code>X-Accel-Buffering: no</code> to your response headers. Without it, Nginx buffers the response and defeats the purpose of streaming.</p><div><hr></div><h2>The Visualization Layer</h2><p>One of the more interesting parts of the system is the automatic chart generation. After synthesizing a text response, a <strong>visualization planner</strong> node decides whether a chart would add value, checking for explicit requests (&#8220;plot this&#8221;, &#8220;show a chart&#8221;) and analyzing the data shape (time series, categorical breakdown, single values).</p><p>If a visualization makes sense, a <strong>visualization generator</strong> node calls GPT-4o in JSON mode to produce a <strong>Vega-Lite spec</strong> directly from the query results. 
The frontend renders it instantly.</p><p>This means a query like &#8220;show me monthly revenue over the past year&#8221; automatically produces a line chart, with no hardcoded chart types and no user configuration required. The LLM figures out the right encoding, axes, and title from the data itself.</p><p>Vega-Lite is a great fit here for three reasons: it&#8217;s declarative JSON, the LLM can produce it reliably in JSON mode, and the frontend renders it without needing to interpret any code.</p><div><hr></div><h2>What the Frontend Does</h2><p>The backend streams JSON events. The React frontend listens and progressively renders:</p><ol><li><p>A &#8220;thinking&#8221; indicator while nodes are running</p></li><li><p>The generated SQL in a syntax-highlighted code block (so users can inspect it)</p></li><li><p>The row count and raw data in a table</p></li><li><p>The natural language explanation, streaming token by token</p></li><li><p>The Vega-Lite chart, rendered automatically when the visualization spec arrives</p></li></ol><p>Showing the SQL is a deliberate choice. It builds trust: users can see exactly what query was run, verify it makes sense, and catch errors. It also makes the system feel transparent rather than like a black box.</p><div><hr></div><h2>Common Failure Modes (And How to Avoid Them)</h2><p><strong>Hallucinated column names</strong>. The LLM makes up a column that doesn&#8217;t exist. Fix: always pass the exact <code>CREATE TABLE</code> DDL as schema context, not a paraphrased description.</p><p><strong>Overly broad table selection</strong>. The table selector includes every table &#8220;just in case,&#8221; bloating the schema context. Fix: require the LLM to justify each selected table, or use strict function calling with an explicit count limit.</p><p><strong>ORDER BY placement in UNION queries</strong>. SQLite requires <code>ORDER BY</code> to appear after the final <code>SELECT</code> in a <code>UNION ALL</code>. 
The LLM often puts it in the wrong place. Fix: add this as an explicit rule in the SQL generator&#8217;s system prompt.</p><p><strong>Markdown in SQL output</strong>. The model sometimes wraps its SQL in <code>```sql ```</code> fences. Always strip these before executing.</p><p><strong>Retry storms</strong>. Without a hard retry cap, a broken query can loop indefinitely. Enforce a maximum retry count in state and route to a graceful error response when it is exceeded.</p><div><hr></div><h2>Going Further</h2><p>This architecture is a foundation. A few directions worth exploring:</p><p><strong>Scale to larger schemas</strong>. With hundreds of tables, feeding even table names to the LLM becomes unwieldy. Add a vector similarity search step to retrieve semantically relevant tables before the table selector node runs.</p><p><strong>Row-level security</strong>. The executor currently runs all queries with the same database credentials. In a multi-tenant system, inject user-specific filters or use separate database roles per user.</p><p><strong>Human-in-the-loop</strong>. LangGraph has native support for breakpoints via <code>interrupt_before</code>, so you can pause the graph before execution and require a human to approve the generated SQL. This is useful in high-stakes environments.</p><p><strong>Swap the LLM</strong>. The OpenAI client lives in a single service file. Replace it with Anthropic&#8217;s Claude, a local Ollama model, or any OpenAI-compatible endpoint. The rest of the system stays the same.</p><p><strong>Persistent memory</strong>. Right now, conversation history lives in memory and resets on restart. 
Replace <code>MemorySaver</code> with LangGraph&#8217;s <code>AsyncPostgresSaver</code> for durable, multi-session memory.</p><div><hr></div><h2>The Full Picture</h2><p>What makes this system work in practice isn&#8217;t any single clever trick; it&#8217;s the combination:</p><ul><li><p><strong>Specialization</strong>: each node does one thing and does it well</p></li><li><p><strong>Retry with context</strong>: the graph corrects its own mistakes with full error awareness</p></li><li><p><strong>Safety by design</strong>: SQL is validated before it ever reaches the database</p></li><li><p><strong>Streaming at every layer</strong>: users see progress, not spinners</p></li><li><p><strong>Grounded outputs</strong>: LLM results are always checked against reality before use</p></li><li><p><strong>Separation of concerns</strong>: orchestration logic lives in the graph, not in the nodes</p></li></ul><p>The LangGraph pattern (decompose into nodes, connect with conditional edges, add retry loops) generalizes far beyond SQL. Any problem where a single LLM call is too brittle is a candidate for this architecture.</p><div><hr></div><p>The complete code for both the backend and the React frontend with live Vega-Lite visualization is available here:</p><p><strong><a href="https://github.com/mallahyari/langgraph-sql-agent">github.com/mallahyari/langgraph-sql-agent</a></strong></p><p>If you build something with this or adapt it for your own database, I&#8217;d love to hear what you changed and why.</p>]]></content:encoded></item><item><title><![CDATA[GenAI Week 2025: The AI Event of the Decade You Can't Afford to Miss]]></title><description><![CDATA[Free ticket available for MLnotes audience!]]></description><link>https://mlnotes.substack.com/p/genai-week-2025-the-ai-event-of-the</link><guid isPermaLink="false">https://mlnotes.substack.com/p/genai-week-2025-the-ai-event-of-the</guid><dc:creator><![CDATA[Angelina Yang]]></dc:creator><pubDate>Sun, 06 Jul 2025 16:07:36 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/GB3Vz-mdzYw" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="pullquote"><p><em>"From Stanford professors to OpenAI scientists, unicorn founders to quantum-tech pioneers, the future of AI is gathering in one place, and you're invited."</em></p></div><p>Mark your calendars for <strong>July 13-17, 2025</strong>, because the epicenter of the AI world will be at the Santa Clara Convention Center. GenAI Week Silicon Valley 2025 is shaping up to be the largest, most influential AI gathering of the decade. 
With over 30,000 attendees expected, this event offers a glimpse into the future of artificial intelligence and its transformative impact across every industry.</p><h2>An Unprecedented Lineup of AI Luminaries</h2><p>The speaker list reads like a who's who of AI innovation:</p><ul><li><p><strong><a href="https://www.linkedin.com/in/jason-wei-5a7323b0/">Jason Wei</a></strong>, OpenAI scientist behind GPT-3</p></li><li><p><strong><a href="https://www.linkedin.com/in/jain-arvind/">Arvind Jain</a></strong>, CEO of Glean ($7.2B valuation)</p></li><li><p><strong><a href="https://www.linkedin.com/in/edward-oates-47b064160/">Edward Oates</a></strong>, Co-Founder of Oracle</p></li><li><p><strong><a href="https://www.linkedin.com/in/soumyabatra/">Soumya Batra</a></strong>, core author of Llama 2 &amp; 3</p></li><li><p><strong><a href="https://www.linkedin.com/in/richardsocher/">Richard Socher</a></strong>, CEO of You.com ($900M valuation)</p></li></ul><p>And that's just scratching the surface. From startup founders pushing the boundaries of what's possible to leaders at tech giants like Microsoft, Google, and Meta, the collective brainpower at this event is staggering.</p><h2>For startups, this is your chance to shine on a global stage. 
</h2><p>The Founder's Pitch Package offers the opportunity to present your vision to a room full of top-tier investors and decision-makers.</p><p><strong>&#128226; Do you want to pitch your startup on stage?</strong></p><p>Apply here:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://docs.google.com/forms/d/e/1FAIpQLSe0n_Kbe_D5x3hs4zAmU28rCzwgLSn7OLzMTLf2EYkg1Mkfdg/viewform?usp=header&quot;,&quot;text&quot;:&quot;Apply here!&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://docs.google.com/forms/d/e/1FAIpQLSe0n_Kbe_D5x3hs4zAmU28rCzwgLSn7OLzMTLf2EYkg1Mkfdg/viewform?usp=header"><span>Apply here!</span></a></p><p>(Reminder: purchase the FOUNDER'S PITCH PACKAGE at the same time.)</p><h2>&#128200; Are you a VC attending the event?</h2><p>If you'd like curated intros, private sessions, or exclusive access, fill out this form: </p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://docs.google.com/forms/d/e/1FAIpQLSeWbXJASJvgGbCKNTThyC_ZYMOdcaScUECXfOlnbvgI7ThGmg/viewform&quot;,&quot;text&quot;:&quot;For VCs&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://docs.google.com/forms/d/e/1FAIpQLSeWbXJASJvgGbCKNTThyC_ZYMOdcaScUECXfOlnbvgI7ThGmg/viewform"><span>For VCs</span></a></p><h2>&#128218; Still deciding which day(s) to attend? 
Here are more reasons:</h2><h4>100+ Sessions Covering the Future of AI</h4><p>With over 100 tactical sessions, you'll gain practical insights into:</p><ul><li><p>Advanced agent frameworks</p></li><li><p>AI monetization strategies</p></li><li><p>Ethical considerations in AI development</p></li><li><p>Industry-specific AI applications</p></li><li><p>The intersection of AI and quantum computing</p></li></ul><p>Whether you're a developer, entrepreneur, investor, or simply AI-curious, there's something here to expand your horizons and spark new ideas.</p><h4>Full Speaker Lineup (One-Stop Guide):</h4><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.linkedin.com/pulse/your-one-stop-guide-genai-week-2025-speakers-linkedin-sophie-ren-afevc/?trackingId=DmRdpZcKBVNGDZP3gTZKvg%3D%3D&quot;,&quot;text&quot;:&quot;All speakers listed here:&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.linkedin.com/pulse/your-one-stop-guide-genai-week-2025-speakers-linkedin-sophie-ren-afevc/?trackingId=DmRdpZcKBVNGDZP3gTZKvg%3D%3D"><span>All speakers listed here:</span></a></p><h2>Coolest Demo: Alef Flying Cars</h2><div id="youtube2-GB3Vz-mdzYw" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;GB3Vz-mdzYw&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/GB3Vz-mdzYw?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><h2>Global Networking on an Unparalleled Scale</h2><blockquote><p><em>"With attendees from over 30 countries, it'll be buzzing with energy, ideas, and opportunities." 
- Jimmy, Cofounder of GPTDAO</em></p></blockquote><p>The value of connections made at events like this cannot be overstated. GenAI Week 2025 brings together a truly global community of AI enthusiasts, practitioners, and visionaries. The potential for collaboration, investment, and career advancement is immense.</p><h2>Exclusive Discount for Our Readers</h2><p>We're excited to offer our TwoSetAI audience an exclusive 15% discount on tickets. Use coupon code TWOSETAI when registering through these links:</p><ul><li><p>Luma: <a href="https://lu.ma/genaiweek2025?coupon=TWOSETAI">https://lu.ma/genaiweek2025?coupon=TWOSETAI</a></p></li><li><p>Eventbrite: <a href="https://app.oscr.tech/www.eventbrite.com/e/1301648458579/?discount=TWOSETAI">www.eventbrite.com/e/1301648458579/?discount=TWOSETAI</a></p></li><li><p>QR codes for easy registration:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uFHX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279c3fe6-aa12-4af8-a144-13b22b09c4ce_1600x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uFHX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279c3fe6-aa12-4af8-a144-13b22b09c4ce_1600x1600.png 424w, https://substackcdn.com/image/fetch/$s_!uFHX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279c3fe6-aa12-4af8-a144-13b22b09c4ce_1600x1600.png 848w, https://substackcdn.com/image/fetch/$s_!uFHX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279c3fe6-aa12-4af8-a144-13b22b09c4ce_1600x1600.png 1272w, 
https://substackcdn.com/image/fetch/$s_!uFHX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279c3fe6-aa12-4af8-a144-13b22b09c4ce_1600x1600.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uFHX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279c3fe6-aa12-4af8-a144-13b22b09c4ce_1600x1600.png" width="1456" height="1456" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/279c3fe6-aa12-4af8-a144-13b22b09c4ce_1600x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:260899,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/167655184?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279c3fe6-aa12-4af8-a144-13b22b09c4ce_1600x1600.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uFHX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279c3fe6-aa12-4af8-a144-13b22b09c4ce_1600x1600.png 424w, https://substackcdn.com/image/fetch/$s_!uFHX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279c3fe6-aa12-4af8-a144-13b22b09c4ce_1600x1600.png 848w, 
https://substackcdn.com/image/fetch/$s_!uFHX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279c3fe6-aa12-4af8-a144-13b22b09c4ce_1600x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!uFHX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F279c3fe6-aa12-4af8-a144-13b22b09c4ce_1600x1600.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></li></ul><h1>Limited spot free ticket!</h1><p>For our <strong><a href="https://mlnotes.substack.com/">MLNotes</a></strong> and 
<strong><a href="https://www.youtube.com/@TwoSetAI">TwoSetAI</a></strong> audience, we have several free tickets available. For a chance at a free ticket, email angelina@oscr.tech before 6pm PST on 7/6. Limited quantities available, so act fast! </p><h2>Hope to see you there!</h2><p><a href="https://lu.ma/genaiweek2025?coupon=TWOSETAI">Claim your spot now with our exclusive discount!</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5ZMZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F605efb1b-78e5-4183-aad0-15e9492b8d1e_480x270.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5ZMZ!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F605efb1b-78e5-4183-aad0-15e9492b8d1e_480x270.gif 424w, https://substackcdn.com/image/fetch/$s_!5ZMZ!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F605efb1b-78e5-4183-aad0-15e9492b8d1e_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!5ZMZ!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F605efb1b-78e5-4183-aad0-15e9492b8d1e_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!5ZMZ!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F605efb1b-78e5-4183-aad0-15e9492b8d1e_480x270.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5ZMZ!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F605efb1b-78e5-4183-aad0-15e9492b8d1e_480x270.gif" width="480" height="270" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/605efb1b-78e5-4183-aad0-15e9492b8d1e_480x270.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:270,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5ZMZ!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F605efb1b-78e5-4183-aad0-15e9492b8d1e_480x270.gif 424w, https://substackcdn.com/image/fetch/$s_!5ZMZ!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F605efb1b-78e5-4183-aad0-15e9492b8d1e_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!5ZMZ!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F605efb1b-78e5-4183-aad0-15e9492b8d1e_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!5ZMZ!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F605efb1b-78e5-4183-aad0-15e9492b8d1e_480x270.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 
2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><p>&#128736;&#65039;&#10024; Happy practicing and happy building! &#128640;&#127775; </p><p>Thanks for reading our newsletter. 
You can follow us here: Angelina on <a href="https://www.linkedin.com/in/meetangelina/">LinkedIn</a> or <a href="https://twitter.com/angelina_magr">Twitter</a>, and Mehdi on <a href="https://www.linkedin.com/in/mehdiallahyari/">LinkedIn</a> or <a href="https://twitter.com/MehdiAllahyari">Twitter</a>.</p><p>Source of image: <br>SWR: https://milvus.io/docs/v2.5.x/assets/advanced_rag/sentence_window.png</p><p>&#127752; Our RAG course: https://maven.com/angelina-yang/mastering-rag-systems-a-hands-on-guide-to-production-ready-ai</p><p>&#128218; If you'd like to learn more about RAG systems, check out our book, which you can download for free on the course site:<br><a href="https://maven.com/angelina-yang/mastering-rag-systems-a-hands-on-guide-to-production-ready-ai">https://maven.com/angelina-yang/mastering-rag-systems-a-hands-on-guide-to-production-ready-ai</a></p><p>&#129412; Is there any specific content you would like to learn from us? 
Sign up here: https://noteforms.com/forms/twosetai-youtube-content-sqezrz </p><p>&#129520; Our video editing tool is this one!: https://get.descript.com/nf5cum9nj1m8 </p><p>&#128253;&#65039; Our RAG videos: <a href="https://www.youtube.com/@TwoSetAI">https://www.youtube.com/@TwoSetAI</a></p><p>&#128236; Don't miss out on the latest updates - Subscribe to our newsletter: </p><div class="embedded-publication-wrap" data-attrs="{&quot;id&quot;:857742,&quot;name&quot;:&quot;The MLnotes Newsletter&quot;,&quot;logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F88ea2641-8482-4fd4-95b6-f0d56d807c5c_1280x1280.png&quot;,&quot;base_url&quot;:&quot;https://mlnotes.substack.com&quot;,&quot;hero_text&quot;:&quot;MLnotes delivers bite-sized content covering diverse aspects of AI and ML, from real-world applications to entrepreneurship and careers in AI. Together, we can foster knowledge sharing and alleviate information anxiety amidst the rapid advancements in AI!&quot;,&quot;author_name&quot;:&quot;Mehdi Allahyari&quot;,&quot;show_subscribe&quot;:true,&quot;logo_bg_color&quot;:&quot;#ffffff&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPublicationToDOMWithSubscribe"><div class="embedded-publication show-subscribe"><a class="embedded-publication-link-part" native="true" href="https://mlnotes.substack.com?utm_source=substack&amp;utm_campaign=publication_embed&amp;utm_medium=web"><img class="embedded-publication-logo" src="https://substackcdn.com/image/fetch/$s_!Kq64!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F88ea2641-8482-4fd4-95b6-f0d56d807c5c_1280x1280.png" width="56" height="56" style="background-color: rgb(255, 255, 255);"><span class="embedded-publication-name">The MLnotes Newsletter</span><div 
class="embedded-publication-hero-text">MLnotes delivers bite-sized content covering diverse aspects of AI and ML, from real-world applications to entrepreneurship and careers in AI. Together, we can foster knowledge sharing and alleviate information anxiety amidst the rapid advancements in AI!</div><div class="embedded-publication-author-name">By Mehdi Allahyari</div></a><form class="embedded-publication-subscribe" method="GET" action="https://mlnotes.substack.com/subscribe?"><input type="hidden" name="source" value="publication-embed"><input type="hidden" name="autoSubmit" value="true"><input type="email" class="email-input" name="email" placeholder="Type your email..."><input type="submit" class="button primary" value="Subscribe"></form></div></div><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Why SEO is Not Dead? 🤔]]></title><description><![CDATA[Insights from SEO Expert Sam Dunning]]></description><link>https://mlnotes.substack.com/p/why-seo-is-not-dead</link><guid isPermaLink="false">https://mlnotes.substack.com/p/why-seo-is-not-dead</guid><dc:creator><![CDATA[Angelina Yang]]></dc:creator><pubDate>Tue, 01 Jul 2025 01:30:22 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kXCs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd9a86f9-74e1-4871-bf81-7457f13199b3_3375x2750.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I recently attended RB2B&#8217;s community learning session featuring SEO expert <a href="https://www.linkedin.com/in/samdunning/">Sam Dunning</a>, which really blew my mind. While everyone is saying that SEO is dead (an ongoing topic for the past 2 years!), Sam made the case for what SEO is still worth today and shared his playbook for adapting your SEO to the age of AI. So I reorganized my notes and am sharing some of his insights here. 
</p><p>(He shared a backlink/SEO hack that he personally uses, which is real outside-the-box thinking. Check it out below!) &#128071;<br>(He also shared deep insights on whether SEO is for you OR NOT! See below!) &#128071;</p><h2>How&#8217;s SEO evolving in the age of AI?</h2><p>"<em>I think there's a few factors. One we can get into is building a foundation about actually thinking about starting to rank on LLMs,</em>" says Sam Dunning, founder of Breaking B2B. </p><p>This statement captures the current state of SEO - </p><h4>it's not dying, it's evolving.</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kXCs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd9a86f9-74e1-4871-bf81-7457f13199b3_3375x2750.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kXCs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd9a86f9-74e1-4871-bf81-7457f13199b3_3375x2750.png 424w, https://substackcdn.com/image/fetch/$s_!kXCs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd9a86f9-74e1-4871-bf81-7457f13199b3_3375x2750.png 848w, https://substackcdn.com/image/fetch/$s_!kXCs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd9a86f9-74e1-4871-bf81-7457f13199b3_3375x2750.png 1272w, https://substackcdn.com/image/fetch/$s_!kXCs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd9a86f9-74e1-4871-bf81-7457f13199b3_3375x2750.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!kXCs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd9a86f9-74e1-4871-bf81-7457f13199b3_3375x2750.png" width="1456" height="1186" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dd9a86f9-74e1-4871-bf81-7457f13199b3_3375x2750.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1186,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Is SEO Dead? Not Yet, But Search Is Changing&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Is SEO Dead? Not Yet, But Search Is Changing" title="Is SEO Dead? Not Yet, But Search Is Changing" srcset="https://substackcdn.com/image/fetch/$s_!kXCs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd9a86f9-74e1-4871-bf81-7457f13199b3_3375x2750.png 424w, https://substackcdn.com/image/fetch/$s_!kXCs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd9a86f9-74e1-4871-bf81-7457f13199b3_3375x2750.png 848w, https://substackcdn.com/image/fetch/$s_!kXCs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd9a86f9-74e1-4871-bf81-7457f13199b3_3375x2750.png 1272w, https://substackcdn.com/image/fetch/$s_!kXCs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdd9a86f9-74e1-4871-bf81-7457f13199b3_3375x2750.png 1456w" sizes="100vw" 
fetchpriority="high"></picture></div></a><figcaption class="image-caption"><a href="https://images.app.goo.gl/VfjjEVYwtDGcaLU77">Source</a></figcaption></figure></div><p>The rise of AI and Large Language Models (LLMs) has undoubtedly shaken up the world of search engine optimization. Some have even proclaimed the death of SEO as we know it. </p><p>However, the reality is far more nuanced. 
</p><p>SEO isn't dead; it's undergoing a transformation, adapting to the new landscape of AI-driven search.</p><p>Sam's insights reveal that while traditional SEO metrics like <strong>traffic</strong> are changing, with some clients seeing up to a <strong>20% decline in traffic</strong> from traditional search, new opportunities are emerging. </p><p>The key is to understand these changes and adapt our strategies accordingly.</p><h2>Google <strong>remains</strong> the undisputed king of search.</h2><p>Despite the buzz around AI search tools, Google still dominates. Sam points out, </p><div class="pullquote"><p>"Google, [in the] latest report, still gets 373 times more traffic than ChatGPT. And even if you combine all the AI search tools together, I think it's like 2 to 3% of the search market."</p></div><p>This statistic is crucial for SEO professionals and businesses alike. While it's important to prepare for the future of AI-driven search, we can't ignore the present reality. </p><p>Google is still the primary source of organic traffic for most websites, and optimizing for Google's algorithms remains a critical part of any comprehensive SEO strategy.</p><h2>Adapting SEO Strategies for AI-Driven Search</h2><p>While Google still dominates, the rise of AI search tools like ChatGPT can't be ignored. </p><p>Sam advises, </p><blockquote><p><em>"It's about building that foundation for LLMs. So not disregarding Google. 
Still need to work on it because the chances are when they fully roll out Google AI Mode in a year or so [it&#8217;s already out!], it would likely take a lot of those ranking factors across, but build up a foundation to start ranking on ChatGPT and similar as well so you don't get kind of caught behind."</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ovwg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3088bfc-31b7-4bb6-bf95-3bb2ec0b6971_1269x567.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ovwg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3088bfc-31b7-4bb6-bf95-3bb2ec0b6971_1269x567.png 424w, https://substackcdn.com/image/fetch/$s_!Ovwg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3088bfc-31b7-4bb6-bf95-3bb2ec0b6971_1269x567.png 848w, https://substackcdn.com/image/fetch/$s_!Ovwg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3088bfc-31b7-4bb6-bf95-3bb2ec0b6971_1269x567.png 1272w, https://substackcdn.com/image/fetch/$s_!Ovwg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3088bfc-31b7-4bb6-bf95-3bb2ec0b6971_1269x567.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ovwg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3088bfc-31b7-4bb6-bf95-3bb2ec0b6971_1269x567.png" width="1269" height="567" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e3088bfc-31b7-4bb6-bf95-3bb2ec0b6971_1269x567.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:567,&quot;width&quot;:1269,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:185372,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/167210527?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3088bfc-31b7-4bb6-bf95-3bb2ec0b6971_1269x567.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ovwg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3088bfc-31b7-4bb6-bf95-3bb2ec0b6971_1269x567.png 424w, https://substackcdn.com/image/fetch/$s_!Ovwg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3088bfc-31b7-4bb6-bf95-3bb2ec0b6971_1269x567.png 848w, https://substackcdn.com/image/fetch/$s_!Ovwg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3088bfc-31b7-4bb6-bf95-3bb2ec0b6971_1269x567.png 1272w, https://substackcdn.com/image/fetch/$s_!Ovwg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3088bfc-31b7-4bb6-bf95-3bb2ec0b6971_1269x567.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This dual approach - optimizing for both traditional search engines and AI-powered tools - is likely to become the new norm in SEO.</p><p>It requires a deeper understanding of how AI interprets and ranks content, as well as a willingness to experiment with new strategies.</p><h2>Understanding your customer before content creation is key.</h2><p>One aspect of SEO that remains unchanged, regardless of the technology driving search, is the importance of <strong>understanding your audience.</strong> This has never changed!</p><p>Sam emphasizes, </p><blockquote><p><em>"If the content you're putting out just ranks, but it doesn't actually resonate with your dream clients' expensive problems, motivations, jobs to be done, address their common questions, position your product as a painkiller, and show 'why you', it's almost a waste of 
time."</em></p></blockquote><p>This insight tells me that thorough <strong>customer research</strong> should be a top priority in your SEO strategy. </p><p>It's not just about ranking for keywords; it's about creating content that truly resonates with your target audience's needs, pain points, and motivations. </p><p>This customer-centric approach not only helps with SEO but can also improve conversion rates and overall marketing effectiveness.</p><h2>Technical SEO? Do your users understand your content?</h2><p>While technical SEO elements remain important, Sam's approach emphasizes the need to <strong>balance</strong> these with user-focused content. He states, </p><blockquote><p><em>"So many SEOs talk about the technical things, the step-by-step processes, but they forget that we're actually selling complex solutions to prospects that have jobs to be done, struggling moments, et cetera."</em></p></blockquote><p>This balance is crucial in the evolving SEO landscape. Technical optimizations help search engines understand and rank your content, but it's the user-focused elements that keep visitors engaged and encourage conversions. </p><p>As AI becomes more sophisticated in understanding user intent, this balance will likely become even more important.</p><h2>Backlinks and brand mentions still work&#8230;</h2><p>Despite the changes in search technology, some fundamental SEO principles remain crucial. Sam highlights the continued importance of backlinks and brand mentions, especially in the context of AI search. </p><p>He notes, </p><blockquote><p><em>"They evaluated 75,000 brands. 
And that evaluation, it showed that the ones that are ranking the most consistently on AI overviews were getting consistent brand mentions on external websites."</em></p></blockquote><p>This insight suggests that while the mechanisms of search are changing, the underlying principles of authority and relevance - as demonstrated through backlinks and brand mentions - continue to play a significant role. </p><p>It's a reminder that SEO isn't just about on-page optimization, but also about building a strong presence across the web.</p><h2>Strategies for Ranking in Both Traditional and AI-Driven Search</h2><p>Sam provides several strategies for optimizing content for both traditional and AI-driven search:</p>
      <p>
          <a href="https://mlnotes.substack.com/p/why-seo-is-not-dead">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[The Hottest New Programming Language is - English]]></title><description><![CDATA[Insights from Andrej Karpathy's recent keynote]]></description><link>https://mlnotes.substack.com/p/the-hottest-new-programming-language</link><guid isPermaLink="false">https://mlnotes.substack.com/p/the-hottest-new-programming-language</guid><pubDate>Wed, 25 Jun 2025 01:08:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!bMjN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a80d202-3833-471f-8e7d-d482f6bc0733_1178x663.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p><em>"Software is changing again, and I think it's changing quite fundamentally. I think roughly speaking, software has not changed much on such a fundamental level for 70 years, and then it's changed I think about twice quite rapidly in the last few years." - Andrej Karpathy</em></p></blockquote><p>These words from Andrej Karpathy, former director of AI at Tesla, set the stage for a profound transformation in the world of software development. His recent keynote talk at the Startup School in San Francisco is super inspiring. </p><p>As AI engineers, data scientists, and tech professionals, we find ourselves at the forefront of this revolution. But with great power comes great responsibility, and the specter of AI risk looms large over our exciting new frontier.</p><h2>The Evolution of Software: From 1.0 to 3.0 </h2><h3>Software 1.0: The Traditional Paradigm We All Know</h3><p>For decades, software development meant writing explicit instructions for computers in languages like C++, Python, or Java. Karpathy refers to this as "Software 1.0" &#8211; the foundation of our digital world.</p><h3>Software 2.0: The Rise of Neural Networks</h3><p>The advent of deep learning ushered in the era of "Software 2.0." 
Instead of writing explicit code, we began training neural networks, effectively programming through data and optimization algorithms. This shift marked a significant departure from traditional software development practices.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bMjN!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a80d202-3833-471f-8e7d-d482f6bc0733_1178x663.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bMjN!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a80d202-3833-471f-8e7d-d482f6bc0733_1178x663.png 424w, https://substackcdn.com/image/fetch/$s_!bMjN!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a80d202-3833-471f-8e7d-d482f6bc0733_1178x663.png 848w, https://substackcdn.com/image/fetch/$s_!bMjN!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a80d202-3833-471f-8e7d-d482f6bc0733_1178x663.png 1272w, https://substackcdn.com/image/fetch/$s_!bMjN!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a80d202-3833-471f-8e7d-d482f6bc0733_1178x663.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bMjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a80d202-3833-471f-8e7d-d482f6bc0733_1178x663.png" width="1178" height="663" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0a80d202-3833-471f-8e7d-d482f6bc0733_1178x663.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:663,&quot;width&quot;:1178,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:350165,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/166746816?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a80d202-3833-471f-8e7d-d482f6bc0733_1178x663.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bMjN!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a80d202-3833-471f-8e7d-d482f6bc0733_1178x663.png 424w, https://substackcdn.com/image/fetch/$s_!bMjN!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a80d202-3833-471f-8e7d-d482f6bc0733_1178x663.png 848w, https://substackcdn.com/image/fetch/$s_!bMjN!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a80d202-3833-471f-8e7d-d482f6bc0733_1178x663.png 1272w, https://substackcdn.com/image/fetch/$s_!bMjN!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0a80d202-3833-471f-8e7d-d482f6bc0733_1178x663.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Source: Andrej Karpathy</figcaption></figure></div><h3>Software 3.0: Programming in Natural Language</h3><p>Now, we stand at the precipice of "Software 3.0" &#8211; a paradigm where we program Large Language Models (LLMs) using natural language prompts. 
This revolutionary approach democratizes programming, but it also introduces new challenges and potential AI risks that we must carefully navigate.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!m-yX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917f4bba-5870-48d2-aa8c-4ceeeb6664fb_1179x666.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!m-yX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917f4bba-5870-48d2-aa8c-4ceeeb6664fb_1179x666.png 424w, https://substackcdn.com/image/fetch/$s_!m-yX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917f4bba-5870-48d2-aa8c-4ceeeb6664fb_1179x666.png 848w, https://substackcdn.com/image/fetch/$s_!m-yX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917f4bba-5870-48d2-aa8c-4ceeeb6664fb_1179x666.png 1272w, https://substackcdn.com/image/fetch/$s_!m-yX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917f4bba-5870-48d2-aa8c-4ceeeb6664fb_1179x666.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!m-yX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917f4bba-5870-48d2-aa8c-4ceeeb6664fb_1179x666.png" width="1179" height="666" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/917f4bba-5870-48d2-aa8c-4ceeeb6664fb_1179x666.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:666,&quot;width&quot;:1179,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:195058,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/166746816?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917f4bba-5870-48d2-aa8c-4ceeeb6664fb_1179x666.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!m-yX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917f4bba-5870-48d2-aa8c-4ceeeb6664fb_1179x666.png 424w, https://substackcdn.com/image/fetch/$s_!m-yX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917f4bba-5870-48d2-aa8c-4ceeeb6664fb_1179x666.png 848w, https://substackcdn.com/image/fetch/$s_!m-yX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917f4bba-5870-48d2-aa8c-4ceeeb6664fb_1179x666.png 1272w, https://substackcdn.com/image/fetch/$s_!m-yX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F917f4bba-5870-48d2-aa8c-4ceeeb6664fb_1179x666.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Source: Andrej Karpathy</figcaption></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!50a1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff13cffd1-d390-4593-b7c2-76c5b61e7db4_1177x662.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!50a1!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff13cffd1-d390-4593-b7c2-76c5b61e7db4_1177x662.png 424w, 
https://substackcdn.com/image/fetch/$s_!50a1!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff13cffd1-d390-4593-b7c2-76c5b61e7db4_1177x662.png 848w, https://substackcdn.com/image/fetch/$s_!50a1!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff13cffd1-d390-4593-b7c2-76c5b61e7db4_1177x662.png 1272w, https://substackcdn.com/image/fetch/$s_!50a1!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff13cffd1-d390-4593-b7c2-76c5b61e7db4_1177x662.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!50a1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff13cffd1-d390-4593-b7c2-76c5b61e7db4_1177x662.png" width="1177" height="662" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f13cffd1-d390-4593-b7c2-76c5b61e7db4_1177x662.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:662,&quot;width&quot;:1177,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:170043,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/166746816?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff13cffd1-d390-4593-b7c2-76c5b61e7db4_1177x662.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!50a1!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff13cffd1-d390-4593-b7c2-76c5b61e7db4_1177x662.png 424w, https://substackcdn.com/image/fetch/$s_!50a1!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff13cffd1-d390-4593-b7c2-76c5b61e7db4_1177x662.png 848w, https://substackcdn.com/image/fetch/$s_!50a1!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff13cffd1-d390-4593-b7c2-76c5b61e7db4_1177x662.png 1272w, https://substackcdn.com/image/fetch/$s_!50a1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff13cffd1-d390-4593-b7c2-76c5b61e7db4_1177x662.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: Andrej K</figcaption></figure></div><h2>LLMs: The New Operating Systems</h2><h3>Why LLMs Are More Than Just Another Tool</h3><p>Karpathy compares LLMs not only to operating systems but also to utilities and fabs:</p><blockquote><p><em>"LLMs don't only have properties of utilities. I think it's also fair to say that they have some properties of fabs<a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-1" href="#footnote-1" target="_self">1</a>, and the reason for this is that the capex</em><a class="footnote-anchor" data-component-name="FootnoteAnchorToDOM" id="footnote-anchor-2" href="#footnote-2" target="_self">2</a><em> required for building LLM is actually quite large."</em></p></blockquote><p>This perspective shifts our understanding of LLMs from mere tools to fundamental infrastructure. Just as operating systems provide a platform for applications, LLMs are becoming the foundation for a new generation of AI-powered software.</p><h3>The Utility-Like Nature of LLM Providers</h3><div class="pullquote"><p>AI is the new electricity. - Andrew Ng</p></div><p>LLM providers like OpenAI, Google (with Gemini), and Anthropic are emerging as utility-like entities. They invest heavily in infrastructure (akin to power plants) and offer metered access to their intelligence via APIs. 
This utility model introduces new considerations for AI safety and regulation, as we become increasingly dependent on these "intelligence grids."</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sMJg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815e3fa8-7dd4-4760-a539-2f143e22c0e9_480x250.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sMJg!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815e3fa8-7dd4-4760-a539-2f143e22c0e9_480x250.gif 424w, https://substackcdn.com/image/fetch/$s_!sMJg!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815e3fa8-7dd4-4760-a539-2f143e22c0e9_480x250.gif 848w, https://substackcdn.com/image/fetch/$s_!sMJg!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815e3fa8-7dd4-4760-a539-2f143e22c0e9_480x250.gif 1272w, https://substackcdn.com/image/fetch/$s_!sMJg!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815e3fa8-7dd4-4760-a539-2f143e22c0e9_480x250.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sMJg!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815e3fa8-7dd4-4760-a539-2f143e22c0e9_480x250.gif" width="480" height="250" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/815e3fa8-7dd4-4760-a539-2f143e22c0e9_480x250.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:250,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sMJg!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815e3fa8-7dd4-4760-a539-2f143e22c0e9_480x250.gif 424w, https://substackcdn.com/image/fetch/$s_!sMJg!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815e3fa8-7dd4-4760-a539-2f143e22c0e9_480x250.gif 848w, https://substackcdn.com/image/fetch/$s_!sMJg!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815e3fa8-7dd4-4760-a539-2f143e22c0e9_480x250.gif 1272w, https://substackcdn.com/image/fetch/$s_!sMJg!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F815e3fa8-7dd4-4760-a539-2f143e22c0e9_480x250.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 
2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>The Psychology of LLMs: Understanding Our New Digital Colleagues</h2><h3>LLMs as "People Spirits"</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jp8s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7aaeb5-9624-4041-8819-1888bbee66eb_1172x658.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jp8s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7aaeb5-9624-4041-8819-1888bbee66eb_1172x658.png 424w, https://substackcdn.com/image/fetch/$s_!jp8s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7aaeb5-9624-4041-8819-1888bbee66eb_1172x658.png 848w, 
https://substackcdn.com/image/fetch/$s_!jp8s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7aaeb5-9624-4041-8819-1888bbee66eb_1172x658.png 1272w, https://substackcdn.com/image/fetch/$s_!jp8s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7aaeb5-9624-4041-8819-1888bbee66eb_1172x658.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jp8s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7aaeb5-9624-4041-8819-1888bbee66eb_1172x658.png" width="1172" height="658" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6b7aaeb5-9624-4041-8819-1888bbee66eb_1172x658.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:658,&quot;width&quot;:1172,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:773127,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/166746816?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7aaeb5-9624-4041-8819-1888bbee66eb_1172x658.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jp8s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7aaeb5-9624-4041-8819-1888bbee66eb_1172x658.png 424w, 
https://substackcdn.com/image/fetch/$s_!jp8s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7aaeb5-9624-4041-8819-1888bbee66eb_1172x658.png 848w, https://substackcdn.com/image/fetch/$s_!jp8s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7aaeb5-9624-4041-8819-1888bbee66eb_1172x658.png 1272w, https://substackcdn.com/image/fetch/$s_!jp8s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7aaeb5-9624-4041-8819-1888bbee66eb_1172x658.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: Andrej K</figcaption></figure></div><p>Karpathy introduces a fascinating concept: <strong>LLMs as "people spirits"</strong> &#8211; stochastic simulations of human-like intelligence. This anthropomorphic view helps us understand both the strengths and limitations of these systems.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lqSL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8118ba3-9454-40e0-b91d-04679eab921b_324x200.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lqSL!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8118ba3-9454-40e0-b91d-04679eab921b_324x200.gif 424w, https://substackcdn.com/image/fetch/$s_!lqSL!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8118ba3-9454-40e0-b91d-04679eab921b_324x200.gif 848w, https://substackcdn.com/image/fetch/$s_!lqSL!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8118ba3-9454-40e0-b91d-04679eab921b_324x200.gif 1272w, https://substackcdn.com/image/fetch/$s_!lqSL!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8118ba3-9454-40e0-b91d-04679eab921b_324x200.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lqSL!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8118ba3-9454-40e0-b91d-04679eab921b_324x200.gif" width="324" height="200" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c8118ba3-9454-40e0-b91d-04679eab921b_324x200.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:200,&quot;width&quot;:324,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Stan Marsh Ai GIF by South Park&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Stan Marsh Ai GIF by South Park" title="Stan Marsh Ai GIF by South Park" srcset="https://substackcdn.com/image/fetch/$s_!lqSL!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8118ba3-9454-40e0-b91d-04679eab921b_324x200.gif 424w, https://substackcdn.com/image/fetch/$s_!lqSL!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8118ba3-9454-40e0-b91d-04679eab921b_324x200.gif 848w, https://substackcdn.com/image/fetch/$s_!lqSL!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8118ba3-9454-40e0-b91d-04679eab921b_324x200.gif 1272w, https://substackcdn.com/image/fetch/$s_!lqSL!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8118ba3-9454-40e0-b91d-04679eab921b_324x200.gif 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><h3>Cognitive Quirks and Limitations</h3><p>While LLMs possess superhuman capabilities in certain areas, they also exhibit cognitive deficits:</p><ul><li><p>Hallucinations and confabulation</p></li><li><p>Jagged intelligence (excelling in some areas while failing at simple tasks)</p></li><li><p>Lack of persistent 
memory</p></li></ul><p>Understanding these limitations is crucial for mitigating AI risk and designing effective human-AI collaboration systems.</p><h2>Designing LLM Apps with Partial Autonomy </h2><h3>The Autonomy Slider: Finding the Right Balance</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4kTQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59ddf2ec-921c-4536-b2f3-5020da1bec24_1174x663.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4kTQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59ddf2ec-921c-4536-b2f3-5020da1bec24_1174x663.png 424w, https://substackcdn.com/image/fetch/$s_!4kTQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59ddf2ec-921c-4536-b2f3-5020da1bec24_1174x663.png 848w, https://substackcdn.com/image/fetch/$s_!4kTQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59ddf2ec-921c-4536-b2f3-5020da1bec24_1174x663.png 1272w, https://substackcdn.com/image/fetch/$s_!4kTQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59ddf2ec-921c-4536-b2f3-5020da1bec24_1174x663.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4kTQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59ddf2ec-921c-4536-b2f3-5020da1bec24_1174x663.png" width="1174" height="663" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/59ddf2ec-921c-4536-b2f3-5020da1bec24_1174x663.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:663,&quot;width&quot;:1174,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:267597,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/166746816?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59ddf2ec-921c-4536-b2f3-5020da1bec24_1174x663.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4kTQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59ddf2ec-921c-4536-b2f3-5020da1bec24_1174x663.png 424w, https://substackcdn.com/image/fetch/$s_!4kTQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59ddf2ec-921c-4536-b2f3-5020da1bec24_1174x663.png 848w, https://substackcdn.com/image/fetch/$s_!4kTQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59ddf2ec-921c-4536-b2f3-5020da1bec24_1174x663.png 1272w, https://substackcdn.com/image/fetch/$s_!4kTQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59ddf2ec-921c-4536-b2f3-5020da1bec24_1174x663.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Karpathy introduces the concept of an <strong>"autonomy slider"</strong> in LLM applications. This allows users to control the level of AI involvement, from simple autocomplete to full-fledged autonomous agents. 
Striking the right balance is key to maximizing productivity while maintaining human oversight.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OCwe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9770e44d-af2a-4120-91f6-14d6409f5000_480x480.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OCwe!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9770e44d-af2a-4120-91f6-14d6409f5000_480x480.gif 424w, https://substackcdn.com/image/fetch/$s_!OCwe!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9770e44d-af2a-4120-91f6-14d6409f5000_480x480.gif 848w, https://substackcdn.com/image/fetch/$s_!OCwe!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9770e44d-af2a-4120-91f6-14d6409f5000_480x480.gif 1272w, https://substackcdn.com/image/fetch/$s_!OCwe!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9770e44d-af2a-4120-91f6-14d6409f5000_480x480.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OCwe!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9770e44d-af2a-4120-91f6-14d6409f5000_480x480.gif" width="480" height="480" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9770e44d-af2a-4120-91f6-14d6409f5000_480x480.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:480,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OCwe!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9770e44d-af2a-4120-91f6-14d6409f5000_480x480.gif 424w, https://substackcdn.com/image/fetch/$s_!OCwe!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9770e44d-af2a-4120-91f6-14d6409f5000_480x480.gif 848w, https://substackcdn.com/image/fetch/$s_!OCwe!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9770e44d-af2a-4120-91f6-14d6409f5000_480x480.gif 1272w, https://substackcdn.com/image/fetch/$s_!OCwe!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9770e44d-af2a-4120-91f6-14d6409f5000_480x480.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 
2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Modeling AI-Assisted Coding After Cursor</h3><p>The Cursor app exemplifies effective LLM integration in software development. It combines:</p><ol><li><p>Context management</p></li><li><p>Orchestration of multiple LLM calls</p></li><li><p>Application-specific GUI for easy human auditing</p></li><li><p>Flexible autonomy levels</p></li></ol><p>By studying successful applications like Cursor, we can derive best practices for designing LLM-powered tools that enhance productivity without compromising AI safety.</p><h2>Human-AI Collaboration Loops Are Essential</h2><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!9qIH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5525f44-afad-45fc-90e0-431dd66794f1_480x270.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!9qIH!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5525f44-afad-45fc-90e0-431dd66794f1_480x270.gif 424w, https://substackcdn.com/image/fetch/$s_!9qIH!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5525f44-afad-45fc-90e0-431dd66794f1_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!9qIH!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5525f44-afad-45fc-90e0-431dd66794f1_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!9qIH!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5525f44-afad-45fc-90e0-431dd66794f1_480x270.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!9qIH!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5525f44-afad-45fc-90e0-431dd66794f1_480x270.gif" width="480" height="270" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f5525f44-afad-45fc-90e0-431dd66794f1_480x270.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:270,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!9qIH!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5525f44-afad-45fc-90e0-431dd66794f1_480x270.gif 424w, 
https://substackcdn.com/image/fetch/$s_!9qIH!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5525f44-afad-45fc-90e0-431dd66794f1_480x270.gif 848w, https://substackcdn.com/image/fetch/$s_!9qIH!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5525f44-afad-45fc-90e0-431dd66794f1_480x270.gif 1272w, https://substackcdn.com/image/fetch/$s_!9qIH!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff5525f44-afad-45fc-90e0-431dd66794f1_480x270.gif 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" 
y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Speeding Up the Verification Process</h3><p>To truly leverage the power of LLMs, we need to optimize the human-AI collaboration loop. Karpathy emphasizes two key aspects:</p><ol><li><p>Speeding up verification through effective GUIs</p></li><li><p>Keeping AI "on a leash" to prevent overwhelming the human collaborator</p></li></ol><h3>Lessons from Tesla's Autopilot Development</h3><p>Karpathy shares insights from his experience with Tesla's Autopilot:</p><blockquote><p><em>"We did more and more autonomous tasks for the user, and maybe the story that I wanted to tell very briefly is actually the first time I drove a self-driving vehicle was in 2013... This drive was perfect. There were zero interventions, and this was 2013, which is now 12 years ago."</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Gesc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bafd88-7e69-42d8-9595-d779eb229595_1177x663.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Gesc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bafd88-7e69-42d8-9595-d779eb229595_1177x663.png 424w, https://substackcdn.com/image/fetch/$s_!Gesc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bafd88-7e69-42d8-9595-d779eb229595_1177x663.png 848w, https://substackcdn.com/image/fetch/$s_!Gesc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bafd88-7e69-42d8-9595-d779eb229595_1177x663.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Gesc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bafd88-7e69-42d8-9595-d779eb229595_1177x663.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Gesc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bafd88-7e69-42d8-9595-d779eb229595_1177x663.png" width="1177" height="663" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d2bafd88-7e69-42d8-9595-d779eb229595_1177x663.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:663,&quot;width&quot;:1177,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:584633,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/166746816?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bafd88-7e69-42d8-9595-d779eb229595_1177x663.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Gesc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bafd88-7e69-42d8-9595-d779eb229595_1177x663.png 424w, https://substackcdn.com/image/fetch/$s_!Gesc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bafd88-7e69-42d8-9595-d779eb229595_1177x663.png 848w, 
https://substackcdn.com/image/fetch/$s_!Gesc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bafd88-7e69-42d8-9595-d779eb229595_1177x663.png 1272w, https://substackcdn.com/image/fetch/$s_!Gesc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bafd88-7e69-42d8-9595-d779eb229595_1177x663.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: Andrej K</figcaption></figure></div><p>This anecdote illustrates the long journey from impressive demos to 
real-world deployment, highlighting the need for patience and rigorous testing in AI development.</p><h2>Democratizing Programming in the AI Era</h2><h3>Everyone Is Now a Programmer</h3><p>The advent of natural language programming through LLMs has democratized software development. Karpathy coins the term "<strong>vibe coding</strong>" to describe this phenomenon:</p><blockquote><p><em>"Suddenly, everyone is a programmer because everyone speaks natural language like English. This is extremely bullish and very interesting to me and also completely unprecedented."</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QqkX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1157592a-6a58-4e40-864a-9d4bf1485172_736x520.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QqkX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1157592a-6a58-4e40-864a-9d4bf1485172_736x520.png 424w, https://substackcdn.com/image/fetch/$s_!QqkX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1157592a-6a58-4e40-864a-9d4bf1485172_736x520.png 848w, https://substackcdn.com/image/fetch/$s_!QqkX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1157592a-6a58-4e40-864a-9d4bf1485172_736x520.png 1272w, https://substackcdn.com/image/fetch/$s_!QqkX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1157592a-6a58-4e40-864a-9d4bf1485172_736x520.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!QqkX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1157592a-6a58-4e40-864a-9d4bf1485172_736x520.png" width="736" height="520" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1157592a-6a58-4e40-864a-9d4bf1485172_736x520.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:520,&quot;width&quot;:736,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:270196,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/166746816?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1157592a-6a58-4e40-864a-9d4bf1485172_736x520.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QqkX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1157592a-6a58-4e40-864a-9d4bf1485172_736x520.png 424w, https://substackcdn.com/image/fetch/$s_!QqkX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1157592a-6a58-4e40-864a-9d4bf1485172_736x520.png 848w, https://substackcdn.com/image/fetch/$s_!QqkX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1157592a-6a58-4e40-864a-9d4bf1485172_736x520.png 1272w, https://substackcdn.com/image/fetch/$s_!QqkX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1157592a-6a58-4e40-864a-9d4bf1485172_736x520.png 1456w" sizes="100vw" 
loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Building a Custom iOS App Without Swift Knowledge</h3><p>Karpathy shares his experience of creating an iOS app using vibe coding:</p><blockquote><p><em>"I built this iOS app, and I don't actually know how to program in Swift, but I was really shocked that I was able to build like a super basic app... 
This was just like a day of work, and this was running on my phone like later that day."</em></p></blockquote><p>This example demonstrates the transformative potential of LLM-assisted programming, but it also raises questions about job security for traditional developers and the need for new skills in the AI era.</p><h2>Building for Agents: Creating Future-Ready Digital Infrastructure </h2><h3>Making Our Digital World LLM-Friendly</h3><p>As LLMs become more prevalent, we need to adapt our digital infrastructure to be more "LLM-friendly." Karpathy suggests several approaches:</p>
      <p>
          <a href="https://mlnotes.substack.com/p/the-hottest-new-programming-language">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[How to Build a Personalized AI Telegram Newsbot: A Step-by-Step Guide]]></title><description><![CDATA[Let&#8217;s admit it, staying informed can be overwhelming.]]></description><link>https://mlnotes.substack.com/p/how-to-build-a-personalized-ai-telegram</link><guid isPermaLink="false">https://mlnotes.substack.com/p/how-to-build-a-personalized-ai-telegram</guid><dc:creator><![CDATA[Angelina Yang]]></dc:creator><pubDate>Mon, 02 Jun 2025 00:43:32 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!a0LM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65e6a5aa-23cb-4b27-9653-eadafb15228c_480x368.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Let&#8217;s admit it, staying informed can be overwhelming. </p><p>But doing so effectively&#8212;and in a way that suits your preferences&#8230;</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!a0LM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65e6a5aa-23cb-4b27-9653-eadafb15228c_480x368.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!a0LM!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65e6a5aa-23cb-4b27-9653-eadafb15228c_480x368.gif 424w, https://substackcdn.com/image/fetch/$s_!a0LM!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65e6a5aa-23cb-4b27-9653-eadafb15228c_480x368.gif 848w, 
https://substackcdn.com/image/fetch/$s_!a0LM!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65e6a5aa-23cb-4b27-9653-eadafb15228c_480x368.gif 1272w, https://substackcdn.com/image/fetch/$s_!a0LM!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65e6a5aa-23cb-4b27-9653-eadafb15228c_480x368.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!a0LM!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65e6a5aa-23cb-4b27-9653-eadafb15228c_480x368.gif" width="480" height="368" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/65e6a5aa-23cb-4b27-9653-eadafb15228c_480x368.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:368,&quot;width&quot;:480,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!a0LM!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65e6a5aa-23cb-4b27-9653-eadafb15228c_480x368.gif 424w, https://substackcdn.com/image/fetch/$s_!a0LM!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65e6a5aa-23cb-4b27-9653-eadafb15228c_480x368.gif 848w, https://substackcdn.com/image/fetch/$s_!a0LM!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65e6a5aa-23cb-4b27-9653-eadafb15228c_480x368.gif 1272w, 
https://substackcdn.com/image/fetch/$s_!a0LM!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65e6a5aa-23cb-4b27-9653-eadafb15228c_480x368.gif 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>We're bombarded with news from countless sources, often cluttered with ads and irrelevant content. As media theorist <strong><a href="https://en.wikipedia.org/wiki/Clay_Shirky">Clay Shirky</a></strong> points out,</p><div class="pullquote"><p>"It's not information overload. It's filter failure." </p></div><p>But what if there was a way to curate you&#8230;</p>
      <p>
          <a href="https://mlnotes.substack.com/p/how-to-build-a-personalized-ai-telegram">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Branding Masterclass for AI Founders: Why Efficiency Isn't Enough]]></title><description><![CDATA["You could stop a thousand people on the street and ask them what they want in life and no one is going to say, 'I really wish I was more efficient.'" - Cara Smith]]></description><link>https://mlnotes.substack.com/p/branding-masterclass-for-ai-founders</link><guid isPermaLink="false">https://mlnotes.substack.com/p/branding-masterclass-for-ai-founders</guid><dc:creator><![CDATA[Angelina Yang]]></dc:creator><pubDate>Wed, 14 May 2025 01:11:00 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!SEqx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685e401e-a3de-4302-a8a2-a7429c6a5445_1512x844.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="pullquote"><p>"You could stop a thousand people on the street and ask them what they want in life and no one is going to say, 'I really wish I was more efficient.'" - Cara Smith</p></div><p>Today, founders are racing to create the next groundbreaking product. 
</p><p>But amidst the frenzy of development and feature optimization, many are falling into a dangerous trap: <strong>the efficiency obsession.</strong> </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SEqx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685e401e-a3de-4302-a8a2-a7429c6a5445_1512x844.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SEqx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685e401e-a3de-4302-a8a2-a7429c6a5445_1512x844.png 424w, https://substackcdn.com/image/fetch/$s_!SEqx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685e401e-a3de-4302-a8a2-a7429c6a5445_1512x844.png 848w, https://substackcdn.com/image/fetch/$s_!SEqx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685e401e-a3de-4302-a8a2-a7429c6a5445_1512x844.png 1272w, https://substackcdn.com/image/fetch/$s_!SEqx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685e401e-a3de-4302-a8a2-a7429c6a5445_1512x844.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SEqx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685e401e-a3de-4302-a8a2-a7429c6a5445_1512x844.png" width="1456" height="813" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/685e401e-a3de-4302-a8a2-a7429c6a5445_1512x844.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:429596,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/163446358?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685e401e-a3de-4302-a8a2-a7429c6a5445_1512x844.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SEqx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685e401e-a3de-4302-a8a2-a7429c6a5445_1512x844.png 424w, https://substackcdn.com/image/fetch/$s_!SEqx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685e401e-a3de-4302-a8a2-a7429c6a5445_1512x844.png 848w, https://substackcdn.com/image/fetch/$s_!SEqx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685e401e-a3de-4302-a8a2-a7429c6a5445_1512x844.png 1272w, https://substackcdn.com/image/fetch/$s_!SEqx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F685e401e-a3de-4302-a8a2-a7429c6a5445_1512x844.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Just an example</figcaption></figure></div><p>What surprised me was how often this focus on productivity gains turns out to be myopic and limiting. It can even be fatal for your brand. Here&#8217;s what I&#8217;ve learned about why <strong>efficiency alone</strong> won't cut it, and how AI founders can craft brands that truly resonate with their audience.</p><h2>People Buy Feelings, Not Features</h2><p>When it comes to building a brand that sticks, <strong>emotion</strong> is your secret weapon. Cara Smith, co-founder of branding agency <em>Smith and Diction</em>, puts it bluntly: </p><blockquote><p><em>"To me, the main difference between fine copy and good copy is what's the emotion? What's the feeling that's behind it?" 
</em></p></blockquote><p>This insight is crucial for AI founders who often get caught up in the technical capabilities of their products.</p><p>Consider this: while your AI might be able to process tasks 10x faster than a human, that fact alone won't make someone fall in love with your brand (although Garry from Y Combinator would say that&#8217;s a must-have). </p><p>People don't wake up in the morning dreaming about being more efficient. They dream about feeling empowered, creative, or connected. Your branding needs to tap into these deeper human desires.</p><h2>Your Brand Is More Than Just Your Logo: It's How People See You</h2><blockquote><p><em>"Everybody has a brand. Brand is simply how people see you." </em></p></blockquote><p>This perspective shift is vital for AI founders who might be tempted to think branding is just about creating a sleek logo or choosing the right color palette.</p><p>Your brand encompasses every interaction a person has with your company. It's the tone of your customer service emails, the user experience of your app, and yes, even the efficiency of your AI. But it's also about the emotions you evoke and the stories you tell.</p><p>For AI founders, this means considering how your technology makes people feel, not just what it can do.</p><h2>Crafting a Verbal Identity That Resonates Beyond Efficiency</h2><p>One of the most powerful tools in your branding arsenal is your <strong>verbal identity</strong>. </p><blockquote><p><em>"Verbal identity is once you cross that line between talking to your internal audience or maybe your partners and talking to your external audience, right? So that's what is your website going to say? What's the tone that you use?"</em></p></blockquote><p>This goes far beyond a catchy tagline. It's about developing a consistent voice and personality that speaks directly to your audience's needs and aspirations.</p><p>Take Gamma, an AI-powered presentation tool. 
Instead of focusing solely on how quickly it can create slides, their branding revolves around the idea of "<em>get your ideas out there.</em>" </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Paa6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fe59317-79aa-4aa6-a22f-99a56ddc3df6_2770x1188.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Paa6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fe59317-79aa-4aa6-a22f-99a56ddc3df6_2770x1188.png 424w, https://substackcdn.com/image/fetch/$s_!Paa6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fe59317-79aa-4aa6-a22f-99a56ddc3df6_2770x1188.png 848w, https://substackcdn.com/image/fetch/$s_!Paa6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fe59317-79aa-4aa6-a22f-99a56ddc3df6_2770x1188.png 1272w, https://substackcdn.com/image/fetch/$s_!Paa6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fe59317-79aa-4aa6-a22f-99a56ddc3df6_2770x1188.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Paa6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fe59317-79aa-4aa6-a22f-99a56ddc3df6_2770x1188.png" width="1456" height="624" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3fe59317-79aa-4aa6-a22f-99a56ddc3df6_2770x1188.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:624,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2380584,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/163446358?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fe59317-79aa-4aa6-a22f-99a56ddc3df6_2770x1188.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Paa6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fe59317-79aa-4aa6-a22f-99a56ddc3df6_2770x1188.png 424w, https://substackcdn.com/image/fetch/$s_!Paa6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fe59317-79aa-4aa6-a22f-99a56ddc3df6_2770x1188.png 848w, https://substackcdn.com/image/fetch/$s_!Paa6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fe59317-79aa-4aa6-a22f-99a56ddc3df6_2770x1188.png 1272w, https://substackcdn.com/image/fetch/$s_!Paa6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fe59317-79aa-4aa6-a22f-99a56ddc3df6_2770x1188.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This speaks to a deeper desire of their target audience &#8211; educators and non-designers who have valuable ideas but struggle with the technical aspects of presentation creation.</p><p>Developing a strong verbal identity allows you to maintain a consistent voice across all platforms, helping to build trust and recognition with your audience. </p><p>For AI companies, this might mean finding ways to humanize complex technology or articulating your ethical stance on AI development.</p><h2>Move Beyond Generic AI Aesthetics and Use Design That Tells a Story</h2><blockquote><p>"<em>I always just try to make sure that our logos have like a 10% strange thing about them,</em>" says Mike Smith, discussing his approach to logo design. 
</p></blockquote><p>In a sea of generic AI company logos featuring abstract geometric shapes or glowing orbs, this philosophy of intentional uniqueness can set you apart.</p><p>Your visual identity should tell a story about who you are and what you stand for. It's not just about looking sleek or futuristic; it's about the story behind it.</p><p>Take the example of Perplexity AI. Their verbal identity focused on the concept of "where knowledge begins," positioning their product not just as an answer machine, but as a gateway to deeper understanding.</p><p>As its designer explained:</p><blockquote><p><em>"Perplexity is now your starting point and then you can click any of those citation links and go anywhere you want to go."</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!B8z4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54a5f2-5bec-4a0b-8e08-671a9a63f1e4_300x168.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!B8z4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54a5f2-5bec-4a0b-8e08-671a9a63f1e4_300x168.png 424w, https://substackcdn.com/image/fetch/$s_!B8z4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54a5f2-5bec-4a0b-8e08-671a9a63f1e4_300x168.png 848w, https://substackcdn.com/image/fetch/$s_!B8z4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54a5f2-5bec-4a0b-8e08-671a9a63f1e4_300x168.png 1272w, 
https://substackcdn.com/image/fetch/$s_!B8z4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54a5f2-5bec-4a0b-8e08-671a9a63f1e4_300x168.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!B8z4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54a5f2-5bec-4a0b-8e08-671a9a63f1e4_300x168.png" width="300" height="168" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eb54a5f2-5bec-4a0b-8e08-671a9a63f1e4_300x168.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:168,&quot;width&quot;:300,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Perplexity AI launches its own Deep ...&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Perplexity AI launches its own Deep ..." title="Perplexity AI launches its own Deep ..." 
srcset="https://substackcdn.com/image/fetch/$s_!B8z4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54a5f2-5bec-4a0b-8e08-671a9a63f1e4_300x168.png 424w, https://substackcdn.com/image/fetch/$s_!B8z4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54a5f2-5bec-4a0b-8e08-671a9a63f1e4_300x168.png 848w, https://substackcdn.com/image/fetch/$s_!B8z4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54a5f2-5bec-4a0b-8e08-671a9a63f1e4_300x168.png 1272w, https://substackcdn.com/image/fetch/$s_!B8z4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feb54a5f2-5bec-4a0b-8e08-671a9a63f1e4_300x168.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>And that&#8217;s why their logo is a spinning door: it visually represents their mission of opening doors to knowledge.</p><p>For AI founders, this might mean moving beyond clich&#233; tech imagery and finding visual metaphors that represent the unique value your AI brings to users' lives. </p><blockquote><p><em> "As long as the logo can kind of tell a story and like convey an emotion or convey a strategy or something like that, that's when we know we're on to something."</em></p></blockquote><h2>Understanding Your Audience: The Key to Authentic Connection</h2><p>Creating a brand that resonates requires a deep understanding of your audience. This goes beyond basic demographics. 
You need to get inside their heads, understand their daily lives, their aspirations, and their pain points.</p><blockquote><p> <em>"The more time you just really spend treating your audience like almost like you're in a novel or you're in a movie and you're putting yourself in the shoes of that character, then that allows you to really think, okay, how do I talk to them in a way that's going to matter to them, not necessarily matter to me."</em></p></blockquote><p>We also talked about a common pitfall for AI founders: focusing too much on efficiency. As Cara puts it:</p><blockquote><p>"I see a ton of ads, especially in the AI space, that are all about efficiency and I just think it's ridiculous."</p></blockquote><p>Therefore, instead of focusing solely on how your AI can make processes more efficient, ask yourself these critical questions:</p>
      <p>
          <a href="https://mlnotes.substack.com/p/branding-masterclass-for-ai-founders">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Is Your Career AI-Ready for 2025? ]]></title><description><![CDATA[Insights from CTO of AI HRtech]]></description><link>https://mlnotes.substack.com/p/is-your-career-ai-ready-for-2025</link><guid isPermaLink="false">https://mlnotes.substack.com/p/is-your-career-ai-ready-for-2025</guid><dc:creator><![CDATA[Angelina Yang]]></dc:creator><pubDate>Fri, 09 May 2025 03:41:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/d4eYzmXhNV8" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="pullquote"><p><strong>&#8220;AI is definitely impacting the job market.&#8221;</strong></p></div><p>As much as we don&#8217;t want to believe that AI is impacting us, </p><h3>it is. </h3><p>I&#8217;m floored by what Jerry said and, at the same time, not so surprised. </p><p>Heading into next year, the question on our minds should be: </p><h4><strong>&#8220;Is my career AI-ready?&#8221;</strong></h4><p>To answer this, we turn to Jerry Wang, a visionary in the AI-powered hiring space, for h&#8230;</p>
      <p>
          <a href="https://mlnotes.substack.com/p/is-your-career-ai-ready-for-2025">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[AI is Rewriting the Rules of Pricing – And You Need to Pay Attention]]></title><description><![CDATA[Insights from Manny Medina, Founder of Paid]]></description><link>https://mlnotes.substack.com/p/ai-is-rewriting-the-rules-of-pricing</link><guid isPermaLink="false">https://mlnotes.substack.com/p/ai-is-rewriting-the-rules-of-pricing</guid><dc:creator><![CDATA[Angelina Yang]]></dc:creator><pubDate>Tue, 06 May 2025 02:04:23 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!tolb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21378061-2ddd-4f48-9284-103e5bfefb75_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div class="pullquote"><p><strong>"Your customer will always default to the easiest way to buy, which is either some kind of fixed price or a consumption price for the first year to see if it works."</strong></p></div><p>Gone are the days when AI companies could simply charge by the token or API call. The market is maturing, and customer expectations are evolving with it. </p><p>As Manny Medina, founder of Paid, puts it:</p><blockquote><p><em>&#8220;If it does work, it is up to the AI agent builder and creator to go back to the same customer and say, 'Let's align on things that are important to you and charge for it.'&#8221;</em></p></blockquote><p>This statement encapsulates the core of what&#8217;s happening in <strong>pricing</strong> for AI products: <strong>a move from charging for inputs to charging for outcomes</strong>. 
</p><p>But why is this shift happening, and what does it mean for the AI ecosystem?</p><h2>The Four Pillars of AI Pricing: A New Framework for Value</h2><p>There are four main approaches to pricing that are currently working in the AI space:</p><ol><li><p><strong>Activity-based pricing</strong></p></li><li><p><strong>Workflow-based pricing</strong></p></li><li><p><strong>Outcome-based pricing</strong></p></li><li><p><strong>Agent-based pricing</strong></p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tolb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21378061-2ddd-4f48-9284-103e5bfefb75_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tolb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21378061-2ddd-4f48-9284-103e5bfefb75_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!tolb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21378061-2ddd-4f48-9284-103e5bfefb75_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!tolb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21378061-2ddd-4f48-9284-103e5bfefb75_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!tolb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21378061-2ddd-4f48-9284-103e5bfefb75_1024x1024.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!tolb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21378061-2ddd-4f48-9284-103e5bfefb75_1024x1024.png" width="1024" height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/21378061-2ddd-4f48-9284-103e5bfefb75_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Generated image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Generated image" title="Generated image" srcset="https://substackcdn.com/image/fetch/$s_!tolb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21378061-2ddd-4f48-9284-103e5bfefb75_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!tolb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21378061-2ddd-4f48-9284-103e5bfefb75_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!tolb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21378061-2ddd-4f48-9284-103e5bfefb75_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!tolb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F21378061-2ddd-4f48-9284-103e5bfefb75_1024x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 
pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Each of these approaches represents a step towards more sophisticated, value-aligned pricing models. Let's break them down:</p><h3>Activity-Based Pricing: The Starting Point</h3><p>Activity-based pricing is the most straightforward approach. It's essentially a credit consumption model where customers pay for the specific actions or computations performed by the AI. While this model is easy to understand and implement, it's also the most vulnerable to commoditization.</p><p>Medina warns: </p><blockquote><p><em>"If you stay there somebody will come along and say I'll do the same thing for cheaper. Yeah. 
And then you are in a nightmare scenario in which there is tons of others who look just like you."</em></p></blockquote><h3>Workflow-Based Pricing: Stepping Towards Value</h3><p>Workflow-based pricing involves charging for a series of connected activities that deliver a specific result. </p><p>This model begins to align pricing with the value delivered to the customer. As Medina explains:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bepf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9d63168-1a67-41b9-ac5e-4a01f1d676f0_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bepf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9d63168-1a67-41b9-ac5e-4a01f1d676f0_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Bepf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9d63168-1a67-41b9-ac5e-4a01f1d676f0_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Bepf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9d63168-1a67-41b9-ac5e-4a01f1d676f0_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Bepf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9d63168-1a67-41b9-ac5e-4a01f1d676f0_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bepf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9d63168-1a67-41b9-ac5e-4a01f1d676f0_1024x1024.png" width="1024" 
height="1024" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e9d63168-1a67-41b9-ac5e-4a01f1d676f0_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Generated image&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Generated image" title="Generated image" srcset="https://substackcdn.com/image/fetch/$s_!Bepf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9d63168-1a67-41b9-ac5e-4a01f1d676f0_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Bepf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9d63168-1a67-41b9-ac5e-4a01f1d676f0_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Bepf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9d63168-1a67-41b9-ac5e-4a01f1d676f0_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Bepf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9d63168-1a67-41b9-ac5e-4a01f1d676f0_1024x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" 
stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><blockquote><p><em>"When you string a number of activities and you say this workflow costs this much, like a document review, right? Because then you can separate documents that are small from documents that are long and complicated, because they have different consumption patterns, and it feels like you're getting closer to value-based pricing as opposed to cost-based pricing."</em></p></blockquote><h3>Outcome-Based Pricing: Aligning Incentives</h3><p>Outcome-based pricing is where things get really interesting. Instead of charging for the work done, companies charge based on the results achieved. 
This could be in the form of a bonus for reaching certain performance metrics or a fee structure tied directly to customer outcomes.</p><p>Medina suggests an innovative approach: </p><blockquote><p><em>"What I've been recommending to my customer, it's not out there yet, but I'm going to give it a push, is, instead of charging per outcome, get an outcome bonus."</em></p></blockquote><p>This model creates a powerful alignment between the AI provider and the customer, ensuring that both parties are invested in achieving meaningful results.</p><h3>Agent-Based Pricing: The Future of AI Workforce</h3><p>Agent-based pricing is perhaps <strong>the most forward-thinking model.</strong> </p><p>It involves charging for AI agents as if they were human employees, based on the work they perform and the outcomes they achieve.</p><p>Medina elaborates: </p><blockquote><p><em>"You can pay, instead of saying a platform fee, say, like, you know, I'm going to deploy x many agents; the agent is going to do this amount of work that is equivalent of a $90,000 a year, uh, sr; I'm going to charge you 20,000 per agent; the agent is going to deliver this much work, and you can pay me a bonus for the number of meetings booked."</em></p></blockquote><p>This approach allows AI companies to tap into HR budgets rather than competing for limited software budgets, potentially opening up larger revenue streams.</p><h2>Lack of visibility into unit economics is a ticking time bomb for many AI startups. Why?</h2>
      <p>
          <a href="https://mlnotes.substack.com/p/ai-is-rewriting-the-rules-of-pricing">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[RAG’s Hidden Power-Ups for 2025: Sentence Window Retrieval, Meta-data Filtering and More]]></title><description><![CDATA[- Part 2]]></description><link>https://mlnotes.substack.com/p/rags-hidden-power-ups-for-2025-sentence</link><guid isPermaLink="false">https://mlnotes.substack.com/p/rags-hidden-power-ups-for-2025-sentence</guid><dc:creator><![CDATA[Angelina Yang]]></dc:creator><pubDate>Thu, 01 May 2025 04:50:29 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!wN79!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F934ce720-1e8f-43a2-9b41-f066f4aef31d_1766x1060.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Last week, we introduced <strong>query enhancement techniques</strong> and <strong>indexing enhancement techniques</strong> to power up your vanilla RAG system. (You can revisit the details below!)</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;b2c45847-50f3-4e1f-8d77-fbebe5a6de6d&quot;,&quot;caption&quot;:&quot;As you may know already,&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;showDescription&quot;:true,&quot;showImage&quot;:true,&quot;size&quot;:&quot;lg&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Fake It Till You Make It: HyDE, Step-Back Prompts, Hybrid Search, and More&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:70767577,&quot;name&quot;:&quot;Angelina Yang&quot;,&quot;bio&quot;:&quot;Co-founder at Oscr 
AI&quot;,&quot;photo_url&quot;:&quot;https://bucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com/public/images/235cf3f4-9e0a-471f-b1ec-234f3e808f9f_1024x1024.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-04-17T05:26:02.317Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce45a431-9580-40c1-8ed5-042f84fa654e_1914x1150.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://mlnotes.substack.com/p/fake-it-till-you-make-it-hyde-step&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:161515483,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:0,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;The MLnotes Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F88ea2641-8482-4fd4-95b6-f0d56d807c5c_1280x1280.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>This week, we&#8217;ll explore another set of techniques that includes retriever, generator, and general pipeline enhancement. </p><h2>Why Your AI Might Be Suffering from "Middle Child Syndrome"</h2><p>Remember &#8230;</p>
      <p>
          <a href="https://mlnotes.substack.com/p/rags-hidden-power-ups-for-2025-sentence">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA["Creativity Is Doing More Than The First Thing You Think Of"]]></title><description><![CDATA[Rethink how we collaborate with AI]]></description><link>https://mlnotes.substack.com/p/creativity-is-doing-more-than-the</link><guid isPermaLink="false">https://mlnotes.substack.com/p/creativity-is-doing-more-than-the</guid><dc:creator><![CDATA[Angelina Yang]]></dc:creator><pubDate>Tue, 29 Apr 2025 01:30:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!_mIx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d231fc6-15e8-4d60-bbc1-d735af7d52d6_2566x1302.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>A seventh-grader in Ohio once wrote on a post-it note: </p><div class="pullquote"><p><strong>"Creativity is doing more than the first thing you think of."</strong> </p></div><p>It&#8217;s probably my favorite definition of creativity &#8212; simple, profound, and surprisingly relevant today.</p><p>At countless AI events I&#8217;ve attended, one question always comes up:</p><p><strong>&#8220;Will AI replace us?&#8221;</strong></p><p>My answer has always been: <strong>&#8220;No. 
AI will make us all creators.&#8221;</strong></p><p>But recently, after reading the research on AI&#8217;s impact on human creativity and seeing firsthand how people are actually using these tools, I&#8217;ve started to rethink not just what creativity means, but how we can <em>outgrow</em> the older versions of ourselves <em>with</em> AI, instead of fearing being replaced by it.</p><p>In a world shaped more and more by artificial intelligence, it&#8217;s never been more important (or more inspiring) to learn how to collaborate with AI, and to use it wisely as we move into the future.</p><h2>The Churchill Conundrum: From Bathtub Brilliance to AI Assistance</h2><p>Imagine Winston Churchill, lounging in his bathtub, dictating a national address to his assistant in the next room. He shouts corrections and refinements, pushing beyond his initial thoughts to craft a message that will resonate with the nation. <a href="https://youtube.com/clip/UgkxsdlR-_TnN_M6MzAMdv0AsvXVBZGCtUc7?si=4aBFRrysuSMpBP9f">This scene</a>, immortalized in Albert Finney's portrayal, represents a level of creative collaboration that was once reserved for the privileged few.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_mIx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d231fc6-15e8-4d60-bbc1-d735af7d52d6_2566x1302.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_mIx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d231fc6-15e8-4d60-bbc1-d735af7d52d6_2566x1302.png 424w, 
https://substackcdn.com/image/fetch/$s_!_mIx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d231fc6-15e8-4d60-bbc1-d735af7d52d6_2566x1302.png 848w, https://substackcdn.com/image/fetch/$s_!_mIx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d231fc6-15e8-4d60-bbc1-d735af7d52d6_2566x1302.png 1272w, https://substackcdn.com/image/fetch/$s_!_mIx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d231fc6-15e8-4d60-bbc1-d735af7d52d6_2566x1302.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_mIx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d231fc6-15e8-4d60-bbc1-d735af7d52d6_2566x1302.png" width="1456" height="739" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9d231fc6-15e8-4d60-bbc1-d735af7d52d6_2566x1302.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:739,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2229985,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/162363871?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d231fc6-15e8-4d60-bbc1-d735af7d52d6_2566x1302.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!_mIx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d231fc6-15e8-4d60-bbc1-d735af7d52d6_2566x1302.png 424w, https://substackcdn.com/image/fetch/$s_!_mIx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d231fc6-15e8-4d60-bbc1-d735af7d52d6_2566x1302.png 848w, https://substackcdn.com/image/fetch/$s_!_mIx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d231fc6-15e8-4d60-bbc1-d735af7d52d6_2566x1302.png 1272w, https://substackcdn.com/image/fetch/$s_!_mIx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9d231fc6-15e8-4d60-bbc1-d735af7d52d6_2566x1302.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Fast forward to today, and we find ourselves in a world where, as Jeremy Utley, adjunct professor of creativity and AI at Stanford University, puts it, "<em>the poorest villager in Palo Alto can have what only Winston Churchill used to have.</em>" We now have AI assistants that can understand our context, voice, and intent, ready to collaborate with us in our creative endeavors.</p><h2>How Should We Treat AI: As a Tool or &#8230;?</h2><p>As we embrace this new era of AI-assisted creativity, we face a crucial decision: </p><div class="pullquote"><p>do we treat AI as a mere tool, or do we think of it as a <strong>teammate</strong>? </p></div><p>This choice can dramatically impact our creative output and productivity.</p><h3>Why Treating AI as a Tool Limits Your Creative Potential</h3><p>Research has revealed a surprising finding: while AI can make people 25% faster and 12% more productive, and improve work quality by 40%, fewer than 10% of professionals are actually reaping these benefits. 
</p><h3><strong>In many cases, AI made people less creative.</strong> <strong>Why?</strong></h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wT21!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c797b8-b215-4907-83ad-9fdab55ef33f_1994x858.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wT21!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c797b8-b215-4907-83ad-9fdab55ef33f_1994x858.png 424w, https://substackcdn.com/image/fetch/$s_!wT21!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c797b8-b215-4907-83ad-9fdab55ef33f_1994x858.png 848w, https://substackcdn.com/image/fetch/$s_!wT21!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c797b8-b215-4907-83ad-9fdab55ef33f_1994x858.png 1272w, https://substackcdn.com/image/fetch/$s_!wT21!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c797b8-b215-4907-83ad-9fdab55ef33f_1994x858.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wT21!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c797b8-b215-4907-83ad-9fdab55ef33f_1994x858.png" width="1994" height="858" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/14c797b8-b215-4907-83ad-9fdab55ef33f_1994x858.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:858,&quot;width&quot;:1994,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:635280,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://mlnotes.substack.com/i/162363871?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65263323-7a8d-46d3-b11c-ac0fe4f90fe3_1994x1124.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wT21!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c797b8-b215-4907-83ad-9fdab55ef33f_1994x858.png 424w, https://substackcdn.com/image/fetch/$s_!wT21!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c797b8-b215-4907-83ad-9fdab55ef33f_1994x858.png 848w, https://substackcdn.com/image/fetch/$s_!wT21!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c797b8-b215-4907-83ad-9fdab55ef33f_1994x858.png 1272w, https://substackcdn.com/image/fetch/$s_!wT21!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14c797b8-b215-4907-83ad-9fdab55ef33f_1994x858.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>What&#8217;s your guess about the results of the experiment above?</h3>
      <p>
          <a href="https://mlnotes.substack.com/p/creativity-is-doing-more-than-the">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[2025 is NOT Going to be The Year of AI Agent, and Here's Why. ]]></title><description><![CDATA[The Long and Winding Road to Agentic AI: Separating Hype from Enterprise Reality]]></description><link>https://mlnotes.substack.com/p/2025-is-not-going-to-be-the-year</link><guid isPermaLink="false">https://mlnotes.substack.com/p/2025-is-not-going-to-be-the-year</guid><dc:creator><![CDATA[Angelina Yang]]></dc:creator><pubDate>Tue, 22 Apr 2025 06:13:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!l3xk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d4ec4-5ccd-44ca-8477-0ff02b68a7bb_960x540.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>The AI hype machine is in overdrive, but most enterprises aren't ready for prime time</h2><p>Artificial intelligence is evolving at a breakneck pace. Every week seems to bring breathless announcements of new AI breakthroughs and capabilities. Amid this frenzy, <strong>agentic AI</strong> - autonomous software agents that can orchestrate complex tasks - has emerged as the next frontier.</p><p>But while the potential of agentic AI is immense, the reality for most enterprises is far more sobering. Despite the hype, true agentic AI remains years away for all but the most advanced organisations. </p><p>As Dave Vellante of <a href="https://thecuberesearch.com/">theCUBE Research</a> puts it, </p><blockquote><p><em>"2025 is not going to be the year of the agent. It'll be the year of agent marketing." 
</em></p></blockquote><h2>Enterprises are leaning in, but it's still early days for AI adoption</h2><p>Recent survey data from <a href="https://etr.ai/">ETR</a> paints a picture of cautious but growing enterprise AI adoption:</p><ul><li><p><strong>80% of enterprises</strong> are now paying for AI in some form</p></li><li><p><strong>Nearly two-thirds</strong> are tapping into AI APIs</p></li><li><p><strong>39%</strong> are experimenting with open-source models</p></li><li><p><strong>27%</strong> are training proprietary models on-premises</p></li></ul><p>These numbers show enterprises aren't sitting on their hands when it comes to AI. But they're a far cry from the agentic vision being hyped by many vendors.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l3xk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d4ec4-5ccd-44ca-8477-0ff02b68a7bb_960x540.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l3xk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d4ec4-5ccd-44ca-8477-0ff02b68a7bb_960x540.jpeg 424w, https://substackcdn.com/image/fetch/$s_!l3xk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d4ec4-5ccd-44ca-8477-0ff02b68a7bb_960x540.jpeg 848w, https://substackcdn.com/image/fetch/$s_!l3xk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d4ec4-5ccd-44ca-8477-0ff02b68a7bb_960x540.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!l3xk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d4ec4-5ccd-44ca-8477-0ff02b68a7bb_960x540.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!l3xk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d4ec4-5ccd-44ca-8477-0ff02b68a7bb_960x540.jpeg" width="960" height="540" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/164d4ec4-5ccd-44ca-8477-0ff02b68a7bb_960x540.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:540,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l3xk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d4ec4-5ccd-44ca-8477-0ff02b68a7bb_960x540.jpeg 424w, https://substackcdn.com/image/fetch/$s_!l3xk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d4ec4-5ccd-44ca-8477-0ff02b68a7bb_960x540.jpeg 848w, https://substackcdn.com/image/fetch/$s_!l3xk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d4ec4-5ccd-44ca-8477-0ff02b68a7bb_960x540.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!l3xk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F164d4ec4-5ccd-44ca-8477-0ff02b68a7bb_960x540.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: See Footnote</figcaption></figure></div><h2>The AI infrastructure boom is real, but it's a tale of two speeds</h2><p>Make no mistake - AI is driving massive infrastructure investment. As Dave Vellante notes in his recent analysis:</p><blockquote><p><em>"The data center super cycle kicked in in earnest in 2024. 
What was essentially a perpetually roughly $200 billion business...exploded in 2024...to 350 billion. That's a 58% growth rate in a single year, which is going to continue at an accelerated pace for a decade."</em></p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FSKE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bd51c3-e88b-4b0e-97b2-ef8eb64d8c8e_960x540.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FSKE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bd51c3-e88b-4b0e-97b2-ef8eb64d8c8e_960x540.jpeg 424w, https://substackcdn.com/image/fetch/$s_!FSKE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bd51c3-e88b-4b0e-97b2-ef8eb64d8c8e_960x540.jpeg 848w, https://substackcdn.com/image/fetch/$s_!FSKE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bd51c3-e88b-4b0e-97b2-ef8eb64d8c8e_960x540.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!FSKE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bd51c3-e88b-4b0e-97b2-ef8eb64d8c8e_960x540.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FSKE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bd51c3-e88b-4b0e-97b2-ef8eb64d8c8e_960x540.jpeg" width="960" height="540" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d2bd51c3-e88b-4b0e-97b2-ef8eb64d8c8e_960x540.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:540,&quot;width&quot;:960,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FSKE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bd51c3-e88b-4b0e-97b2-ef8eb64d8c8e_960x540.jpeg 424w, https://substackcdn.com/image/fetch/$s_!FSKE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bd51c3-e88b-4b0e-97b2-ef8eb64d8c8e_960x540.jpeg 848w, https://substackcdn.com/image/fetch/$s_!FSKE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bd51c3-e88b-4b0e-97b2-ef8eb64d8c8e_960x540.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!FSKE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2bd51c3-e88b-4b0e-97b2-ef8eb64d8c8e_960x540.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Source: Footnote</figcaption></figure></div><p>But this boom is playing out very differently for cloud providers versus traditional enterprises:</p><ul><li><p>Cloud titans like AWS, Google, Microsoft and Meta are pouring over $300 billion into AI data centers this year alone</p></li><li><p>On-premises enterprise AI spending is growing, but at a much slower pace</p></li></ul><p>This bifurcation means cloud AI adoption is rocketing ahead, while on-premises enterprise AI won't reach "escape velocity" until later this decade.</p><h2>2025 is the year of agent marketing, not agent reality</h2><p>Why? 
Because the vast majority of enterprises still lack the foundational capabilities needed to effectively deploy and manage autonomous AI agents:</p><blockquote><p><em>"Our research shows that more than 85% of enterprises still need major upgrades in data quality, and the other 15% probably don't know that they do," says Vellante.</em></p></blockquote><p>The harsh reality is that most companies are still grappling with basic data management issues. As one example shared by the researchers, "<em>one bank found 6,000 table entries that defined what a customer was.</em>" If you can't even agree on what constitutes a customer, how can you possibly expect AI agents to make accurate, business-critical decisions?</p><p>As Vellante warns,</p><blockquote><p><em>"agents hallucinate when the ground truth is fuzzy."</em> </p></blockquote><p>In other words, garbage in, garbage out &#8211; but now with potentially catastrophic consequences as AI agents act on flawed information.</p><p>This doesn't mean AI won't deliver value in the near-term. Co-pilots and other assistive AI will drive real productivity gains. But the grand vision of fully autonomous agents orchestrating complex business processes? That's still years away for most.</p><h2>There are three critical areas where enterprises are falling short in their readiness for agentic AI:</h2><h3>1. Data quality and lineage remain massive hurdles</h3><p>AI agents need high-quality, well-structured data to function effectively. But most enterprises are drowning in data silos and inconsistencies.</p><h3>2. Integration and orchestration are far from seamless</h3><p>Vendors love to tout "<em>seamless integration</em>" for their AI solutions. The reality is far messier. Decades of patchwork enterprise integration efforts have left a tangled web that autonomous agents can't easily navigate.</p><p>Some SaaS vendors are making progress within their own domains. But cross-application integration remains a major challenge. 
The risk is ending up with siloed AI agents that can't effectively collaborate across business functions.</p><p><strong>What&#8217;s even more important, yet very few vendors factor it into their plans, is&#8230;</strong> </p>
      <p>
          <a href="https://mlnotes.substack.com/p/2025-is-not-going-to-be-the-year">
              Read more
          </a>
      </p>
   ]]></content:encoded></item><item><title><![CDATA[Fake It Till You Make It: HyDE, Step-Back Prompts, Hybrid Search, and More]]></title><description><![CDATA[As you may know already,]]></description><link>https://mlnotes.substack.com/p/fake-it-till-you-make-it-hyde-step</link><guid isPermaLink="false">https://mlnotes.substack.com/p/fake-it-till-you-make-it-hyde-step</guid><dc:creator><![CDATA[Angelina Yang]]></dc:creator><pubDate>Thu, 17 Apr 2025 05:26:02 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Xwbd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fce45a431-9580-40c1-8ed5-042f84fa654e_1914x1150.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>As you may know already,</p><div class="pullquote"><p><strong>"There are a lot of issues with vanilla RAG."</strong></p></div><p>Are you tired of AI giving you half-baked answers? Frustrated with language models that seem to know everything but your specific data? </p><p>You're not alone. While Retrieval Augmented Generation (RAG) promised to bridge the gap between AI and your proprietary information, many are finding &#8230;</p>
      <p>
          <a href="https://mlnotes.substack.com/p/fake-it-till-you-make-it-hyde-step">
              Read more
          </a>
      </p>
   ]]></content:encoded></item></channel></rss>