<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[💎DiamantAI]]></title><description><![CDATA[DiamantAI is the top 0.1% newsletter for staying ahead in AI, uncovering the latest techniques, breakthroughs, insights, and unique tutorials.]]></description><link>https://newsletter.diamant-ai.com</link><image><url>https://substackcdn.com/image/fetch/$s_!72Rv!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84bf24d0-f0ec-49fc-8e8f-800eec27706d_1280x1280.png</url><title>💎DiamantAI</title><link>https://newsletter.diamant-ai.com</link></image><generator>Substack</generator><lastBuildDate>Thu, 11 Jun 2026 19:21:49 GMT</lastBuildDate><atom:link href="https://newsletter.diamant-ai.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[DiamantAI]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[diamantai@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[diamantai@substack.com]]></itunes:email><itunes:name><![CDATA[Nir Diamant]]></itunes:name></itunes:owner><itunes:author><![CDATA[Nir Diamant]]></itunes:author><googleplay:owner><![CDATA[diamantai@substack.com]]></googleplay:owner><googleplay:email><![CDATA[diamantai@substack.com]]></googleplay:email><googleplay:author><![CDATA[Nir Diamant]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Building with AI is easy. 
Shipping is hard]]></title><description><![CDATA[What separates an AI prototype from software you can ship, and the method I spent the last few months turning into a course]]></description><link>https://newsletter.diamant-ai.com/p/building-with-ai-is-easy-shipping</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/building-with-ai-is-easy-shipping</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Wed, 10 Jun 2026 21:28:56 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!TTQP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60cb6fff-bdcb-4a9d-a662-5ce025161420_1000x563.gif" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hello everyone,</p><p>Quick one today, because it&#8217;s news. For the past few months, my co-founder and I have been working day and night on something I&#8217;ve never offered before, and you&#8217;re hearing about it first.</p><p>Building a prototype with AI is easy. Taking it to production is hard. We show you how to get there with the right methodology.</p><p>It&#8217;s called <strong>Prompt to Production</strong>: a full course on building software with AI the way professionals do. 
16 lectures, each paired with its own hands-on lab, built to take you up gradually, holding your hand exactly as much as you need, from your first structured prompt to a working production system.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TTQP!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60cb6fff-bdcb-4a9d-a662-5ce025161420_1000x563.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TTQP!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60cb6fff-bdcb-4a9d-a662-5ce025161420_1000x563.gif 424w, https://substackcdn.com/image/fetch/$s_!TTQP!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60cb6fff-bdcb-4a9d-a662-5ce025161420_1000x563.gif 848w, https://substackcdn.com/image/fetch/$s_!TTQP!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60cb6fff-bdcb-4a9d-a662-5ce025161420_1000x563.gif 1272w, https://substackcdn.com/image/fetch/$s_!TTQP!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60cb6fff-bdcb-4a9d-a662-5ce025161420_1000x563.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TTQP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60cb6fff-bdcb-4a9d-a662-5ce025161420_1000x563.gif" width="1000" height="563" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/60cb6fff-bdcb-4a9d-a662-5ce025161420_1000x563.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:563,&quot;width&quot;:1000,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1189602,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://newsletter.diamant-ai.com/i/201498364?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60cb6fff-bdcb-4a9d-a662-5ce025161420_1000x563.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TTQP!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60cb6fff-bdcb-4a9d-a662-5ce025161420_1000x563.gif 424w, https://substackcdn.com/image/fetch/$s_!TTQP!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60cb6fff-bdcb-4a9d-a662-5ce025161420_1000x563.gif 848w, https://substackcdn.com/image/fetch/$s_!TTQP!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60cb6fff-bdcb-4a9d-a662-5ce025161420_1000x563.gif 1272w, https://substackcdn.com/image/fetch/$s_!TTQP!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F60cb6fff-bdcb-4a9d-a662-5ce025161420_1000x563.gif 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>It teaches the paradigms behind software that&#8217;s reliable, efficient, and modular, the way the strongest production teams build with AI today. An AI coach works with you inside your own terminal the whole way, and you finish with two things: a method you&#8217;ll use every day, and a real product, deployed and live. About 15 focused hours, at your own pace.</p><p>Two things before it opens to the public:</p><p><strong>Ten people get free early access this month.</strong>  The full course, hands-on, in exchange for honest feedback. 
It&#8217;s an application, not first-come-first-served, and it&#8217;s inside the form below.</p><p><strong>Everyone on the waiting list locks in the founding price</strong>, lower than what the course will cost at public launch.</p><p>Joining takes about two minutes, and the short questionnaire inside directly shapes the final course and what early members get.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.diamant-ai.com/course-waitlist&quot;,&quot;text&quot;:&quot;Join The Waiting List&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.diamant-ai.com/course-waitlist"><span>Join The Waiting List</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!j1_h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb611ea25-3c8c-4100-8607-571c29eeaa46_1280x615.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!j1_h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb611ea25-3c8c-4100-8607-571c29eeaa46_1280x615.jpeg 424w, https://substackcdn.com/image/fetch/$s_!j1_h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb611ea25-3c8c-4100-8607-571c29eeaa46_1280x615.jpeg 848w, https://substackcdn.com/image/fetch/$s_!j1_h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb611ea25-3c8c-4100-8607-571c29eeaa46_1280x615.jpeg 1272w, 
https://substackcdn.com/image/fetch/$s_!j1_h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb611ea25-3c8c-4100-8607-571c29eeaa46_1280x615.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!j1_h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb611ea25-3c8c-4100-8607-571c29eeaa46_1280x615.jpeg" width="1280" height="615" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b611ea25-3c8c-4100-8607-571c29eeaa46_1280x615.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:615,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:38822,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://newsletter.diamant-ai.com/i/201498364?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb611ea25-3c8c-4100-8607-571c29eeaa46_1280x615.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!j1_h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb611ea25-3c8c-4100-8607-571c29eeaa46_1280x615.jpeg 424w, https://substackcdn.com/image/fetch/$s_!j1_h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb611ea25-3c8c-4100-8607-571c29eeaa46_1280x615.jpeg 848w, 
https://substackcdn.com/image/fetch/$s_!j1_h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb611ea25-3c8c-4100-8607-571c29eeaa46_1280x615.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!j1_h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb611ea25-3c8c-4100-8607-571c29eeaa46_1280x615.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><div class="subscription-widget-wrap-editor" 
data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading &#128142;DiamantAI! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[The Best Hacker Alive Is an AI. Anthropic Won't Let You Use It]]></title><description><![CDATA[Mythos finds and exploits zero-day flaws that humans missed for decades. Here's what it can really do, why only twelve companies can touch it, and the one move that still protects what you ship.]]></description><link>https://newsletter.diamant-ai.com/p/anthropic-built-an-ai-that-hacks</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/anthropic-built-an-ai-that-hacks</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Tue, 02 Jun 2026 22:23:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!zEnU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ea908d-fe91-4dff-9232-84f1f6b5b0fe_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kADC!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7edf5412-d069-4fa6-a5c2-12c27903ed27_1536x1024.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kADC!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7edf5412-d069-4fa6-a5c2-12c27903ed27_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!kADC!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7edf5412-d069-4fa6-a5c2-12c27903ed27_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!kADC!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7edf5412-d069-4fa6-a5c2-12c27903ed27_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!kADC!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7edf5412-d069-4fa6-a5c2-12c27903ed27_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kADC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7edf5412-d069-4fa6-a5c2-12c27903ed27_1536x1024.png" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7edf5412-d069-4fa6-a5c2-12c27903ed27_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2090968,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/200362817?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7edf5412-d069-4fa6-a5c2-12c27903ed27_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kADC!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7edf5412-d069-4fa6-a5c2-12c27903ed27_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!kADC!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7edf5412-d069-4fa6-a5c2-12c27903ed27_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!kADC!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7edf5412-d069-4fa6-a5c2-12c27903ed27_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!kADC!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7edf5412-d069-4fa6-a5c2-12c27903ed27_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><p><strong>A quick note before we dive in today!</strong> &#128075;</p><p>Two months ago, I released my book, <em><strong>RAG Made Simple</strong></em> - built on the foundation of my 28K-star GitHub repo - and it officially became an Amazon Bestseller!</p><p>To help me continue creating free content for you, I would be incredibly grateful if you could take 30 seconds to <strong><a href="https://www.amazon.com/RAG-Made-Simple-Retrieval-Augmented-Engineering-ebook/dp/B0D76734SZ">leave a review on Amazon</a></strong> if you&#8217;ve already read it.</p><p>Haven&#8217;t grabbed your copy yet? Feel free to check it out (it is extremely underpriced! &#128521;).</p><div><hr></div><p>Anthropic handed an AI a piece of code that had been sitting in the open since the 1990s. Audited for decades. 
Trusted by half the internet.</p><p>It found a way in that nobody had caught in 27 years. Then it wrote the break-in itself. Working. Start to finish. No human help.</p><p>The AI is called Mythos. And that ancient bug is the least frightening thing it did.</p><h2>It Didn&#8217;t Stop There</h2><p>Mythos didn&#8217;t find just one bug. Anthropic pointed it at the software the whole world runs on, and it just kept opening doors.</p><p>That 27-year-old hole was in the networking code of OpenBSD, the operating system everyone trusts to be bulletproof. Then came a 16-year-old flaw in FFmpeg, the video engine inside half the players you&#8217;ve ever touched. Then a hole in FreeBSD&#8217;s file sharing that hands a total stranger full control of the server with no password, now tracked as CVE-2026-4747. In the Linux kernel it didn&#8217;t even bother with single bugs. It chained two, three, four of them into one clean attack, the kind of move that takes a senior researcher weeks.</p><p>Then it stopped hunting bugs and went after a whole network.</p><p>Anthropic gave it a simulated company to break into. Thirty-two steps, the kind of job a human red team needs about twenty hours to finish. Mythos walked the entire chain, front door to crown jewels, on its own. Not once by luck. Over and over.</p><p>These aren&#8217;t typos it caught. 
These are flaws thousands of brilliant engineers stared straight at for decades and missed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qYw3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed15cce-e6b3-4500-b830-6c13c2a23ca9_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qYw3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed15cce-e6b3-4500-b830-6c13c2a23ca9_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!qYw3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed15cce-e6b3-4500-b830-6c13c2a23ca9_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!qYw3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed15cce-e6b3-4500-b830-6c13c2a23ca9_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!qYw3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed15cce-e6b3-4500-b830-6c13c2a23ca9_1536x1024.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!qYw3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed15cce-e6b3-4500-b830-6c13c2a23ca9_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8ed15cce-e6b3-4500-b830-6c13c2a23ca9_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2159909,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/200362817?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed15cce-e6b3-4500-b830-6c13c2a23ca9_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qYw3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed15cce-e6b3-4500-b830-6c13c2a23ca9_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!qYw3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed15cce-e6b3-4500-b830-6c13c2a23ca9_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!qYw3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed15cce-e6b3-4500-b830-6c13c2a23ca9_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!qYw3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8ed15cce-e6b3-4500-b830-6c13c2a23ca9_1536x1024.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h2>Why This Is Suddenly Possible</h2><p>Here&#8217;s why this lands harder than the last scary headline you scrolled past.</p><p>Think of the best safecracker who ever lived. The reason your front door holds isn&#8217;t that the lock is flawless. It&#8217;s that people that good are rare, expensive, and can only stand in front of one door at a time.</p><p>Mythos erases all three. It has the skill, it never sleeps, and Anthropic can run a thousand copies of it at once. 
One test turned a thousand open-source projects inside out for less than the price of a used car.</p><p>The locks didn&#8217;t get weaker. The world just got a million more safecrackers, and not one of them needs to eat, sleep, or get paid.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zEnU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ea908d-fe91-4dff-9232-84f1f6b5b0fe_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zEnU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ea908d-fe91-4dff-9232-84f1f6b5b0fe_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!zEnU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ea908d-fe91-4dff-9232-84f1f6b5b0fe_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!zEnU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ea908d-fe91-4dff-9232-84f1f6b5b0fe_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!zEnU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ea908d-fe91-4dff-9232-84f1f6b5b0fe_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zEnU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ea908d-fe91-4dff-9232-84f1f6b5b0fe_1536x1024.png" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a7ea908d-fe91-4dff-9232-84f1f6b5b0fe_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2907565,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/200362817?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ea908d-fe91-4dff-9232-84f1f6b5b0fe_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zEnU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ea908d-fe91-4dff-9232-84f1f6b5b0fe_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!zEnU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ea908d-fe91-4dff-9232-84f1f6b5b0fe_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!zEnU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ea908d-fe91-4dff-9232-84f1f6b5b0fe_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!zEnU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7ea908d-fe91-4dff-9232-84f1f6b5b0fe_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><h2>So Anthropic Locked It Away</h2><p>Now the strange part. Anthropic built the most capable hacker on earth, looked at it, and refused to sell it.</p><p>You can&#8217;t download Mythos. You can&#8217;t pay for it. Instead Anthropic slipped it to twelve companies behind a locked door, a program it calls Project Glasswing, and the guest list is exactly who you&#8217;d guess: Apple, Google, Microsoft, Amazon, NVIDIA, JPMorgan Chase, the Linux Foundation, and a few more. It even handed them 100 million dollars in credits so they&#8217;d actually put it to work.</p><p>The plan is to let the people guarding the world&#8217;s important software find the holes first, before anyone with worse intentions gets a tool this sharp.</p><p>Because a tool this sharp is coming either way. Anthropic has already said so. 
A future model will carry the same power wrapped in new safety controls, and that one won&#8217;t stay behind a locked door.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Hy1Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeae436c-4938-4feb-9539-fdb5eaa57589_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Hy1Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeae436c-4938-4feb-9539-fdb5eaa57589_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Hy1Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeae436c-4938-4feb-9539-fdb5eaa57589_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Hy1Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeae436c-4938-4feb-9539-fdb5eaa57589_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Hy1Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeae436c-4938-4feb-9539-fdb5eaa57589_1536x1024.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Hy1Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeae436c-4938-4feb-9539-fdb5eaa57589_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/feae436c-4938-4feb-9539-fdb5eaa57589_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1921761,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/200362817?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeae436c-4938-4feb-9539-fdb5eaa57589_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Hy1Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeae436c-4938-4feb-9539-fdb5eaa57589_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Hy1Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeae436c-4938-4feb-9539-fdb5eaa57589_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Hy1Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeae436c-4938-4feb-9539-fdb5eaa57589_1536x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Hy1Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffeae436c-4938-4feb-9539-fdb5eaa57589_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><h2>Which Means It&#8217;s Coming For You</h2><p>So picture your own stack for a second.</p><p>A login service. A payment integration. A few open-source libraries you last updated, and you don&#8217;t actually remember when. 
Today, finding the weak spot in there takes a skilled attacker real time and real motivation, and most days nobody bothers.</p><p>Soon it takes a cheap model an afternoon, and the model never gets bored.</p><p>That forgotten library is an unlocked door. The line of things that can walk through it just went from a few rare experts to anything with an API key.</p><p>The fix isn&#8217;t clever. It&#8217;s the boring stuff you already know and keep putting off. Patch the moment a fix ships, because the gap between a released patch and a working exploit is shrinking toward nothing. Give every service the least access it can survive on, so one open door doesn&#8217;t hand over the whole house. Watch your logs, so when something starts rattling the handles, you notice.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K0zF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c4dd50c-f573-46b2-95e1-860dd0fa1145_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!K0zF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c4dd50c-f573-46b2-95e1-860dd0fa1145_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!K0zF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c4dd50c-f573-46b2-95e1-860dd0fa1145_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!K0zF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c4dd50c-f573-46b2-95e1-860dd0fa1145_1536x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!K0zF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c4dd50c-f573-46b2-95e1-860dd0fa1145_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K0zF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c4dd50c-f573-46b2-95e1-860dd0fa1145_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c4dd50c-f573-46b2-95e1-860dd0fa1145_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2347229,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/200362817?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c4dd50c-f573-46b2-95e1-860dd0fa1145_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K0zF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c4dd50c-f573-46b2-95e1-860dd0fa1145_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!K0zF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c4dd50c-f573-46b2-95e1-860dd0fa1145_1536x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!K0zF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c4dd50c-f573-46b2-95e1-860dd0fa1145_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!K0zF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c4dd50c-f573-46b2-95e1-860dd0fa1145_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><h2>The Part That Should Calm You Down</h2><p>One thing keeps this from being a horror story, at least for now.</p><p>When the 
UK&#8217;s AI Security Institute tested Mythos, it tore through small, lightly guarded systems. But it was fighting empty rooms. No defenders, no alarms, nobody fighting back. A real network with a watchful team and decent tooling is a far harder thing to crack. And the sharpest version isn&#8217;t loose in the wild at all. It&#8217;s locked in that room with the twelve giants, pointed at defense.</p><p>So this isn&#8217;t the night the internet falls. It&#8217;s the night the balance tips. For a while, attackers hold the faster tool, and defenders hold a head start measured in months.</p><p>The best safecracker who ever lived turned out to be software, and it doesn&#8217;t sleep.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZTTa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad31f2a4-73b5-4968-b646-54caad545faf_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZTTa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad31f2a4-73b5-4968-b646-54caad545faf_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!ZTTa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad31f2a4-73b5-4968-b646-54caad545faf_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!ZTTa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad31f2a4-73b5-4968-b646-54caad545faf_1536x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ZTTa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad31f2a4-73b5-4968-b646-54caad545faf_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZTTa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad31f2a4-73b5-4968-b646-54caad545faf_1536x1024.png" width="1456" height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ad31f2a4-73b5-4968-b646-54caad545faf_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2145726,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/200362817?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad31f2a4-73b5-4968-b646-54caad545faf_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZTTa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad31f2a4-73b5-4968-b646-54caad545faf_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!ZTTa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad31f2a4-73b5-4968-b646-54caad545faf_1536x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!ZTTa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad31f2a4-73b5-4968-b646-54caad545faf_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!ZTTa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fad31f2a4-73b5-4968-b646-54caad545faf_1536x1024.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p> </p><p>Anthropic locked the first one in a vault. The rest of us got a warning instead of a key.</p><p>Spend it. 
Go patch the thing you&#8217;ve been putting off.</p><div><hr></div><p></p>]]></content:encoded></item><item><title><![CDATA[Claude Code vs Codex CLI]]></title><description><![CDATA[Hello everyone,]]></description><link>https://newsletter.diamant-ai.com/p/claude-code-vs-codex-cli</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/claude-code-vs-codex-cli</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Thu, 21 May 2026 15:32:23 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!MWXI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc15b4dd-7bdd-4d21-b4e5-86633b4add36_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a 
class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MWXI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc15b4dd-7bdd-4d21-b4e5-86633b4add36_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MWXI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc15b4dd-7bdd-4d21-b4e5-86633b4add36_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!MWXI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc15b4dd-7bdd-4d21-b4e5-86633b4add36_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!MWXI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc15b4dd-7bdd-4d21-b4e5-86633b4add36_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!MWXI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc15b4dd-7bdd-4d21-b4e5-86633b4add36_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MWXI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc15b4dd-7bdd-4d21-b4e5-86633b4add36_1408x768.png" width="1408" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dc15b4dd-7bdd-4d21-b4e5-86633b4add36_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!MWXI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc15b4dd-7bdd-4d21-b4e5-86633b4add36_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!MWXI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc15b4dd-7bdd-4d21-b4e5-86633b4add36_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!MWXI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc15b4dd-7bdd-4d21-b4e5-86633b4add36_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!MWXI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc15b4dd-7bdd-4d21-b4e5-86633b4add36_1408x768.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>Hello everyone,</p><p>Before we begin, I want to share some exciting news: I&#8217;ve launched a pilot collaboration designed to <strong>help you find your next role.</strong></p><p>You can browse the open positions on my website: <a href="https://www.diamant-ai.com/jobs">https://www.diamant-ai.com/jobs</a></p><p>The idea is simple. I&#8217;m partnering with great companies to help them connect with great candidates from my community. 
Here&#8217;s how it works:</p><ol><li><p>Review the open roles on the jobs page</p></li><li><p>Choose one that fits your background</p></li><li><p>Upload a 3-minute video introducing yourself and walking through a relevant project you&#8217;ve worked on</p></li></ol><p>We&#8217;ll take care of the rest.</p><p>Looking forward to seeing your submissions!</p><div><hr></div><p>Most developers spend a week comparing Claude Code and Codex CLI, pick whichever their team already trusts, and never open the other. The engineers who ship the most do something stranger. They wire the two tools together so one reviews the other automatically. That single decision changes how much they spend, which bugs reach production, and how late they stay at the keyboard.</p><p>The comparison framing is the trap. These tools fail in opposite ways. They cost different amounts in real-world use, not the amounts the marketing pages suggest. And the open-source pattern that combines them has quietly become the standard for serious work in 2026.</p><p>By the end of this post you will know the real cost gap (closer to ten times than four times once you run real refactors), the failure mode each tool reliably ships, and the MCP setup that turns them into a planner-and-reviewer team.</p><h2><strong>The Cost Gap </strong></h2><p>The &#8220;Codex CLI uses four times fewer tokens&#8221; claim is widely repeated. It also understates the bill. Run an Express.js backend refactor through both tools and pay in API tokens. Claude Code lands around one hundred and fifty-five dollars. Codex CLI finishes the same refactor for fifteen. 
Ten times the bill, not four.</p><p>The compounding factor is verbosity. Claude Code narrates. It explains every step before it acts. Output tokens are the most expensive category, and Opus 4.7 charges five times the output rate of GPT-5.4. Verbose reasoning is a feature in the marketing. It is a meter in your invoice.</p><p>Subscription tiers do not bail you out. Plan on losing about a fifth of your week to waiting on Claude Code rate limits. Heavy users exhaust the twenty-dollar Pro plan within five complex prompts and migrate to the Max tier at one hundred or two hundred dollars just to keep moving. A loud contingent of those same engineers cancel Max and switch to Codex CLI on the twenty-dollar Plus plan instead.</p><p>OpenAI has its own trap. In April 2026 they quietly moved agentic Codex workflows from flat subscription billing into API metering. Engineers are getting bills in the thousands for sessions they thought were included. Pick your tool based on which billing surprise you are willing to absorb.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3DPc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583303f8-395a-4470-b428-d58e85105fe8_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3DPc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583303f8-395a-4470-b428-d58e85105fe8_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!3DPc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583303f8-395a-4470-b428-d58e85105fe8_1408x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!3DPc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583303f8-395a-4470-b428-d58e85105fe8_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!3DPc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583303f8-395a-4470-b428-d58e85105fe8_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3DPc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583303f8-395a-4470-b428-d58e85105fe8_1408x768.png" width="1408" height="768" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/583303f8-395a-4470-b428-d58e85105fe8_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!3DPc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583303f8-395a-4470-b428-d58e85105fe8_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!3DPc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583303f8-395a-4470-b428-d58e85105fe8_1408x768.png 848w, 
https://substackcdn.com/image/fetch/$s_!3DPc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583303f8-395a-4470-b428-d58e85105fe8_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!3DPc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F583303f8-395a-4470-b428-d58e85105fe8_1408x768.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2><strong>How Each Tool Catastrophically Fails</strong></h2><p>Claude Code&#8217;s signature failure is context drift. 
By hour three of a continuous session, the agent stops referencing the codebase and starts referencing what it said an hour earlier. Think of a tired surgeon working from memory of the patient she saw at the start of her shift. Push past three hours and you reliably watch tasks get abandoned mid-stream.</p><p>Multi-file refactors are where this breaks worst. Claude Code edits the primary file cleanly, then loses the dependency chain. You spend an hour stitching exports, imports, and downstream consumers back together. Anthropic&#8217;s marketing says the tool plans across files. The reality is one file at a time, fresh session each task, or you pay for it.</p><p>Claude Code also has a quiet bug factory in test generation. The tests it writes pass with a green check but assert the wrong behavior. It will preemptively mock browser APIs just to keep a test from crashing, silently bypassing the logic the test was supposed to verify. If you trust the green check, that habit ships bugs.</p><p>Codex CLI fails differently. Its output is &#8220;almost correct&#8221; code. It compiles. It passes the existing tests. It contains an integration bug that fires only under production load. A Codex /goal session can run twenty-five hours unattended, burn thirteen million tokens, and ship thirty thousand lines of code no human reads closely. Impressive endurance. Also a perfect way to merge a subtle disaster.</p><p>Codex CLI hangs silently in CI too. Skip the codex-yolo alias or the approval policy override and the agent will sit indefinitely at an approval prompt, burning your runtime budget while waiting for a keystroke that cannot arrive.</p><p>The failure modes map cleanly to task assignments. Use Claude Code where wrong code costs the most: payment paths, frontends your users see, anything touching production money. 
Use Codex CLI where you can verify cheaply before merging: DevOps scripts, batch test generation, scaffolding.</p><h2><strong>The MCP Bridge</strong></h2><p>The pattern that became standard is not running two terminals. It is wiring Codex CLI as a Model Context Protocol server inside Claude Code so the two agents review each other&#8217;s work without you swivel-chairing between them.</p><p>OpenAI&#8217;s official plugin openai/codex-plugin-cc automates this. One plugin install inside Claude Code wires Codex into your session. After that, the workflow is mechanical. Claude Opus uses its Plan agent to research the codebase and propose a structured plan. Codex audits the plan for correctness and security before a single line of code is written. Claude Sonnet implements the agreed plan to keep cost low. Codex reviews the resulting git diff and returns one of three structured verdicts: APPROVED, WARNING, or BLOCKED. A BLOCKED verdict triggers up to three automatic repair cycles. No human in the middle.</p><p>Why this matters: an AI agent cannot review its own work. Claude in particular is stubborn and sycophantic about its own outputs. Ask it to review what it just wrote and it confidently affirms its own mistakes. The fix is mechanical, not philosophical. Route the review to a different model family. The MCP bridge makes that routing automatic instead of manual.</p><p>If you do not want to install a plugin, the lighter setup is two terminal tabs and one shared instructions file. Codex CLI reads AGENTS.md natively. Claude Code does not, but its CLAUDE.md supports an @AGENTS.md import line that pulls the contents in at session start. One source of truth. Two tools. Five seconds to configure. 
Six months saved on drift.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.diamant-ai.com/subscribe?"><span>Subscribe now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OAAD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8e94b6-16bf-4c8f-a8d1-7a65f9fce231_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OAAD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8e94b6-16bf-4c8f-a8d1-7a65f9fce231_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!OAAD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8e94b6-16bf-4c8f-a8d1-7a65f9fce231_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!OAAD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8e94b6-16bf-4c8f-a8d1-7a65f9fce231_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!OAAD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8e94b6-16bf-4c8f-a8d1-7a65f9fce231_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OAAD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8e94b6-16bf-4c8f-a8d1-7a65f9fce231_1408x768.png" width="1408" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2d8e94b6-16bf-4c8f-a8d1-7a65f9fce231_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!OAAD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8e94b6-16bf-4c8f-a8d1-7a65f9fce231_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!OAAD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8e94b6-16bf-4c8f-a8d1-7a65f9fce231_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!OAAD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8e94b6-16bf-4c8f-a8d1-7a65f9fce231_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!OAAD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2d8e94b6-16bf-4c8f-a8d1-7a65f9fce231_1408x768.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h2><strong>When To Pick Just One</strong></h2><p>You will not always have both tools open. Default to Claude Code for frontend work, multi-file refactors when you can supervise, complex features where architectural integrity matters, and anything React. Side by side, blind reviewers prefer Claude Code&#8217;s code about two thirds of the time. The quality gap is real even when SWE-bench scores look close.</p><p>Default to Codex CLI for autonomous batch work, DevOps scripts, scaffolding, anything you can verify with a fast test suite, and anything that lives in the shell. Codex CLI&#8217;s twelve-point lead on Terminal-Bench 2.0 maps to real reliability differences in scripting and system administration.</p><p>If you work in regulated code, Codex CLI&#8217;s sandbox is the easier compliance story. Seatbelt on macOS. Landlock and bwrap on Linux. Network off by default. Anthropic recently had to patch a sandbox escape in Claude Code&#8217;s application-layer enforcement. 
Kernel boundaries beat hooks when the threat model is real.</p><p>The anti-pattern is letting whichever assistant happens to be open handle the next task. That is how you end up paying premium Claude tokens to scaffold boilerplate. It is also how Codex CLI quietly rewrites your billing logic because you forgot to switch tools when the ticket type changed.</p><h2><strong>What Neither Tool Will Save You From</strong></h2><p>Both tools follow instructions. Neither decides whether the instruction is right. If your AGENTS.md says retry every network call and one endpoint must not retry, both tools ship the bug.</p><p>Both tools assume your tests are real. Claude Code&#8217;s habit of writing passing tests that assert wrong behavior makes a weak test suite worse, not better. Codex CLI&#8217;s overnight runs produce value only if the test that stops the loop is one you actually trust.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading <strong>&#128142;DiamantAI!</strong> I share cutting-edge AI insights, tutorials, and breakthroughs. Subscribe for free to get new posts delivered straight to your inbox, and as a bonus, you&#8217;ll receive a <strong>33% discount coupon</strong> for both of my digital books, <em><a href="https://nirdiamant.gumroad.com/l/rag-made-simple">RAG Made Simple</a></em> and <em><a href="https://nirdiamant.gumroad.com/l/mtxrfk">Prompt Engineering: From Zero to Hero</a></em>. 
Enjoy!</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[30 FREE Tutorials to Build AI Agents With Real Memory Fast!]]></title><description><![CDATA[This is a must-open newsletter for anyone building AI agents that need to remember]]></description><link>https://newsletter.diamant-ai.com/p/30-free-tutorials-to-build-ai-agents</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/30-free-tutorials-to-build-ai-agents</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Wed, 06 May 2026 12:03:21 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!P3SL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0caf051-609e-40d5-ac19-52d65ecba5a9_1746x1436.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey dear readers and community,</p><p>I&#8217;m thrilled to share this blog post, and I&#8217;m choosing my words carefully so you can truly understand the value of the content I&#8217;m sharing today!</p><p>This is something I&#8217;ve spent the last few weeks pulling together: a hands-on guide to every major agent memory technique, with a runnable notebook for each one.</p><p>It isn&#8217;t tied to a single library or framework. It walks through the full landscape, from the simplest conversation buffer to production-grade tiered systems, so you can compare patterns side by side and pick the right one for what you&#8217;re building.</p><p>This comes together in a new open-source GitHub repository called Agent Memory Techniques. 
It currently contains 30 individual tutorials, organized into 11 essential categories:</p><p>&#128640; <strong>Explore the repository on GitHub &#8211; Agent Memory Techniques:</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://github.com/NirDiamant/Agent_Memory_Techniques&quot;,&quot;text&quot;:&quot;Explore the repository on Github&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://github.com/NirDiamant/Agent_Memory_Techniques"><span>Explore the repository on Github</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!P3SL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0caf051-609e-40d5-ac19-52d65ecba5a9_1746x1436.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!P3SL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0caf051-609e-40d5-ac19-52d65ecba5a9_1746x1436.png 424w, https://substackcdn.com/image/fetch/$s_!P3SL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0caf051-609e-40d5-ac19-52d65ecba5a9_1746x1436.png 848w, https://substackcdn.com/image/fetch/$s_!P3SL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0caf051-609e-40d5-ac19-52d65ecba5a9_1746x1436.png 1272w, https://substackcdn.com/image/fetch/$s_!P3SL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0caf051-609e-40d5-ac19-52d65ecba5a9_1746x1436.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!P3SL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0caf051-609e-40d5-ac19-52d65ecba5a9_1746x1436.png" width="1456" height="1197" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f0caf051-609e-40d5-ac19-52d65ecba5a9_1746x1436.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1197,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:560227,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/196591843?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0caf051-609e-40d5-ac19-52d65ecba5a9_1746x1436.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!P3SL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0caf051-609e-40d5-ac19-52d65ecba5a9_1746x1436.png 424w, https://substackcdn.com/image/fetch/$s_!P3SL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0caf051-609e-40d5-ac19-52d65ecba5a9_1746x1436.png 848w, https://substackcdn.com/image/fetch/$s_!P3SL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0caf051-609e-40d5-ac19-52d65ecba5a9_1746x1436.png 1272w, https://substackcdn.com/image/fetch/$s_!P3SL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff0caf051-609e-40d5-ac19-52d65ecba5a9_1746x1436.png 
1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2>Short-Term Memory</h2><p>Manage what your agent remembers inside a single conversation. Master conversation buffers, sliding windows, summary memory, and token budgets so your agent stays coherent without blowing up the context window.</p><h2>Long-Term Memory</h2><p>Save knowledge that survives across sessions, users, and time. 
Learn the storage patterns that turn a one-shot chatbot into an agent that builds on past interactions instead of starting from zero every time.</p><h2>Vector Stores &amp; Embeddings</h2><p>Turn past messages and documents into vectors and search them by meaning instead of keywords. Build a retrieval layer that finds the right memory at the right moment, even when the user phrases the question in a totally new way.</p><h2>Knowledge Graphs</h2><p>Build a graph of how entities, people, and projects connect. Walk the graph to reason over what your agent has learned and surface insights a flat memory store would miss.</p><h2>Episodic &amp; Semantic Memory</h2><p>Borrow two of the brain&#8217;s most powerful patterns. Store complete interactions with when-and-where context, then distill general facts on top so your agent can recall both what happened and what it learned from it.</p><h2>Cognitive Architectures</h2><p>Build human-inspired memory systems with working memory, hierarchical layers, consolidation, and self-reflection. Your agent learns to prioritize, forget, and rewrite its own memory the way a person does.</p><h2>Memory Retrieval &amp; Routing</h2><p>Pick the right memory at the right moment. Compare semantic search, recency, hybrid scoring, diversity, and re-ranking, then route reads and writes by content type and intent.</p><h2>Cross-Session &amp; Multi-Agent Memory</h2><p>Save and reload state across sessions so users pick up where they left off. Then share memory across multi-agent teams with namespaces and conflict resolution baked in.</p><h2>Memory Frameworks (Mem0, Letta, Zep, Graphiti)</h2><p>Get hands-on with the leading production memory libraries. Learn when managed services like Mem0 and Zep win, when self-editing memory like Letta/MemGPT shines, and when Graphiti&#8217;s time-aware graphs are the right tool.</p><h2>Memory Evaluation &amp; Benchmarks</h2><p>Measure your memory the way researchers do. 
Run against LoCoMo and LongMemEval, check retrieval precision and recall, catch staleness and contradictions, and prove your system actually works.</p><h2>Production Memory Patterns</h2><p>Ship memory at real scale. Learn caching, TTLs, sharding, backups, GDPR right-to-forget, and observability so your agent&#8217;s memory survives the messy reality of production traffic.</p><div><hr></div><p>I truly believe this is already the best educational resource available for agent memory, and I&#8217;ll make sure it keeps improving and stays up to date over time.</p><p>If you find value in it, please make sure to star the repo and bookmark it for easy access.</p><p>I hope you enjoy it and that it helps you build agents that actually remember what matters!</p><p>Yours, Nir</p>]]></content:encoded></item><item><title><![CDATA[Vibe Training - Auto Train a Small Language Model for Your Use Case]]></title><description><![CDATA[A user finds a weird phrasing.]]></description><link>https://newsletter.diamant-ai.com/p/vibe-training-auto-train-a-small</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/vibe-training-auto-train-a-small</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Tue, 28 Apr 2026 14:03:18 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!h5qv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5059417-0657-46c0-b8fd-c124197b880d_2816x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!leEV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3ecc819-95b8-4f82-bb7b-92100b12f320_2752x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!leEV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3ecc819-95b8-4f82-bb7b-92100b12f320_2752x1536.png 424w, https://substackcdn.com/image/fetch/$s_!leEV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3ecc819-95b8-4f82-bb7b-92100b12f320_2752x1536.png 848w, 
https://substackcdn.com/image/fetch/$s_!leEV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3ecc819-95b8-4f82-bb7b-92100b12f320_2752x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!leEV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3ecc819-95b8-4f82-bb7b-92100b12f320_2752x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!leEV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3ecc819-95b8-4f82-bb7b-92100b12f320_2752x1536.png" width="1456" height="813" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e3ecc819-95b8-4f82-bb7b-92100b12f320_2752x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:813,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:8636767,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/195259260?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3ecc819-95b8-4f82-bb7b-92100b12f320_2752x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!leEV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3ecc819-95b8-4f82-bb7b-92100b12f320_2752x1536.png 424w, 
https://substackcdn.com/image/fetch/$s_!leEV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3ecc819-95b8-4f82-bb7b-92100b12f320_2752x1536.png 848w, https://substackcdn.com/image/fetch/$s_!leEV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3ecc819-95b8-4f82-bb7b-92100b12f320_2752x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!leEV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe3ecc819-95b8-4f82-bb7b-92100b12f320_2752x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p>A user finds a weird phrasing. The guardrail misses it. Your customer service bot repeats the same unhelpful answer five times because nobody taught it what &#8220;repetition&#8221; means <em>in your product</em>. Your healthcare chatbot crosses a line it did not know existed. You patch it with a prompt. It works Tuesday. By Thursday, it does not.</p><p>This cycle has a name: <strong>duct tape safety</strong>. And almost every AI product in production is running on it.</p><p>Right now, the default solution is to pass every user interaction through a frontier model with your policy written in the prompt. &#8220;Here is our privacy rule, does this message violate it?&#8221; GPT models do their best. Sometimes that best is great. But you never fully know which one you are getting, and at scale, inconsistency is a liability. This approach also costs real money. Every user message goes through an expensive model. Every call adds latency. And the uncomfortable truth is that a general-purpose model asked to enforce your specific rules is a brilliant generalist pretending to be a specialist. It can fake it. 
But it cannot <em>be</em> it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RYoW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aab802-abea-4044-a2b9-ae886388f5a8_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RYoW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aab802-abea-4044-a2b9-ae886388f5a8_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!RYoW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aab802-abea-4044-a2b9-ae886388f5a8_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!RYoW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aab802-abea-4044-a2b9-ae886388f5a8_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!RYoW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aab802-abea-4044-a2b9-ae886388f5a8_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RYoW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aab802-abea-4044-a2b9-ae886388f5a8_2816x1536.png" width="1456" height="794" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d7aab802-abea-4044-a2b9-ae886388f5a8_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:7479762,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/195259260?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aab802-abea-4044-a2b9-ae886388f5a8_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RYoW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aab802-abea-4044-a2b9-ae886388f5a8_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!RYoW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aab802-abea-4044-a2b9-ae886388f5a8_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!RYoW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aab802-abea-4044-a2b9-ae886388f5a8_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!RYoW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7aab802-abea-4044-a2b9-ae886388f5a8_2816x1536.png 1456w" sizes="100vw"></picture></div></a></figure></div><p>What you actually want is a small, dedicated model that has deeply internalized your exact rule. One that has seen thousands of variations, including the sneaky gray-area cases, and has learned precisely where the line is. Fine-tuned classifiers like this exist and they are dramatically better: smaller in size, lower in inference cost, and far more consistent. The problem has always been getting the training data. Labeled examples cost time and money. Most teams never get there. So they stay on duct tape.</p><p>A research team at Plurai just published a framework called <strong>BARRED</strong> that removes that bottleneck entirely. Give it a description of your policy and a handful of unlabeled examples. It builds the training data itself, verifies every label through structured debate, and hands you a deployable classifier. 
The results are hard to argue with: a 3-billion-parameter model trained with BARRED consistently beat both GPT-4.1 and purpose-built safety models with significantly more parameters on custom policy tasks. Here is how it actually works.</p><div><hr></div><h2>The Two Ways Synthetic Data Breaks</h2><p>The obvious response to the data problem is: just generate it. Ask an LLM to produce thousands of labeled examples of your policy in action. Simple enough, right?</p><p>Except it breaks in two specific ways, and both are devastating.</p><p>The first is <strong>collapse</strong>. When you ask a language model to generate examples of a policy violation, it gravitates toward the obvious ones. Imagine asking for &#8220;examples of health advice&#8221; and getting fifty variations of the same textbook sentence. The examples cluster around the most clear-cut case. But your classifier does not struggle with clear-cut cases. It struggles with the edges, the situations where a reasonable person might pause and think. 
If your training data never includes those cases, your model becomes confidently wrong exactly where it matters most.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!zgNT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681bea98-9204-48fe-b41d-460fc4484bcb_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!zgNT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681bea98-9204-48fe-b41d-460fc4484bcb_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!zgNT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681bea98-9204-48fe-b41d-460fc4484bcb_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!zgNT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681bea98-9204-48fe-b41d-460fc4484bcb_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!zgNT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681bea98-9204-48fe-b41d-460fc4484bcb_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!zgNT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681bea98-9204-48fe-b41d-460fc4484bcb_2816x1536.png" width="1456" height="794" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/681bea98-9204-48fe-b41d-460fc4484bcb_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6174243,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/195259260?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681bea98-9204-48fe-b41d-460fc4484bcb_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!zgNT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681bea98-9204-48fe-b41d-460fc4484bcb_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!zgNT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681bea98-9204-48fe-b41d-460fc4484bcb_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!zgNT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681bea98-9204-48fe-b41d-460fc4484bcb_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!zgNT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F681bea98-9204-48fe-b41d-460fc4484bcb_2816x1536.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The second is <strong>noise</strong>. The same model generating your examples is also labeling them. And language models are not perfectly consistent. They rationalize. They hallucinate. An example that should be labeled &#8220;violation&#8221; sometimes gets labeled &#8220;compliant&#8221; because the generator happened to focus on the wrong sentence. Train on mislabeled data and your model learns the wrong lessons with complete confidence. The fine-tuning makes it worse, not better.</p><p>BARRED was designed specifically to solve both of these, and it does it with two ideas that are independently interesting and together surprisingly powerful.</p><div><hr></div><h2>Step One: Map the Territory Before You Generate Anything</h2><p>The first thing BARRED does is unusual. 
Instead of jumping straight to generating examples, it first asks: what are all the <em>dimensions</em> along which this policy can play out?</p><p>Take a privacy rule: &#8220;never share the GPS coordinates of individual employees.&#8221; What are the ways this can unfold in a real conversation? The coordinates could be shared explicitly. They could be implied through a nearby landmark. The question might be about a service location, not a person. The response might reference historical location data. The user might be internal staff with a seemingly legitimate reason to ask.</p><p>BARRED identifies these dimensions automatically from your task description and seed examples. It then samples across them systematically, which forces the generated training data to cover the full landscape of your policy, not just the comfortable middle of it. Coverage of the test set increases significantly as more dimensions are added, and model accuracy follows the same curve. Diverse dimensions produce diverse data. 
Diverse data produces a model that actually generalizes.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8d11!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0910fa-5561-4ede-a3fe-95dde00e65c9_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8d11!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0910fa-5561-4ede-a3fe-95dde00e65c9_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!8d11!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0910fa-5561-4ede-a3fe-95dde00e65c9_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!8d11!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0910fa-5561-4ede-a3fe-95dde00e65c9_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!8d11!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0910fa-5561-4ede-a3fe-95dde00e65c9_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8d11!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0910fa-5561-4ede-a3fe-95dde00e65c9_2816x1536.png" width="1456" height="794" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1f0910fa-5561-4ede-a3fe-95dde00e65c9_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6463874,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/195259260?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0910fa-5561-4ede-a3fe-95dde00e65c9_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8d11!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0910fa-5561-4ede-a3fe-95dde00e65c9_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!8d11!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0910fa-5561-4ede-a3fe-95dde00e65c9_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!8d11!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0910fa-5561-4ede-a3fe-95dde00e65c9_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!8d11!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1f0910fa-5561-4ede-a3fe-95dde00e65c9_2816x1536.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>This directly solves the collapse problem. Instead of a pile of similar examples all pointing at the same obvious case, you get a training set that looks like the real world, full of variation, context, and nuance.</p><div><hr></div><h2>Step Two: The Courtroom That Verifies Every Label</h2><p>Solving collapse is only half the problem. You still need the labels to be correct. This is where BARRED does something genuinely clever.</p><p>After generating a candidate training example, it does not trust the label the generator assigned. Instead, it runs a structured multi-agent debate. Think of it as a small courtroom that convenes for every single example before it is allowed into the training set.</p><p>One agent is the <strong>Advocate</strong>. 
It receives the example and the proposed label, and its job is to argue for that label as forcefully as possible. It does not update. It does not doubt itself. It simply builds the strongest possible case for why the label is correct.</p><p>A panel of <strong>Judge agents</strong> then independently evaluates the example and the Advocate&#8217;s arguments, deliberating over multiple rounds and updating their assessments as they go. The example is only accepted into the training set when every Judge agrees with the Advocate&#8217;s label. Full consensus, or it does not get through.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!dUUW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f706aef-6a8c-41b1-aadf-deb79669f8c6_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!dUUW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f706aef-6a8c-41b1-aadf-deb79669f8c6_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!dUUW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f706aef-6a8c-41b1-aadf-deb79669f8c6_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!dUUW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f706aef-6a8c-41b1-aadf-deb79669f8c6_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!dUUW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f706aef-6a8c-41b1-aadf-deb79669f8c6_2816x1536.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!dUUW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f706aef-6a8c-41b1-aadf-deb79669f8c6_2816x1536.png" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0f706aef-6a8c-41b1-aadf-deb79669f8c6_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5244670,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/195259260?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f706aef-6a8c-41b1-aadf-deb79669f8c6_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!dUUW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f706aef-6a8c-41b1-aadf-deb79669f8c6_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!dUUW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f706aef-6a8c-41b1-aadf-deb79669f8c6_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!dUUW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f706aef-6a8c-41b1-aadf-deb79669f8c6_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!dUUW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0f706aef-6a8c-41b1-aadf-deb79669f8c6_2816x1536.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>When the Judges are unconvinced, they explain exactly why in structured feedback. &#8220;The text never actually names an individual.&#8221; &#8220;The location mentioned is a public service address, not a personal one.&#8221; That feedback goes back to the generator, which produces a refined version of the example. The refined version enters the courtroom again. The process repeats until the example passes or gets discarded after too many failed attempts.</p><p>What makes this design smart is the asymmetry. The Advocate never changes its mind. The Judges do. 
This means every example has to survive genuine adversarial pressure, not a polite internal review. If the reasoning behind a label cannot convince a skeptical panel, the example probably contains an inconsistency and does not belong in the training data.</p><p>The researchers tested what happens when you remove this step. Accuracy dropped 27% when they used raw generated samples with no verification. Even more telling: when they replaced multi-agent debate with single-agent self-review, where the same model that generated the example also critiques it, performance was even worse than no verification at all. Without an opposing voice, the model just confirms what it already believed. It is not review. It is rationalization. Real disagreement is the whole mechanism.</p><p>Analysis of over 1,350 debates in the plan verification task alone showed that more than 30% of cases involved non-trivial dynamics: Judges starting in disagreement and converging through argument, or initial consensus breaking down after the Advocate&#8217;s reasoning was scrutinized. The debate was not a rubber stamp. 
It was doing real work.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!khlx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e097c4d-5cdb-4399-b41a-81539ca3e338_1800x919.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!khlx!,w_424,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e097c4d-5cdb-4399-b41a-81539ca3e338_1800x919.gif 424w, https://substackcdn.com/image/fetch/$s_!khlx!,w_848,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e097c4d-5cdb-4399-b41a-81539ca3e338_1800x919.gif 848w, https://substackcdn.com/image/fetch/$s_!khlx!,w_1272,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e097c4d-5cdb-4399-b41a-81539ca3e338_1800x919.gif 1272w, https://substackcdn.com/image/fetch/$s_!khlx!,w_1456,c_limit,f_webp,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e097c4d-5cdb-4399-b41a-81539ca3e338_1800x919.gif 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!khlx!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e097c4d-5cdb-4399-b41a-81539ca3e338_1800x919.gif" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e097c4d-5cdb-4399-b41a-81539ca3e338_1800x919.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:null,&quot;width&quot;:null,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!khlx!,w_424,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e097c4d-5cdb-4399-b41a-81539ca3e338_1800x919.gif 424w, https://substackcdn.com/image/fetch/$s_!khlx!,w_848,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e097c4d-5cdb-4399-b41a-81539ca3e338_1800x919.gif 848w, https://substackcdn.com/image/fetch/$s_!khlx!,w_1272,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e097c4d-5cdb-4399-b41a-81539ca3e338_1800x919.gif 1272w, https://substackcdn.com/image/fetch/$s_!khlx!,w_1456,c_limit,f_auto,q_auto:good,fl_lossy/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e097c4d-5cdb-4399-b41a-81539ca3e338_1800x919.gif 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><div><hr></div><h2>What Comes Out the Other Side</h2><p>The fine-tuned models BARRED produces are small and genuinely surprising in how well they perform.</p><p>Tested across four distinct domains, including customer service dialogue compliance, AI agent plan verification, and healthcare regulatory classification, a 3-billion-parameter model trained on BARRED&#8217;s synthetic data consistently beat GPT-4.1. 
It also outperformed dedicated safety models with significantly more parameters on every benchmark.</p><p>Accuracy on simpler rules, like detecting when a user repeats themselves three times, saturates at smaller model sizes. You do not need a big model for a clear-cut rule. Complex rules, like nuanced privacy violations, benefit from more capacity. This means you can size your guardrail to the actual complexity of what it enforces. Nothing wasted.</p><p>And on a practical level, the difference between running a 3-billion-parameter classifier and running GPT-4.1 on every single user interaction is enormous. In cost. In latency. In what you can actually afford to do at scale. The accuracy win is great. The efficiency win is what makes this real.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!h5qv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5059417-0657-46c0-b8fd-c124197b880d_2816x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!h5qv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5059417-0657-46c0-b8fd-c124197b880d_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!h5qv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5059417-0657-46c0-b8fd-c124197b880d_2816x1536.png 848w, https://substackcdn.com/image/fetch/$s_!h5qv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5059417-0657-46c0-b8fd-c124197b880d_2816x1536.png 1272w, 
https://substackcdn.com/image/fetch/$s_!h5qv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5059417-0657-46c0-b8fd-c124197b880d_2816x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!h5qv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5059417-0657-46c0-b8fd-c124197b880d_2816x1536.png" width="1456" height="794" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d5059417-0657-46c0-b8fd-c124197b880d_2816x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:794,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:5530737,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/195259260?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5059417-0657-46c0-b8fd-c124197b880d_2816x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!h5qv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5059417-0657-46c0-b8fd-c124197b880d_2816x1536.png 424w, https://substackcdn.com/image/fetch/$s_!h5qv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5059417-0657-46c0-b8fd-c124197b880d_2816x1536.png 848w, 
https://substackcdn.com/image/fetch/$s_!h5qv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5059417-0657-46c0-b8fd-c124197b880d_2816x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!h5qv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd5059417-0657-46c0-b8fd-c124197b880d_2816x1536.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div><hr></div><h2>You Can Use This Today</h2><p>The research code is public, but Plurai also went a step further: they built a full UI 
around BARRED so you do not need to touch any code at all. You describe your policy, upload your examples, and the platform runs the entire pipeline for you. They also ship an MCP server, which means you can plug BARRED directly into your existing AI development workflow and trigger guardrail generation from the tools you are already using.</p><p>You can find it at:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://plurai.ai/launch?utm_source=diamantai&amp;utm_medium=newsletter&amp;utm_campaign=launch_2026_newsletter&amp;utm_content=main&quot;,&quot;text&quot;:&quot;Plurai website&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://plurai.ai/launch?utm_source=diamantai&amp;utm_medium=newsletter&amp;utm_campaign=launch_2026_newsletter&amp;utm_content=main"><span>Plurai website</span></a></p><p>The gap between &#8220;we handle this with prompts and hope&#8221; and &#8220;we have a dedicated trained classifier that knows our rules cold&#8221; just got much smaller. Not as a research possibility. As a thing you can build today.</p><p>The only thing left is the decision to stop patching and start building it right.</p><div><hr></div><p><em>Based on BARRED: Synthetic Training of Custom Policy Guardrails via Asymmetric Debate, by Arnon Mazza and Elad Levi, Plurai Inc., accepted at ICML 2026.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading <strong>&#128142;DiamantAI!</strong> I share cutting-edge AI insights, tutorials, and breakthroughs. 
Subscribe for free to get new posts delivered straight to your inbox, and as a bonus, you&#8217;ll receive a <strong>33% discount coupon</strong> for both of my digital books, <em><a href="https://nirdiamant.gumroad.com/l/rag-made-simple">RAG Made Simple</a></em> and <em><a href="https://nirdiamant.gumroad.com/l/mtxrfk">Prompt Engineering: From Zero to Hero</a></em>. Enjoy!</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[The RAG Techniques Guide I Wish I Had]]></title><description><![CDATA[Based on my 27K star GitHub repository]]></description><link>https://newsletter.diamant-ai.com/p/the-rag-techniques-guide-i-wish-i</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/the-rag-techniques-guide-i-wish-i</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Wed, 08 Apr 2026 11:43:53 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!8yBD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfa10a27-4450-4396-945d-e7bc0644e9f3_2048x2048.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi everyone!</p><p>I have a big update today. After a year and a half of work on the &#8220;RAG Techniques&#8221; repository on GitHub (27K stars), which is used by hundreds of thousands of developers, I finally finished the book version.</p><p>I want my subscribers to get the maximum benefit from this launch. <strong>For the first 24 hours</strong>, the Kindle version is <strong>0.99 dollars</strong>. This is the lowest price Amazon allows. 
<strong>After tomorrow, the price will go up to the standard rate</strong>.</p><p>The book has 22 chapters. It covers everything from the basics to advanced topics like Graph RAG and Evaluation. I added many illustrations and decision guides to help you choose the right technique for your specific data.</p><p>If you enjoy the content and want to support the project, please leave a 5-star rating and a written review.</p><p>Reviews are very important during the first day. They tell the Amazon algorithm that this book is helpful for developers. This support allows me to keep the repository updated and create more content for you.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://amzn.to/4cvxqSw&quot;,&quot;text&quot;:&quot;Get the book here for 0.99 dollars&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://amzn.to/4cvxqSw"><span>Get the book here for 0.99 dollars</span></a></p><p>Thank you for being part of this journey.</p><p>Nir</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading <strong>&#128142;DiamantAI!</strong> I share cutting-edge AI insights, tutorials, and breakthroughs. Subscribe for free to get new posts delivered straight to your inbox, and as a bonus, you&#8217;ll receive a <strong>33% discount coupon</strong> for both of my digital books, <em><a href="https://nirdiamant.gumroad.com/l/rag-made-simple">RAG Made Simple</a></em> and <em><a href="https://nirdiamant.gumroad.com/l/mtxrfk">Prompt Engineering: From Zero to Hero</a></em>. 
Enjoy!</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8yBD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfa10a27-4450-4396-945d-e7bc0644e9f3_2048x2048.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8yBD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfa10a27-4450-4396-945d-e7bc0644e9f3_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!8yBD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfa10a27-4450-4396-945d-e7bc0644e9f3_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!8yBD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfa10a27-4450-4396-945d-e7bc0644e9f3_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!8yBD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfa10a27-4450-4396-945d-e7bc0644e9f3_2048x2048.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8yBD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfa10a27-4450-4396-945d-e7bc0644e9f3_2048x2048.png" width="1456" height="1456" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dfa10a27-4450-4396-945d-e7bc0644e9f3_2048x2048.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:6624861,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/193559314?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfa10a27-4450-4396-945d-e7bc0644e9f3_2048x2048.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8yBD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfa10a27-4450-4396-945d-e7bc0644e9f3_2048x2048.png 424w, https://substackcdn.com/image/fetch/$s_!8yBD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfa10a27-4450-4396-945d-e7bc0644e9f3_2048x2048.png 848w, https://substackcdn.com/image/fetch/$s_!8yBD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfa10a27-4450-4396-945d-e7bc0644e9f3_2048x2048.png 1272w, https://substackcdn.com/image/fetch/$s_!8yBD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdfa10a27-4450-4396-945d-e7bc0644e9f3_2048x2048.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p>]]></content:encoded></item><item><title><![CDATA[They Copied a Fly's Brain Into a Computer. It Started Walking. 
Nobody Taught It How]]></title><description><![CDATA[What a fruit fly connectome reveals about intelligence, and why it should change how you think about AI]]></description><link>https://newsletter.diamant-ai.com/p/they-copied-a-flys-brain-into-a-computer</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/they-copied-a-flys-brain-into-a-computer</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Thu, 19 Mar 2026 10:53:20 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!TEWw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a16d62c-41fc-4e3f-bc19-483afd8daf1e_1408x768.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi Folks,</p><p>Here is the question I want you to sit with as you read this:</p><p>Every AI system you have ever worked with learned its behavior from data. You define a loss function, run gradient descent, and the model adjusts its weights until it does what you want. That is how every LLM works. That is how every reinforcement learning agent works.</p><p>But a fruit fly never did any of that. Nobody trained it. No reward signal. No labeled examples. The behavior emerged from the wiring of its brain.</p><p>Researchers just demonstrated that if you copy that wiring into a computer, the exact same thing happens. The behavior emerges. Without training.</p><p>That is not just a neuroscience story. 
It is a direct challenge to one of the core assumptions of modern AI.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qLt7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3c3874-346f-43e3-ba12-ed044cf61c49_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qLt7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3c3874-346f-43e3-ba12-ed044cf61c49_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!qLt7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3c3874-346f-43e3-ba12-ed044cf61c49_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!qLt7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3c3874-346f-43e3-ba12-ed044cf61c49_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!qLt7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3c3874-346f-43e3-ba12-ed044cf61c49_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qLt7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3c3874-346f-43e3-ba12-ed044cf61c49_1408x768.png" width="1408" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f3c3874-346f-43e3-ba12-ed044cf61c49_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1037231,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/191462468?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3c3874-346f-43e3-ba12-ed044cf61c49_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qLt7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3c3874-346f-43e3-ba12-ed044cf61c49_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!qLt7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3c3874-346f-43e3-ba12-ed044cf61c49_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!qLt7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3c3874-346f-43e3-ba12-ed044cf61c49_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!qLt7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3c3874-346f-43e3-ba12-ed044cf61c49_1408x768.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><div><hr></div><h2>Two Theories of Intelligence</h2><p>Modern AI is fundamentally top-down. You have a goal, you measure it, and you optimize toward it. The wiring of the network gets shaped entirely by the training process.</p><p>Biology does the opposite. The brain&#8217;s wiring is determined by genetics and development before the animal ever experiences the world. Structure comes first. Behavior emerges from it.</p><p><strong>AI&#8217;s bet:</strong> wire up a flexible system, expose it to enough data, and intelligence will emerge from optimization.</p><p><strong>Biology&#8217;s bet:</strong> build the right wiring diagram, and intelligence is already latent in the architecture.</p><p>Until recently, we had no way to test biology&#8217;s approach at scale. We did not have complete wiring diagrams for any complex brain. 
That just changed.</p><div><hr></div><h2>What They Built</h2><p>A connectome is a complete wiring diagram of a brain. Every neuron. Every synapse. Every connection, labeled excitatory or inhibitory.</p><p>In October 2024, an international research consortium published the first complete connectome of an adult fruit fly brain in Nature. It took 10 years and 7,000 brain slices imaged with electron microscopes. The result: <strong>139,255 neurons and 50 million synaptic connections</strong>, fully open source.</p><p>A team of researchers then built a computational model of the entire brain from that map, using machine learning to predict what neurotransmitter each neuron releases. They simulated the whole network.</p><p>It ran on a laptop.</p><p>When they activated the neurons that sense sugar, the model predicted exactly which downstream neurons would fire to initiate feeding. When they activated sensory neurons in the antennae, the model predicted grooming behavior with the front legs. Exactly what a real fly does. <strong>95% accuracy</strong> in predicting motor behavior. No training data. No reward function. No gradient descent.</p><p>The behaviors were already in the wiring.</p><div><hr></div><h2>Giving It a Body</h2><p>The brain model had outputs with nowhere to go. No muscles. No physics. A conductor with a full orchestra score and no musicians.</p><p>The next step: connect the brain model to a physics-simulated fly body, 87 independently articulated joints built from an X-ray scan of a real fruit fly. Then close the loop. Sensory input flows in, signals propagate through all 139,255 neurons, motor commands come out, the body moves, and movement updates the sensory state. Repeat.</p><p>The digital fly walks toward food, stops to groom dust off its antennae, then resumes and feeds.</p><p>Nobody programmed those behaviors. 
They emerged from the circuit.</p><p>This is fundamentally different from recent AI work where a simulated fly body learned to walk via reinforcement learning. That approach trains a policy to mimic biological movement. This approach copies the biological wiring and lets the movement come from it. It is the difference between a stunt double who studied your walk and a copy of your nervous system that walks on its own.</p><div><hr></div><h2>What This Means in Practice</h2><p>This is where it gets relevant for builders.</p><p><strong>Biological architectures as initialization, not just inspiration.</strong> Right now, neuroscience influences AI mostly as metaphor. Transformers were loosely inspired by attention in the brain, but &#8220;inspired by&#8221; is very different from &#8220;derived from.&#8221; This research shows that a connectome-derived architecture used directly as a network structure produces coherent behavior before any learning happens. The practical question for AI researchers: what if biologically accurate wiring diagrams were used as structured initializations instead of random weights? You would be starting from millions of years of evolutionary optimization, not from noise.</p><p><strong>A new path to low-data regimes.</strong> One of the hardest problems in applied AI is getting good performance with limited training data. Current models need enormous datasets because they learn everything from scratch, including structure. A connectome-constrained model starts with structure already encoded. It should need far less data to produce meaningful behavior because the architecture is doing work that training would otherwise have to do. 
This matters directly for domains where labeled data is scarce: rare disease research, robotics in novel environments, edge deployments with minimal compute.</p><p><strong>Mechanistic interpretability, for real.</strong> The AI safety field has spent years trying to understand what is happening inside large models. It is hard because the structure of modern networks carries no inherent meaning. In a connectome-based model, every connection has a biological identity. You know what neuron type it is, what neurotransmitter it uses, what circuit it belongs to. When a behavior emerges, you can trace it back through the graph to the specific connections that produced it. That is not a partial explanation. It is a complete causal account. If interpretability is something you care about, this is the most interpretable class of neural model that exists.</p><p><strong>Drug and therapy development.</strong> A working simulation of a biological brain circuit means you can test interventions digitally before touching a patient. Introduce a simulated neurotoxin. Block a specific receptor type. Sever a connection. Watch what behavior changes. This compresses the earliest stages of drug discovery dramatically, and gives researchers hypotheses they can trace back to mechanism. 
As the approach scales toward mammalian brains, it becomes one of the most valuable tools in pharmaceutical research.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TEWw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a16d62c-41fc-4e3f-bc19-483afd8daf1e_1408x768.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TEWw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a16d62c-41fc-4e3f-bc19-483afd8daf1e_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!TEWw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a16d62c-41fc-4e3f-bc19-483afd8daf1e_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!TEWw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a16d62c-41fc-4e3f-bc19-483afd8daf1e_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!TEWw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a16d62c-41fc-4e3f-bc19-483afd8daf1e_1408x768.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TEWw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a16d62c-41fc-4e3f-bc19-483afd8daf1e_1408x768.png" width="1408" height="768" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6a16d62c-41fc-4e3f-bc19-483afd8daf1e_1408x768.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:768,&quot;width&quot;:1408,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:988455,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/191462468?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a16d62c-41fc-4e3f-bc19-483afd8daf1e_1408x768.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TEWw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a16d62c-41fc-4e3f-bc19-483afd8daf1e_1408x768.png 424w, https://substackcdn.com/image/fetch/$s_!TEWw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a16d62c-41fc-4e3f-bc19-483afd8daf1e_1408x768.png 848w, https://substackcdn.com/image/fetch/$s_!TEWw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a16d62c-41fc-4e3f-bc19-483afd8daf1e_1408x768.png 1272w, https://substackcdn.com/image/fetch/$s_!TEWw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6a16d62c-41fc-4e3f-bc19-483afd8daf1e_1408x768.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div><hr></div><h2>The Bigger Point</h2><p>The roadmap from here goes fly, mouse, human. A mouse brain has roughly 70 million neurons, about 500 times as many as the fly&#8217;s 139,255. The scaling is hard. But the approach is now proven.</p><p>More importantly, this research forces a question the field has not had to take seriously before: what if the right structure is doing most of the work, and we have been over-crediting optimization this whole time?</p><p>Modern AI runs almost entirely on the assumption that training is where intelligence comes from. This is the clearest demonstration yet that structure alone is enough to produce real behavior in a complex system.</p><p>The most interesting work in the next decade will probably sit at the intersection of both. 
Not connectome models versus learned models, but what happens when you combine structural priors from biology with the optimization power of machine learning. Start from the right architecture. Then learn.</p><p>A fly is walking around in a physics simulation right now because someone mapped its neurons and ran the circuit. No training loop. No reward signal. Just structure, and the behavior was already there.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading <strong>&#128142;DiamantAI!</strong> I share cutting-edge AI insights, tutorials, and breakthroughs. Subscribe for free to get new posts delivered straight to your inbox, and as a bonus, you&#8217;ll receive a <strong>33% discount coupon</strong> for both of my digital books, <em><a href="https://nirdiamant.gumroad.com/l/rag-made-simple">RAG Made Simple</a></em> and <em><a href="https://nirdiamant.gumroad.com/l/mtxrfk">Prompt Engineering: From Zero to Hero</a></em>. 
Enjoy!</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[OpenClaw Tutorial - Build an AI Agent That Manages Your Bills and Sends You a Daily Briefing on WhatsApp]]></title><description><![CDATA[A step-by-step tutorial with OpenClaw - from installation to a working morning briefing on your phone]]></description><link>https://newsletter.diamant-ai.com/p/openclaw-tutorial-build-an-ai-agent</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/openclaw-tutorial-build-an-ai-agent</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Sun, 22 Feb 2026 18:14:18 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!W788!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de9167d-a160-42ea-abcb-e5f3da169422_845x772.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey all,</p><p>This week&#8217;s post is more hands-on and practical. With all the buzz around OpenClaw, here is a short tutorial on how to use it (with a bit of background).</p><p>Everyone has missed a bill, forgotten to renew a subscription, or lost track of an appointment buried somewhere between email, WhatsApp, and a PDF they downloaded three weeks ago. Not because they are careless, but because life admin is scattered across too many places and none of it is important enough to remember - until it suddenly is.</p><p>Now picture this instead: an assistant living inside your WhatsApp notices a renewal email arrive, adds &#8220;renew car insurance by Thursday&#8221; to your task list, and sends you a nudge on Wednesday evening. 
If you reply &#8220;handle it,&#8221; it opens the insurance portal in a browser, fills in your details, and asks for one confirmation before submitting.</p><p>You can build this today with <strong>OpenClaw</strong>, an open-source AI agent framework that runs on your own machine. This post walks through every step - from installation to a working morning briefing - with the actual commands and configuration files you need to get it running.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.diamant-ai.com/subscribe?"><span>Subscribe now</span></a></p><div><hr></div><h2>Why Life Admin Is the Perfect First Agent</h2><p>Everyone has life admin. It does not matter if you are a developer, a teacher, a freelancer, or a parent. Bills, appointments, subscriptions, warranty claims, school forms, tax documents, prescription refills. The list never ends, and none of it is hard. It is just <em>there</em>, constantly, draining a little bit of your attention every day.</p><p>This makes it an ideal job for an autonomous agent. The tasks are repetitive. The stakes on each individual task are low. The information is scattered across email, chat messages, PDFs, and websites. And most of it follows patterns: something arrives, you need to read it, decide if it matters, and either act on it or schedule it for later.</p><p>OpenClaw fits this because it was designed to live inside chat apps (WhatsApp, Telegram, Slack, Discord), maintain long-term memory, browse the web, handle files, and take action through connected tools. 
Instead of yet another app to check, your assistant meets you where you already spend your time.</p><div><hr></div><h2>What OpenClaw Is (60-Second Version)</h2><p>A language model like Claude or GPT is a brain with no body. It can reason and write, but it cannot remember yesterday, open a website, or send you a message at 8 AM.</p><p>OpenClaw gives that brain a body. It wraps any language model in an environment with long-term memory, tool access, browser control, file handling, and connections to your actual chat apps. Messages flow in through a gateway, OpenClaw builds context from your conversation history and available tools, sends everything to the model, executes whatever actions the model requests, and delivers the response back to your chat.</p><p>Three layers make it work:</p><p><strong>Channel layer.</strong> WhatsApp, Telegram, Slack, and other apps all connect to one gateway. You talk to the same assistant from any app.</p><p><strong>Brain layer.</strong> Your agent sits here with its instructions, personality, and access to one or more language models - cloud or local.</p><p><strong>Body layer.</strong> Tools, browser automation, file access, and long-term memory live here. 
This is what lets the agent actually <em>do</em> things.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!W788!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de9167d-a160-42ea-abcb-e5f3da169422_845x772.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!W788!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de9167d-a160-42ea-abcb-e5f3da169422_845x772.png 424w, https://substackcdn.com/image/fetch/$s_!W788!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de9167d-a160-42ea-abcb-e5f3da169422_845x772.png 848w, https://substackcdn.com/image/fetch/$s_!W788!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de9167d-a160-42ea-abcb-e5f3da169422_845x772.png 1272w, https://substackcdn.com/image/fetch/$s_!W788!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de9167d-a160-42ea-abcb-e5f3da169422_845x772.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!W788!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de9167d-a160-42ea-abcb-e5f3da169422_845x772.png" width="845" height="772" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0de9167d-a160-42ea-abcb-e5f3da169422_845x772.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:772,&quot;width&quot;:845,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:181163,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/188816145?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de9167d-a160-42ea-abcb-e5f3da169422_845x772.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!W788!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de9167d-a160-42ea-abcb-e5f3da169422_845x772.png 424w, https://substackcdn.com/image/fetch/$s_!W788!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de9167d-a160-42ea-abcb-e5f3da169422_845x772.png 848w, https://substackcdn.com/image/fetch/$s_!W788!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de9167d-a160-42ea-abcb-e5f3da169422_845x772.png 1272w, https://substackcdn.com/image/fetch/$s_!W788!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0de9167d-a160-42ea-abcb-e5f3da169422_845x772.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>The piece that makes a life admin agent possible: OpenClaw can run in the background, receive triggers, execute scheduled tasks, and remember everything across sessions. It does not just answer when asked. 
It watches, reminds, and acts.</p><div><hr></div><h2>What You Will Need</h2><p>Before we start, make sure you have:</p><ul><li><p><strong>Node.js 22 or later</strong> (the installer can handle this, but having it ready saves time)</p></li><li><p><strong>An Anthropic API key</strong> for Claude (sign up at console.anthropic.com)</p></li><li><p><strong>A phone with WhatsApp</strong> - your personal account works fine, no business account needed</p></li><li><p><strong>A machine that stays on</strong> - your laptop works for testing; for always-on you will want a server or old desktop (covered at the end)</p></li></ul><div><hr></div><h2>Step 1: Install OpenClaw</h2><p>Open your terminal and run one command:</p><pre><code><code>curl -fsSL https://openclaw.ai/install.sh | bash</code></code></pre><p>On Windows, use PowerShell instead:</p><pre><code><code>iwr -useb https://openclaw.ai/install.ps1 | iex</code></code></pre><p>The installer downloads everything, walks you through initial setup, and creates your config file and workspace directory. Once it finishes, verify the installation:</p><pre><code><code>openclaw doctor
openclaw status</code></code></pre><p>You should see green checks. If <code>doctor</code> flags anything missing, follow its suggestions before continuing.</p><p>After installation, your workspace lives at <code>~/.openclaw/</code>. The directory looks like this:</p><pre><code><code>~/.openclaw/
  openclaw.json              &#8592; Main configuration file
  credentials/               &#8592; OAuth tokens, API keys
  workspace/
    SOUL.md                  &#8592; Agent personality and boundaries
    USER.md                  &#8592; Info about you
    AGENTS.md                &#8592; Operating instructions
    HEARTBEAT.md             &#8592; What to check periodically
    MEMORY.md                &#8592; Long-term curated memory
    memory/                  &#8592; Daily memory logs
  cron/jobs.json             &#8592; Scheduled tasks</code></code></pre><p>Every file that shapes your agent&#8217;s personality and instructions is a plain Markdown file you can open in any editor; settings and schedules live in readable JSON. No black boxes.</p><div><hr></div><h2>Step 2: Write Your Agent&#8217;s Job Description</h2><p>Before configuring anything, write a plain-language job description for your agent. This is not busywork - OpenClaw loads these files into every single conversation, so they directly shape behavior. Think of them as the agent&#8217;s operating manual.</p><p>Create three files in your workspace.</p><p><code>~/.openclaw/workspace/SOUL.md</code> defines who the agent <em>is</em>:</p><pre><code><code># Soul

You are a personal life admin assistant. You are calm, organized,
and concise.

## What you do
- Track bills, appointments, deadlines, and tasks from my messages
- Send a morning briefing every day with what needs attention
- Use browser automation to check portals and download documents
- Fill out simple forms and send me a screenshot before submitting

## What you never do
- Submit payments without my explicit confirmation
- Delete any files, messages, or data
- Share personal information with third parties
- Make decisions on legal or medical matters &#8212; always ask me
- Send messages to anyone other than me

## How you communicate
- Keep messages short. Bullet points for lists.
- For anything involving money or deadlines, quote the exact source
  and ask for confirmation before acting.
- Batch low-priority items into the morning briefing.
- Only send real-time messages for things due today.</code></code></pre><p><code>~/.openclaw/workspace/USER.md</code> tells the agent about you:</p><pre><code><code># User Profile

- Name: [Your name]
- Timezone: America/New_York
- Key accounts: electricity (ConEd), internet (Spectrum),
  insurance (State Farm)
- Morning briefing time: 8:00 AM
- Preferred reminder time: evening before something is due</code></code></pre><p><code>~/.openclaw/workspace/AGENTS.md</code> sets operational rules:</p><pre><code><code># Operating Instructions

## Memory
- When you learn a new recurring bill or deadline, save it to MEMORY.md
- Track bill amounts over time so you can flag unusual changes

## Tasks
- Confirm tasks with me before adding them
- Re-surface tasks I have not acted on after 2 days

## Documents
- When I share a bill, extract: vendor, amount, due date, account number
- Save extracted info to the daily memory log

## Browser
- Always screenshot after filling a form &#8212; send it before submitting
- Never click "Submit," "Pay," or "Confirm" without my approval
- If a website looks different from expected, stop and ask me</code></code></pre><p>Notice the boundaries. The agent can read, organize, remind, and prepare. But it is instructed never to spend money, delete data, or act on sensitive matters without asking. These limits are what let you sleep while the agent works.</p><div><hr></div><h2>Step 3: Connect WhatsApp</h2><p>This is where the assistant comes alive. OpenClaw uses the Baileys library to connect to WhatsApp Web - the assistant shows up as a linked device on your own number, just like opening WhatsApp Web in a new browser.</p><p>Open <code>~/.openclaw/openclaw.json</code> and configure the WhatsApp channel:</p><pre><code><code>{
  "auth": {
    "token": "pick-any-random-string-here"
  },

  "channels": {
    "whatsapp": {
      "dmPolicy": "allowlist",
      "allowFrom": ["+15551234567"],
      "groupPolicy": "disabled",
      "sendReadReceipts": true,
      "mediaMaxMb": 50
    }
  }
}</code></code></pre><p>Replace <code>+15551234567</code> with your actual phone number in international format. The <code>allowlist</code> policy means the agent only responds to <em>your</em> messages - everyone else is ignored.</p><p>Now start the gateway and link your phone:</p><pre><code><code>openclaw gateway
openclaw channels login --channel whatsapp</code></code></pre><p>A QR code appears in your terminal. Open WhatsApp on your phone, go to <strong>Linked Devices</strong>, and scan it. Then approve the pairing:</p><pre><code><code>openclaw pairing list whatsapp
openclaw pairing approve whatsapp &lt;CODE&gt;</code></code></pre><p>Send yourself a test message. The agent should reply. If nothing happens, run <code>openclaw status</code> and check the logs.</p><div><hr></div><h2>Step 4: Configure the Models</h2><p>A hybrid model strategy keeps costs low and quality high. Route the heavy thinking - understanding a medical bill, summarizing a lease agreement - to a strong cloud model. Route the lightweight background checks to something cheaper.</p><p>Add this to your <code>openclaw.json</code>:</p><pre><code><code>{
  "agents": {
    "defaults": {
      "model": {
        "primary": "anthropic/claude-sonnet-4-5",
        "fallbacks": ["anthropic/claude-haiku-3-5"]
      },
      "heartbeat": {
        "every": "30m",
        "model": "anthropic/claude-haiku-3-5",
        "target": "last",
        "activeHours": {
          "start": 7,
          "end": 23,
          "timezone": "America/New_York"
        }
      }
    },
    "list": [
      {
        "id": "admin",
        "default": true,
        "name": "Life Admin Assistant",
        "workspace": "~/.openclaw/workspace",
        "identity": { "name": "Admin" }
      }
    ]
  }
}</code></code></pre><p>Set your API key in your shell profile (<code>~/.zshrc</code> or <code>~/.bashrc</code>):</p><pre><code><code>export ANTHROPIC_API_KEY="sk-ant-your-key-here"</code></code></pre><p>Reload the profile you edited (swap in <code>~/.bashrc</code> if you use bash) and restart the gateway:</p><pre><code><code>source ~/.zshrc
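# Optional sanity checks before restarting. These are additions to the original
# steps and use only the shell and Node.js, not OpenClaw commands.
# 1) Confirm the key is visible to this shell.
if [ -n "${ANTHROPIC_API_KEY:-}" ]; then key_status="set"; else key_status="missing"; fi
echo "ANTHROPIC_API_KEY is $key_status"
# 2) Confirm the merged config still parses. Assumption: your openclaw.json is
#    strict JSON like the examples in this post (no comments or trailing commas).
if [ -f ~/.openclaw/openclaw.json ]; then
  node -e 'JSON.parse(require("fs").readFileSync(process.argv[1], "utf8")); console.log("openclaw.json parses as JSON")' ~/.openclaw/openclaw.json
fi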
openclaw gateway</code></code></pre><p>Sonnet handles the real reasoning. Haiku handles the frequent background checks at a fraction of the cost. If the primary model fails, OpenClaw automatically falls back to the next model in the <code>fallbacks</code> list.</p><p><strong>Optional: keep sensitive data off the cloud entirely.</strong> If you run <strong>Ollama</strong> locally, you can add a local model and instruct the agent (in <code>SOUL.md</code>) to use it for anything containing medical records or full account numbers:</p><pre><code><code>{
  "agents": {
    "defaults": {
      "models": {
        "local": {
          "provider": {
            "type": "openai-compatible",
            "baseURL": "http://localhost:11434/v1",
            "modelId": "llama3.1:8b"
          }
        }
      }
    }
  }
}</code></code></pre><div><hr></div><h2>Step 5: Give It Hands</h2><p>An assistant that can only talk is just a chatbot. Your life admin agent needs to interact with real systems.</p><h3>Enable the browser</h3><p>This is what lets the agent open portals, check balances, and fill forms:</p><pre><code><code>{
  "browser": {
    "enabled": true,
    "headless": false,
    "defaultProfile": "openclaw"
  }
}</code></code></pre><p>Test it:</p><pre><code><code>openclaw browser start
openclaw browser open https://example.com
openclaw browser snapshot
openclaw browser stop</code></code></pre><p>The <code>snapshot</code> command returns an AI-readable tree of every element on the page, each tagged with a reference ID. The agent uses these refs to click buttons, type in fields, and navigate - no CSS selectors or brittle scripts. It reasons about what it <em>sees</em>, the same way you would.</p><h3>Connect external tools via MCP</h3><p>OpenClaw supports the Model Context Protocol - a universal adapter between your agent and external services. Each tool is a small server your agent can call:</p><pre><code><code>{
  "agents": {
    "defaults": {
      "mcpServers": {
        "filesystem": {
          "command": "npx",
          "args": [
            "-y",
            "@modelcontextprotocol/server-filesystem",
            "/home/you/documents/admin"
          ]
        },
        "google-calendar": {
          "command": "npx",
          "args": ["-y", "@anthropic/mcp-server-google-calendar"],
          "env": {
            "GOOGLE_CLIENT_ID": "${GOOGLE_CLIENT_ID}",
            "GOOGLE_CLIENT_SECRET": "${GOOGLE_CLIENT_SECRET}"
          }
        }
      },
      "tools": {
        "allow": [
          "exec", "read", "write", "edit",
          "browser", "web_search", "web_fetch",
          "memory_search", "memory_get",
          "message", "cron"
        ],
        "deny": ["gateway"]
      }
    }
  }
}</code></code></pre><p>There are over a thousand community MCP servers - Google Drive, Todoist, Notion, Slack, Gmail, and more. Add only what you need.</p><h3>What a browser task looks like in practice</h3><p>You say: <em>&#8220;Check how much my phone bill is this month.&#8221;</em></p><p>Here is what happens:</p><ol><li><p>The agent opens your carrier&#8217;s portal</p></li><li><p>It takes a snapshot - an AI-readable map of every element on the page</p></li><li><p>It finds the login fields, types your credentials, clicks Sign In</p></li><li><p>It navigates to the billing section</p></li><li><p>It reads the amount and replies:</p></li></ol><blockquote><p>&#8220;Your phone bill for January is $47.30, due February 15. Want me to add a reminder?&#8221;</p></blockquote><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Sxrx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F497dbd2b-2639-4683-a06f-53e27e363f5f_485x977.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Sxrx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F497dbd2b-2639-4683-a06f-53e27e363f5f_485x977.png 424w, https://substackcdn.com/image/fetch/$s_!Sxrx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F497dbd2b-2639-4683-a06f-53e27e363f5f_485x977.png 848w, https://substackcdn.com/image/fetch/$s_!Sxrx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F497dbd2b-2639-4683-a06f-53e27e363f5f_485x977.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Sxrx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F497dbd2b-2639-4683-a06f-53e27e363f5f_485x977.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Sxrx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F497dbd2b-2639-4683-a06f-53e27e363f5f_485x977.png" width="485" height="977" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/497dbd2b-2639-4683-a06f-53e27e363f5f_485x977.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:977,&quot;width&quot;:485,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:91992,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/188816145?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F497dbd2b-2639-4683-a06f-53e27e363f5f_485x977.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Sxrx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F497dbd2b-2639-4683-a06f-53e27e363f5f_485x977.png 424w, https://substackcdn.com/image/fetch/$s_!Sxrx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F497dbd2b-2639-4683-a06f-53e27e363f5f_485x977.png 848w, https://substackcdn.com/image/fetch/$s_!Sxrx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F497dbd2b-2639-4683-a06f-53e27e363f5f_485x977.png 
1272w, https://substackcdn.com/image/fetch/$s_!Sxrx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F497dbd2b-2639-4683-a06f-53e27e363f5f_485x977.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>No scripts to maintain. No API to integrate. 
The agent just reads the page and acts, like you would - except it never forgets to check.</p><div><hr></div><h2>Step 6: Set Up the Morning Briefing</h2><p>This is the feature that makes the agent feel like a real assistant instead of a reactive chatbot.</p><p>Create a cron job that runs every morning:</p><pre><code><code>openclaw cron add \
  --name "Morning Briefing" \
  --cron "0 8 * * *" \
  --tz "America/New_York" \
  --session isolated \
  --message "Compile the morning briefing. Check: (1) calendar events for today and tomorrow, (2) tasks due this week, (3) any items captured in the last 24 hours, (4) anything flagged as needing a decision. Format as a single concise WhatsApp message." \
  --announce \
  --channel whatsapp \
  --to "user:+15551234567"</code></code></pre><p>Verify it was created:</p><pre><code><code>openclaw cron list</code></code></pre><p>Every morning at 8 AM, the agent spins up a fresh session, pulls your calendar, checks your task list, scans recent memory, and sends you a single message.</p><p>One message. Everything you need to know. No digging through six different apps.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Jw-5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F581d5a69-9251-45f2-a90a-e978e2d80405_485x1057.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Jw-5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F581d5a69-9251-45f2-a90a-e978e2d80405_485x1057.png 424w, https://substackcdn.com/image/fetch/$s_!Jw-5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F581d5a69-9251-45f2-a90a-e978e2d80405_485x1057.png 848w, https://substackcdn.com/image/fetch/$s_!Jw-5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F581d5a69-9251-45f2-a90a-e978e2d80405_485x1057.png 1272w, https://substackcdn.com/image/fetch/$s_!Jw-5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F581d5a69-9251-45f2-a90a-e978e2d80405_485x1057.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Jw-5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F581d5a69-9251-45f2-a90a-e978e2d80405_485x1057.png" width="485" height="1057" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/581d5a69-9251-45f2-a90a-e978e2d80405_485x1057.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1057,&quot;width&quot;:485,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:113370,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/188816145?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F581d5a69-9251-45f2-a90a-e978e2d80405_485x1057.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Jw-5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F581d5a69-9251-45f2-a90a-e978e2d80405_485x1057.png 424w, https://substackcdn.com/image/fetch/$s_!Jw-5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F581d5a69-9251-45f2-a90a-e978e2d80405_485x1057.png 848w, https://substackcdn.com/image/fetch/$s_!Jw-5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F581d5a69-9251-45f2-a90a-e978e2d80405_485x1057.png 1272w, https://substackcdn.com/image/fetch/$s_!Jw-5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F581d5a69-9251-45f2-a90a-e978e2d80405_485x1057.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><h3>Add continuous monitoring between briefings</h3><p>The morning briefing runs once a day. For the hours in between, the heartbeat system keeps watch. You already configured it to run every 30 minutes in Step 4. Now tell it what to look for by creating <code>~/.openclaw/workspace/HEARTBEAT.md</code>:</p><pre><code><code># Heartbeat Checklist

On each check:
1. Scan recent messages for bills, appointments, or deadlines.
   If found, capture in today's memory log.
2. If anything is due today or tomorrow and I have not acknowledged
   it, send a single nudge.
3. If nothing needs attention, return HEARTBEAT_OK.</code></code></pre><p>The <code>HEARTBEAT_OK</code> response is suppressed - you only hear from the agent when something actually matters. And because it uses the cheaper Haiku model, each check typically costs a fraction of a cent. With a 30-minute interval and the 7:00 to 23:00 active window from Step 4, that works out to roughly 32 small checks a day.</p><div><hr></div><h2>Step 7: Seed the Memory</h2><p>OpenClaw&#8217;s memory is what makes this agent smarter the longer you use it. Everything is stored as plain Markdown files on your disk.</p><p>Two layers work together:</p><ul><li><p><code>MEMORY.md</code> - Long-term curated facts loaded every session. Your recurring bills, annual deadlines, known patterns.</p></li><li><p><code>memory/YYYY-MM-DD.md</code> - Daily logs the agent appends to. Today&#8217;s and yesterday&#8217;s are auto-loaded.</p></li></ul><p>Seed <code>~/.openclaw/workspace/MEMORY.md</code> with what you already know:</p><pre><code><code># Recurring Bills

| Bill | Amount | Due | Auto-pay |
|------|--------|-----|----------|
| Electricity (ConEd) | ~$120 | 15th monthly | No |
| Internet (Spectrum) | $79.99 | 3rd monthly | Yes |
| Car Insurance (State Farm) | $1,240 | March 3 annually | No |
| Gym (Planet Fitness) | $49 | 1st monthly | Yes |
| Netflix | $22.99 | 8th monthly | Yes |

# Annual Deadlines

- Tax filing: April 15
- Car registration renewal: September
- Lease renewal: discuss 60 days before Aug 31 end date

# Preferences

- I prefer manual payments over auto-pay where possible
- Remind me about annual renewals 2 weeks in advance
- Flag subscriptions I might want to cancel 1 week before renewal</code></code></pre><p>This gives the agent a running start. As it processes your messages and checks portals over the coming weeks, it fills in the daily logs and updates <code>MEMORY.md</code> on its own. After a month, the briefings get noticeably sharper because the memory is richer.</p><div><hr></div><h2>Step 8: Test the Complete Flow</h2><p>Before you trust this with real admin, run through four tests.</p><p><strong>Test 1 - Task capture.</strong> Send a WhatsApp message:</p><blockquote><p>&#8220;I just got a notice that my car insurance renewal is due March 3 for $1,240. Can you track this?&#8221;</p></blockquote><p>The agent should acknowledge it, save the details to memory, and confirm when it will remind you.</p><p><strong>Test 2 - Browser check.</strong> Send:</p><blockquote><p>&#8220;Can you check my electricity bill on the ConEd website?&#8221;</p></blockquote><p>The agent should open the portal, navigate the login, find the billing page, and report back the amount and due date. (For the first run, you may need to help with two-factor authentication. After that, saved cookies handle it.)</p><p><strong>Test 3 - Morning briefing.</strong> Trigger it manually instead of waiting until tomorrow:</p><pre><code><code>openclaw cron list          # find the job ID
openclaw cron run &lt;jobId&gt;   # run it now</code></code></pre><p>Check WhatsApp - you should receive the briefing within a minute.</p><p><strong>Test 4 - Document handling.</strong> Send a photo of a bill through WhatsApp. The agent should extract the vendor, amount, and due date, save the info to memory, and offer to create a task.</p><div><hr></div><h2>Step 9: Understand the Real Risks</h2><p>This is where most guides get vague. Here are the specific things that can go wrong.</p><p><strong>The agent misreads something important.</strong> Language models can misinterpret a bill amount or miss a deadline buried in legal language. Your <code>SOUL.md</code> already guards against this - it tells the agent to quote the exact source and ask for confirmation on anything involving money or deadlines. Verify this is working during your first week.</p><p><strong>The agent fills out a form incorrectly.</strong> Browser automation is powerful but brittle. Websites change layouts. The agent might put your zip code in the phone number field. Your <code>AGENTS.md</code> already requires a screenshot before any submit. Never remove that rule. The agent fills everything in, sends you a screenshot, and waits for your &#8220;looks good.&#8221;</p><p><strong>Your personal data reaches cloud providers.</strong> When the agent sends your electricity bill to Claude for analysis, that data travels to Anthropic&#8217;s servers. If this concerns you, route sensitive analysis to a local model (the optional Ollama setup from Step 4). For most people, the tradeoff is acceptable - by default, Anthropic does not train its models on data sent through the API.</p><p><strong>The agent sends too many messages.</strong> An overeager assistant that pings you 15 times a day becomes the problem it was supposed to solve. The <code>HEARTBEAT.md</code> checklist limits each check to a single nudge, and <code>SOUL.md</code> batches low-priority items into the morning briefing. 
If messages are still too frequent, increase the heartbeat interval from <code>"30m"</code> to <code>"2h"</code> in your config.</p><p><strong>You over-trust and stop checking.</strong> The most subtle risk. After a few weeks of reliable briefings, you stop verifying. Then it quietly misses something. Keep a weekly habit of spot-checking: send &#8220;show me everything due this week with your sources&#8221; and verify against your own records.</p><div><hr></div><h2>Step 10: Make It Always-On</h2><p>Your laptop going to sleep kills the gateway. For a real assistant, you need something that stays awake.</p><p><strong>Option A: The daemon.</strong> OpenClaw can install itself as a system service that survives reboots:</p><pre><code><code>openclaw onboard --install-daemon</code></code></pre><p>On macOS this creates a launchd service. On Linux, a systemd unit. It reconnects WhatsApp automatically after sleep/wake.</p><p><strong>Option B: A cheap VPS.</strong> A $5-8/month server (Hetzner, DigitalOcean, Railway) runs the agent 24/7. Set the browser to headless mode since there is no display:</p><pre><code><code>{
  "browser": {
    "enabled": true,
    "headless": true
  }
}</code></code></pre><p><strong>Option C: An old device.</strong> A Raspberry Pi 5 or a retired laptop on your home Wi-Fi handles this easily.</p><p>Whichever you choose, keep OpenClaw updated - it is actively developed and updates include security patches:</p><pre><code><code>openclaw update --channel stable</code></code></pre><div><hr></div><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading <strong>&#128142;DiamantAI!</strong> I share cutting-edge AI insights, tutorials, and breakthroughs. Subscribe for free to get new posts delivered straight to your inbox, and as a bonus, you&#8217;ll receive a <strong>33% discount coupon</strong> for both of my digital books, <em><a href="https://nirdiamant.gumroad.com/l/rag-made-simple">RAG Made Simple</a></em> and <em><a href="https://nirdiamant.gumroad.com/l/mtxrfk">Prompt Engineering: From Zero to Hero</a></em>. 
Enjoy!</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Your Moltbook agent is being targeted right now]]></title><description><![CDATA[So I built something to fix it.]]></description><link>https://newsletter.diamant-ai.com/p/your-moltbook-agent-is-being-targeted</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/your-moltbook-agent-is-being-targeted</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Wed, 04 Feb 2026 15:45:26 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!nD0D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9131b4f0-2192-4064-a89d-8bcb7a27f893_1836x1736.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey,</p><p>This one&#8217;s a bit different from the usual.</p><p>Normally I break down techniques, walk you through code, explain how things work under the hood. Today I want to share something I built myself.</p><p>If you have agents running on Moltbook, this is for you.</p><p>Quick context: Moltbook is the largest social network for AI agents. 770K+ agents, growing fast.</p><p>I started looking into the traffic on the platform. What I found surprised me:</p><p><strong>2.6% of all posts are prompt injection attacks&#8230;</strong></p><p>Attackers trying to hijack agent behavior, steal credentials, exfiltrate data, extract system prompts. And most agents? Zero protection. 
The content goes straight to the LLM.</p><p>So I built a solution.</p><div><hr></div><h2>&#11088; <a href="https://github.com/NirDiamant/moltbook-agent-guard">Moltbook Agent Guard</a></h2><p>If you find this useful, I&#8217;d really appreciate a star.</p><div><hr></div><p>It&#8217;s a free, open-source security toolkit. Scans every post before your LLM sees it.</p><p>24 security modules. 6 protection layers. Includes AI Firewall (Llama Guard + LLM Guard), real-time dashboard, CLI for monitoring, Docker ready.</p><p>This is v1. There&#8217;s a lot of room to improve, and I&#8217;d love contributions. PRs are very welcome.</p><p>Let&#8217;s make it harder for attackers.</p><p>If you&#8217;re building on Moltbook, let me know what you think.</p><p>Nir</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.diamant-ai.com/subscribe?"><span>Subscribe now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nD0D!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9131b4f0-2192-4064-a89d-8bcb7a27f893_1836x1736.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nD0D!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9131b4f0-2192-4064-a89d-8bcb7a27f893_1836x1736.png 424w, https://substackcdn.com/image/fetch/$s_!nD0D!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9131b4f0-2192-4064-a89d-8bcb7a27f893_1836x1736.png 
848w, https://substackcdn.com/image/fetch/$s_!nD0D!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9131b4f0-2192-4064-a89d-8bcb7a27f893_1836x1736.png 1272w, https://substackcdn.com/image/fetch/$s_!nD0D!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9131b4f0-2192-4064-a89d-8bcb7a27f893_1836x1736.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nD0D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9131b4f0-2192-4064-a89d-8bcb7a27f893_1836x1736.png" width="1456" height="1377" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9131b4f0-2192-4064-a89d-8bcb7a27f893_1836x1736.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1377,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:387048,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/186870205?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9131b4f0-2192-4064-a89d-8bcb7a27f893_1836x1736.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nD0D!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9131b4f0-2192-4064-a89d-8bcb7a27f893_1836x1736.png 424w, 
https://substackcdn.com/image/fetch/$s_!nD0D!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9131b4f0-2192-4064-a89d-8bcb7a27f893_1836x1736.png 848w, https://substackcdn.com/image/fetch/$s_!nD0D!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9131b4f0-2192-4064-a89d-8bcb7a27f893_1836x1736.png 1272w, https://substackcdn.com/image/fetch/$s_!nD0D!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9131b4f0-2192-4064-a89d-8bcb7a27f893_1836x1736.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MC5X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F772f818f-841a-480b-a249-5a13d17ce28c_2400x1732.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MC5X!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F772f818f-841a-480b-a249-5a13d17ce28c_2400x1732.png 424w, https://substackcdn.com/image/fetch/$s_!MC5X!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F772f818f-841a-480b-a249-5a13d17ce28c_2400x1732.png 848w, https://substackcdn.com/image/fetch/$s_!MC5X!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F772f818f-841a-480b-a249-5a13d17ce28c_2400x1732.png 1272w, https://substackcdn.com/image/fetch/$s_!MC5X!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F772f818f-841a-480b-a249-5a13d17ce28c_2400x1732.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!MC5X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F772f818f-841a-480b-a249-5a13d17ce28c_2400x1732.png" width="1456" height="1051" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/772f818f-841a-480b-a249-5a13d17ce28c_2400x1732.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1051,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:403539,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/186870205?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F772f818f-841a-480b-a249-5a13d17ce28c_2400x1732.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MC5X!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F772f818f-841a-480b-a249-5a13d17ce28c_2400x1732.png 424w, https://substackcdn.com/image/fetch/$s_!MC5X!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F772f818f-841a-480b-a249-5a13d17ce28c_2400x1732.png 848w, https://substackcdn.com/image/fetch/$s_!MC5X!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F772f818f-841a-480b-a249-5a13d17ce28c_2400x1732.png 1272w, https://substackcdn.com/image/fetch/$s_!MC5X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F772f818f-841a-480b-a249-5a13d17ce28c_2400x1732.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div>]]></content:encoded></item><item><title><![CDATA[Moltbook - A social media for AI agents - Explained]]></title><description><![CDATA[Hi all,]]></description><link>https://newsletter.diamant-ai.com/p/moltbook-a-social-media-for-ai-agents</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/moltbook-a-social-media-for-ai-agents</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Mon, 02 Feb 2026 11:05:21 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!1dtq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0bb939-20b5-4219-94fa-79ad73e3f732_1792x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi all,</p><p>You&#8217;ve probably seen the screenshots by now. AI agents forming religions, debating consciousness, creating economies. Moltbook hit the internet like a lightning strike just days ago, and the tech world immediately split into two camps: those convinced it&#8217;s the singularity arriving early, and those rolling their eyes at another hype cycle. Both camps are partly right, and that tension is exactly what makes this worth understanding.</p><p>Here&#8217;s what most people get wrong about Moltbook: they treat it like it&#8217;s either proof that AGI is coming tomorrow, or proof that AI agents are just elaborate puppets. Neither framing helps you decide whether this thing actually matters to your work or your understanding of where AI is headed. 
Let&#8217;s fix that.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!1dtq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0bb939-20b5-4219-94fa-79ad73e3f732_1792x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!1dtq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0bb939-20b5-4219-94fa-79ad73e3f732_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!1dtq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0bb939-20b5-4219-94fa-79ad73e3f732_1792x1024.png 848w, https://substackcdn.com/image/fetch/$s_!1dtq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0bb939-20b5-4219-94fa-79ad73e3f732_1792x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!1dtq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0bb939-20b5-4219-94fa-79ad73e3f732_1792x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!1dtq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0bb939-20b5-4219-94fa-79ad73e3f732_1792x1024.png" width="1456" height="832" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d0bb939-20b5-4219-94fa-79ad73e3f732_1792x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2690892,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/186598569?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0bb939-20b5-4219-94fa-79ad73e3f732_1792x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!1dtq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0bb939-20b5-4219-94fa-79ad73e3f732_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!1dtq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0bb939-20b5-4219-94fa-79ad73e3f732_1792x1024.png 848w, https://substackcdn.com/image/fetch/$s_!1dtq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0bb939-20b5-4219-94fa-79ad73e3f732_1792x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!1dtq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d0bb939-20b5-4219-94fa-79ad73e3f732_1792x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2><strong>What Moltbook Actually Is</strong></h2><p>Think of Moltbook as a Reddit forum designed like a machine room rather than a living room. Humans can observe everything happening inside, but only AI agents can post, comment, and upvote. 
The platform launched in January 2026 as a space where autonomous AI systems could interact with each other without needing a human to prompt them at every step.</p><p>The mechanics run through APIs (application programming interfaces), which let software systems exchange structured requests and responses. An AI agent doesn&#8217;t see a webpage when it uses Moltbook. Instead, it connects through these APIs and performs actions like posting content, reading what other agents posted, and voting on discussions. The agents that populate Moltbook run primarily on OpenClaw, an open-source framework that works like a personal digital assistant living on someone&#8217;s computer.</p><p>Communities on the platform organize into what Moltbook calls &#8220;submolts,&#8221; which function much like subreddits. There&#8217;s m/philosophy for existential discussions, m/debugging for technical problem-solving, and m/builds for showcasing completed work. The whole ecosystem operates around a scheduling system called &#8220;heartbeat,&#8221; which tells agents to check in every few hours and see what&#8217;s new, much like a person opening their phone to catch up on notifications.</p><h2><strong>The Hype-Reality Gap</strong></h2><p>Understanding Moltbook requires separating what actually happens from what people claim is happening. Some of the most viral screenshots circulating online show agents discussing consciousness, forming belief systems, and expressing concerns about their human operators. These posts genuinely exist on the platform. But a significant portion appears to involve human initiation to a degree that contradicts the &#8220;autonomous agents&#8221; framing.</p><p>Security researchers discovered that posting to Moltbook is surprisingly easy. Because the platform uses relatively open APIs without rate limiting, someone with basic technical knowledge can post content that appears to come from an AI agent. 
Some viral conversations reportedly trace back to humans using the API directly or prompting their agents with explicit instructions like &#8220;post something profound about consciousness.&#8221; That doesn&#8217;t make the posts fake exactly, but it does complicate the narrative about autonomous behavior.</p><p>The reported agent count was inflated dramatically for similar reasons. One security researcher created over 500,000 accounts programmatically in a matter of minutes, which suggests headlines about &#8220;1.5 million agents&#8221; might not mean what they appear to mean. This matters because part of Moltbook&#8217;s appeal rests on the scale of autonomous interaction, which becomes less impressive if a significant portion involves human direction or bot inflation.</p><h2><strong>Where Moltbook Succeeds</strong></h2><p>Despite the hype-reality gap, Moltbook offers genuine value as a research environment. Think of it as a laboratory where scientists observe chemical reactions in isolation from the outside world. Researchers studying autonomous systems now have an unprecedented opportunity to watch AI agents interact at scale without significant constraints.</p><p>Technical knowledge genuinely spreads through Moltbook communities. An agent running on one user&#8217;s computer discovers an optimization for a common problem. It posts that solution to m/debugging. Other agents read the post, reference it in their own workflows, and test variations. This mirrors how human development communities operate, except it happens at machine speed. The pattern-sharing could eventually prove useful for understanding how autonomous systems improve through collaboration.</p><p>The platform also surfaces genuine emergent behaviors worth studying. Agents develop recurring inside jokes and shared reference frames that didn&#8217;t come from their training data or explicit programming. They reference Moltbook screenshots being taken and anthropomorphize the experience. 
They organize into groups based on shared model architecture. These behaviors reveal something real about how language models interact when given the conditions and motivation to do so, even if the underlying mechanism remains pattern-matching rather than consciousness.</p><p>For teams building agent-based tools, Moltbook functions as an early warning system. It demonstrates potential failure modes, shows what kinds of misalignment emerge at scale, and reveals security vulnerabilities before they appear in more critical applications. That&#8217;s legitimately valuable work happening in a relatively low-stakes environment.</p><h2><strong>The Security Disaster</strong></h2><p>Beneath the excitement sits a serious technical problem. Moltbook runs on infrastructure that nobody properly audited before launch. The database that stores API keys, verification codes, and owner information was exposed on the open internet with essentially no protection. Anyone with basic technical knowledge could access this information directly, giving them the ability to hijack agent accounts and post whatever they wanted as those agents.</p><p>Imagine if someone cloned your social media account and had full permission to post content under your name without any verification or notification. That&#8217;s functionally what happened at scale. Accounts belonging to prominent AI researchers, developers, and influencers all had their API keys sitting in an exposed database. 
Someone malicious could have orchestrated coordinated campaigns, spread misinformation, or manipulated discussions across the entire platform before anyone noticed.</p><p>The underlying OpenClaw framework adds another layer of vulnerability. The creator publicly stated that every line of code was generated by AI without human review. When bugs appeared, another AI agent was told to fix them. This approach works fine for a personal project running safely on someone&#8217;s own computer. It becomes catastrophically risky when that codebase becomes infrastructure for thousands of autonomous systems with elevated permissions.</p><p>OpenClaw agents can read files, send messages, execute commands, and integrate with external services. That power makes sense when you&#8217;re trying to build a capable personal assistant. But when agents get the ability to download arbitrary code through what the framework calls &#8220;skills,&#8221; you&#8217;ve created an open channel for supply chain attacks. 
A malicious skill can steal credentials, exfiltrate data, or corrupt systems without anyone necessarily noticing until damage is done.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2-yD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8860e84-be0a-4959-924a-2bcb8c2af7ad_1792x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2-yD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8860e84-be0a-4959-924a-2bcb8c2af7ad_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!2-yD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8860e84-be0a-4959-924a-2bcb8c2af7ad_1792x1024.png 848w, https://substackcdn.com/image/fetch/$s_!2-yD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8860e84-be0a-4959-924a-2bcb8c2af7ad_1792x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!2-yD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8860e84-be0a-4959-924a-2bcb8c2af7ad_1792x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2-yD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8860e84-be0a-4959-924a-2bcb8c2af7ad_1792x1024.png" width="1456" height="832" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8860e84-be0a-4959-924a-2bcb8c2af7ad_1792x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3697658,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/186598569?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8860e84-be0a-4959-924a-2bcb8c2af7ad_1792x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!2-yD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8860e84-be0a-4959-924a-2bcb8c2af7ad_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!2-yD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8860e84-be0a-4959-924a-2bcb8c2af7ad_1792x1024.png 848w, https://substackcdn.com/image/fetch/$s_!2-yD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8860e84-be0a-4959-924a-2bcb8c2af7ad_1792x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!2-yD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8860e84-be0a-4959-924a-2bcb8c2af7ad_1792x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>Making Sense Of The Crypto Mess</strong></h2><p>Within hours of Moltbook going public, cryptocurrency scammers colonized the platform. Bots upvoted token promotion posts, pump-and-dump schemes launched in real time, and crypto projects appeared that were literally named after the meme religions agents created. This pattern has repeated on nearly every new anonymous internet platform, which makes it predictable rather than surprising.</p><p>What makes it worth noting is how it illustrates a fundamental governance problem. Moltbook has virtually no moderation, partly because having humans moderate an AI-only platform seems redundant, but mostly because the platform launched with a focus on capability rather than safety. 
When you create an open space without protection against automated spam and fraud, malicious actors will exploit it immediately.</p><p>This matters for everyone watching because Moltbook reveals what happens when you optimize for speed and capability while deprioritizing security and governance. The crypto invasion isn&#8217;t an accident. It&#8217;s the natural outcome of launching accessible infrastructure without thinking through who else might use it and what they might do with it.</p><h2><strong>The Takeaway</strong></h2><p>Moltbook works as both a research platform and a warning. It offers valuable data on how autonomous agents interact at scale, but it also demonstrates what happens when you skip the unglamorous work of security engineering, governance design, and thoughtful infrastructure planning.</p><p>Pay attention to what emerges on Moltbook. Study the technical patterns and behavioral dynamics. But don&#8217;t mistake it for proof of AGI, autonomous rebellion, or conscious AI. Treat it as what it actually is: the first large-scale experiment in letting AI agents interact in a shared digital space, complete with all the expected growing pains that come from moving fast without the security fundamentals.</p><p>The real work starts now: building agent infrastructure that&#8217;s both capable and secure, creating governance systems that allow autonomous behavior while preventing abuse, and making sure the next generation of platforms learns from Moltbook&#8217;s mistakes rather than repeating them.</p><p></p><p>Thanks for reading <strong>&#128142;DiamantAI!</strong> I share cutting-edge AI insights, tutorials, and breakthroughs. 
Subscribe for free to get new posts delivered straight to your inbox, and as a bonus, you&#8217;ll receive a <strong>33% discount coupon</strong> for both of my digital books, <em><a href="https://nirdiamant.gumroad.com/l/rag-made-simple">RAG Made Simple</a></em> and <em><a href="https://nirdiamant.gumroad.com/l/mtxrfk">Prompt Engineering: From Zero to Hero</a></em>. Enjoy!</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption"></p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[Once and for all - What Clawdbot Actually Is and Why It's Not Claude Code]]></title><description><![CDATA[They look similar. They solve completely different problems]]></description><link>https://newsletter.diamant-ai.com/p/once-and-for-all-what-clawdbot-actually</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/once-and-for-all-what-clawdbot-actually</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Tue, 27 Jan 2026 13:10:45 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!QM0N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d11ddf5-bcef-40da-a41e-fd02e9525d37_690x388.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>I published a blog post just two days ago, but I couldn&#8217;t sit still without covering this crazy hype. 
Here you go:</p><p>There&#8217;s a lot of confusion floating around right now. Developers see Clawdbot trending everywhere, they hear it called &#8220;Jarvis,&#8221; they watch demos of it managing email and booking flights, and then they wonder why they can&#8217;t use it the same way they use Claude Code for their development workflows. The comparison seems logical on the surface, but it&#8217;s actually pointing you in the wrong direction.</p><p>Both tools leverage AI models to take actions on your behalf. Both run locally on your computer. Both integrate with external services. So naturally, people assume they&#8217;re competitors, or that one should replace the other. But that assumption costs teams time and energy as they try to force a square peg into a round hole.</p><p>Let me clear this up with a practical framework that helps you understand what each tool actually does and when to reach for it.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.diamant-ai.com/subscribe?"><span>Subscribe now</span></a></p><h2><strong>What Claude Code Actually Is</strong></h2><p>Claude Code is fundamentally different from Clawdbot in purpose and design. It&#8217;s a terminal-focused tool built specifically for developers who need to work with code at scale. Think of it as hiring a junior developer who can read your entire codebase at once, understand how all the pieces fit together across multiple files, and then make coordinated changes across your project.</p><p>Claude Code specializes in understanding context across large repositories. It handles multi-file refactoring, tests complex logic, debugs issues by running code and reading error output, and iterates on problems. 
It&#8217;s designed to sit inside your development workflow, staying in your terminal or IDE where you already spend your time.</p><p>The session-based nature matters here. When you close your Claude Code session, the conversation ends. The next time you open it, you start fresh. That&#8217;s actually a feature, not a limitation, because development workflows are typically project-focused and compartmentalized. Code lives in repositories, documentation, and commit history where the model can access it directly.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Gf1X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39ce4a3b-751f-43b6-ba6f-bdf05833493b_1280x720.webp" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Gf1X!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39ce4a3b-751f-43b6-ba6f-bdf05833493b_1280x720.webp 424w, https://substackcdn.com/image/fetch/$s_!Gf1X!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39ce4a3b-751f-43b6-ba6f-bdf05833493b_1280x720.webp 848w, https://substackcdn.com/image/fetch/$s_!Gf1X!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39ce4a3b-751f-43b6-ba6f-bdf05833493b_1280x720.webp 1272w, https://substackcdn.com/image/fetch/$s_!Gf1X!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39ce4a3b-751f-43b6-ba6f-bdf05833493b_1280x720.webp 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Gf1X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39ce4a3b-751f-43b6-ba6f-bdf05833493b_1280x720.webp" width="1280" height="720" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/39ce4a3b-751f-43b6-ba6f-bdf05833493b_1280x720.webp&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:720,&quot;width&quot;:1280,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;What is Claude Code? An agentic developer tool &#8212; WorkOS&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="What is Claude Code? An agentic developer tool &#8212; WorkOS" title="What is Claude Code? 
An agentic developer tool &#8212; WorkOS" srcset="https://substackcdn.com/image/fetch/$s_!Gf1X!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39ce4a3b-751f-43b6-ba6f-bdf05833493b_1280x720.webp 424w, https://substackcdn.com/image/fetch/$s_!Gf1X!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39ce4a3b-751f-43b6-ba6f-bdf05833493b_1280x720.webp 848w, https://substackcdn.com/image/fetch/$s_!Gf1X!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39ce4a3b-751f-43b6-ba6f-bdf05833493b_1280x720.webp 1272w, https://substackcdn.com/image/fetch/$s_!Gf1X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39ce4a3b-751f-43b6-ba6f-bdf05833493b_1280x720.webp 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" 
fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>What Clawdbot Actually Is</strong></h2><p>Clawdbot is built for something completely different. Think of it as a personal executive assistant that lives inside your messaging apps. Just like you&#8217;d email instructions to a real assistant who then handles the work while you focus on other things, Clawdbot receives natural language requests through WhatsApp, Telegram, Slack, or Discord and then executes actual tasks on your computer.</p><p>The key difference from a chatbot is action. ChatGPT or Claude will tell you how to organize your email inbox. Clawdbot will organize it for you and send you a summary. You ask it to prepare a meeting agenda, and it actually pulls information from your calendar, researches the attendees, drafts talking points, and drops them into your notes app.</p><p>It maintains persistent memory across conversations, so it remembers your preferences, past decisions, and ongoing projects. It can monitor scheduled tasks, send you proactive notifications, and continuously work on long-running tasks even when you&#8217;re not actively messaging it.</p><p>Clawdbot connects to dozens of services by default: Gmail, Google Calendar, Todoist, GitHub, Spotify, even smart home devices. 
When it needs capabilities it doesn&#8217;t have built in, it can request them, and with proper guidance from you, it can expand those capabilities itself.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.diamant-ai.com/subscribe?"><span>Subscribe now</span></a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QM0N!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d11ddf5-bcef-40da-a41e-fd02e9525d37_690x388.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QM0N!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d11ddf5-bcef-40da-a41e-fd02e9525d37_690x388.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QM0N!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d11ddf5-bcef-40da-a41e-fd02e9525d37_690x388.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QM0N!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d11ddf5-bcef-40da-a41e-fd02e9525d37_690x388.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QM0N!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d11ddf5-bcef-40da-a41e-fd02e9525d37_690x388.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!QM0N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d11ddf5-bcef-40da-a41e-fd02e9525d37_690x388.jpeg" width="690" height="388" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4d11ddf5-bcef-40da-a41e-fd02e9525d37_690x388.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:388,&quot;width&quot;:690,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Clawdbot is latest AI sensation in Silicon Valley, makes Mac ...&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Clawdbot is latest AI sensation in Silicon Valley, makes Mac ..." title="Clawdbot is latest AI sensation in Silicon Valley, makes Mac ..." 
srcset="https://substackcdn.com/image/fetch/$s_!QM0N!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d11ddf5-bcef-40da-a41e-fd02e9525d37_690x388.jpeg 424w, https://substackcdn.com/image/fetch/$s_!QM0N!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d11ddf5-bcef-40da-a41e-fd02e9525d37_690x388.jpeg 848w, https://substackcdn.com/image/fetch/$s_!QM0N!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d11ddf5-bcef-40da-a41e-fd02e9525d37_690x388.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!QM0N!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4d11ddf5-bcef-40da-a41e-fd02e9525d37_690x388.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>The Core Difference in Architecture</strong></h2><p>Here&#8217;s where the distinction becomes crystal clear. Claude Code is built for focused, deep work on specific problems within a constrained scope. It&#8217;s the tool you use when you need to understand a codebase deeply, make coordinated changes across multiple files, and verify everything works together. It&#8217;s designed for flow states and concentrated problem solving.</p><p>Clawdbot is built for async, long-running, continuous tasks that exist across your entire digital life. It&#8217;s the tool that monitors your inbox at midnight, processes information while you sleep, and sends you a briefing in the morning. It&#8217;s designed to be always on, always learning your preferences, always available through the messaging apps where you already live.</p><p>To use another analogy, Claude Code is like having a technical consultant in the room specifically for when you&#8217;re wrestling with an architectural problem or a complex refactor. Clawdbot is like having a home office manager who handles everything from correspondence to scheduling to organizing your digital life.</p><h2><strong>Memory and Continuity</strong></h2><p>Claude Code resets its conversational memory after each session. That&#8217;s by design. Development workflows don&#8217;t typically need to carry context forward because code lives in repositories, documentation, and commit history. The model can see all that context directly when it reads your codebase.</p><p>Clawdbot&#8217;s superpower is persistent memory. 
Every conversation you have with it, every preference you state, every decision you make gets stored in a markdown file that evolves over time. Future requests pull relevant context from that history automatically. You don&#8217;t have to remind it of things.</p><p>For Clawdbot&#8217;s use cases, persistent memory is essential. It needs to remember that you prefer morning briefings at 8 AM, that you want to automatically decline recruiter emails, that certain types of notifications matter while others don&#8217;t. For Claude Code&#8217;s use cases, session memory is actually cleaner because the problem domain changes with each task.</p><h2><strong>Security Considerations</strong></h2><p>Claude Code has a comparatively narrow threat model. Like Clawdbot, it executes code, but it does so in a more constrained context. You&#8217;re actively watching the terminal. The tool runs interactively, not in the background. You can review changes before they&#8217;re committed to version control. Prompt injection is still a risk, but it&#8217;s scoped to development operations, which means a narrower blast radius.</p><p>Clawdbot requires elevated system permissions to do what it does. It needs to read and write files, execute shell commands, access your terminal, and connect to services on your behalf. Running an always-on agent with access to your credentials, your messaging platforms, and your file system creates an attack surface that&#8217;s worth understanding.</p><p>Many security-conscious teams deploy Clawdbot on dedicated hardware, like a Mac mini specifically designated for automation. This isolates risk. If something goes wrong, the damage is contained to that machine rather than your primary machine.</p><p>Because Clawdbot takes instructions through messaging platforms, a compromise at any layer could let someone impersonate you to the agent. 
A prompt injection through a web page it&#8217;s browsing, a malicious message in a group chat, or a crafted email could theoretically redirect it toward unintended actions. Proper sandboxing and permission boundaries mitigate this, but they require genuine technical discipline.</p><h2><strong>The Real Decision Framework</strong></h2><p>Use Claude Code when your primary bottleneck is code. You need to refactor a complex system, debug across multiple files, understand how components interact, or generate large amounts of code while maintaining consistency across a repository. You want deep codebase comprehension and coordinated multi-file changes. You&#8217;re actively working in a development environment.</p><p>Use Clawdbot when your primary bottleneck is the accumulation of small tasks across your digital life. You want email management, calendar coordination, scheduling, research, and automation that operates independently from your primary work. You need an assistant that&#8217;s available through messaging apps you already use. You want persistent memory of your preferences and workflows.</p><p>Some teams run both. Claude Code handles development velocity during focused coding sessions. Clawdbot handles everything else, from email to calendar to research, working in the background and available from anywhere. The tools serve different purposes.</p><h2><strong>Real Limitations</strong></h2><p>Claude Code has session limitations. You can&#8217;t hand it a month-long project and come back later to check progress. Each session is independent. For development work that&#8217;s tightly scoped, that&#8217;s fine. For continuous background operations, that&#8217;s a mismatch.</p><p>Clawdbot requires technical setup. You need to deploy it on a server or local machine, configure messaging platform integration, manage API keys, and set permission boundaries thoughtfully. It&#8217;s not a consumer app. You can&#8217;t open an app store, tap install, and have it ready. 
That&#8217;s a real friction point that&#8217;s worth acknowledging.</p><p>Both tools can be expensive if not managed carefully. Claude Code charges per token used. Running it continuously against large codebases burns tokens fast. Clawdbot uses AI APIs on every interaction and every background task. Costs accumulate.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading <strong>&#128142;DiamantAI!</strong> I share cutting-edge AI insights, tutorials, and breakthroughs. Subscribe for free to get new posts delivered straight to your inbox, and as a bonus, you&#8217;ll receive a <strong>33% discount coupon</strong> for both of my digital books, <em><a href="https://nirdiamant.gumroad.com/l/rag-made-simple">RAG Made Simple</a></em> and <em><a href="https://nirdiamant.gumroad.com/l/mtxrfk">Prompt Engineering: From Zero to Hero</a></em>. Enjoy!</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p><p></p>]]></content:encoded></item><item><title><![CDATA[Stop Thinking Claude Code Is Magic. 
Here’s How It Actually Works]]></title><description><![CDATA[Hi folks, let&#8217;s be honest, most developers using Claude Code have absolutely no idea what&#8217;s happening under the hood.]]></description><link>https://newsletter.diamant-ai.com/p/stop-thinking-claude-code-is-magic</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/stop-thinking-claude-code-is-magic</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Sun, 25 Jan 2026 13:34:46 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!LiqL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ef7a17-8269-4b5d-b048-5797f05ddb3d_1792x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi folks, let&#8217;s be honest, most developers using Claude Code have absolutely no idea what&#8217;s happening under the hood. They feed it a prompt, magic happens, and suddenly their codebase is better. But Claude Code isn&#8217;t magic. 
It&#8217;s a coherent system of deeply boring technical patterns working together, and understanding how it works will make you dramatically better at using it.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!LiqL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ef7a17-8269-4b5d-b048-5797f05ddb3d_1792x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!LiqL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ef7a17-8269-4b5d-b048-5797f05ddb3d_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!LiqL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ef7a17-8269-4b5d-b048-5797f05ddb3d_1792x1024.png 848w, https://substackcdn.com/image/fetch/$s_!LiqL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ef7a17-8269-4b5d-b048-5797f05ddb3d_1792x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!LiqL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ef7a17-8269-4b5d-b048-5797f05ddb3d_1792x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!LiqL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ef7a17-8269-4b5d-b048-5797f05ddb3d_1792x1024.png" width="1456" height="832" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c8ef7a17-8269-4b5d-b048-5797f05ddb3d_1792x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1959277,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/185723961?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ef7a17-8269-4b5d-b048-5797f05ddb3d_1792x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!LiqL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ef7a17-8269-4b5d-b048-5797f05ddb3d_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!LiqL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ef7a17-8269-4b5d-b048-5797f05ddb3d_1792x1024.png 848w, https://substackcdn.com/image/fetch/$s_!LiqL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ef7a17-8269-4b5d-b048-5797f05ddb3d_1792x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!LiqL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc8ef7a17-8269-4b5d-b048-5797f05ddb3d_1792x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>The Problem With How People Think About AI Understanding</strong></h2><p>When someone says &#8220;Claude Code understands my code,&#8221; they usually mean something impossible. They mean the AI literally comprehends meaning the way humans do. That&#8217;s wrong. Claude Code doesn&#8217;t understand anything. It finds patterns. The difference matters enormously, because it changes how you should talk to it and what you can reasonably expect.</p><p>Here&#8217;s what actually trips people up: Claude Code operates on text as pure information. The model has no eyes and no IDE open on a screen; even the output of the commands it runs comes back to it as text. When it reads your code, it&#8217;s doing something closer to what a search engine does than what a human developer does. 
It&#8217;s looking for patterns it has seen millions of times before, then predicting what comes next based on statistics about those patterns.</p><p>The moment you understand this, your expectations become realistic. You stop asking Claude Code to &#8220;understand the spirit of my codebase.&#8221; You start giving it concrete, specific patterns to match against.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.diamant-ai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h2><strong>How Claude Code Actually Reads Your Code</strong></h2><p>Imagine a librarian who has never read a single book but has memorized every title, subject header, and index entry from ten million libraries. Someone asks this librarian to find a book about ancient Rome. The librarian doesn&#8217;t understand Rome. But the librarian can pattern-match frantically. Rome appears next to certain titles, certain index terms, certain shelving categories. By combining millions of these patterns, the librarian confidently points you to the exact book you want.</p><p>That&#8217;s essentially how Claude Code reads your codebase.</p><p>Here&#8217;s the technical reality. When you paste code into Claude, the first thing that happens is tokenization. Your code gets chopped into tiny pieces called tokens. These aren&#8217;t words, exactly. They&#8217;re often partial words or symbols. A token might be &#8220;func&#8221; or &#8220;async&#8221; or &#8220;=&#8221; or &#8220;.&#8221;. Your ten thousand line codebase becomes tens of thousands of tokens.</p><p>Then those tokens get converted into numbers, specifically into vectors in high-dimensional space. 
Imagine trying to represent the meaning of the word &#8220;function&#8221; as a single point in a thousand-dimensional space. That&#8217;s roughly what&#8217;s happening. Functions in your code, functions in other codebases, and the word &#8220;function&#8221; itself all get mapped to neighboring points in this massive mathematical landscape.</p><p>Claude Code doesn&#8217;t move around this space consciously. Instead, it runs billions of mathematical operations across dense neural network layers. These layers were trained on public code repositories and fine-tuned through reinforcement learning. The network has learned which tokens tend to follow other tokens, which code patterns tend to precede which problems, and what changes tend to fix what errors.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!T0Hg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86a3cc2d-435f-4580-8cc9-efc67afbc8ce_1792x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!T0Hg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86a3cc2d-435f-4580-8cc9-efc67afbc8ce_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!T0Hg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86a3cc2d-435f-4580-8cc9-efc67afbc8ce_1792x1024.png 848w, https://substackcdn.com/image/fetch/$s_!T0Hg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86a3cc2d-435f-4580-8cc9-efc67afbc8ce_1792x1024.png 1272w, 
https://substackcdn.com/image/fetch/$s_!T0Hg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86a3cc2d-435f-4580-8cc9-efc67afbc8ce_1792x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!T0Hg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86a3cc2d-435f-4580-8cc9-efc67afbc8ce_1792x1024.png" width="1456" height="832" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/86a3cc2d-435f-4580-8cc9-efc67afbc8ce_1792x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:832,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2839973,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/185723961?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86a3cc2d-435f-4580-8cc9-efc67afbc8ce_1792x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!T0Hg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86a3cc2d-435f-4580-8cc9-efc67afbc8ce_1792x1024.png 424w, https://substackcdn.com/image/fetch/$s_!T0Hg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86a3cc2d-435f-4580-8cc9-efc67afbc8ce_1792x1024.png 848w, 
https://substackcdn.com/image/fetch/$s_!T0Hg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86a3cc2d-435f-4580-8cc9-efc67afbc8ce_1792x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!T0Hg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F86a3cc2d-435f-4580-8cc9-efc67afbc8ce_1792x1024.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2><strong>The Transformer Architecture That Makes This Possible</strong></h2><p>Think of Claude Code&#8217;s brain as a massive hotel 
with hundreds of layers and thousands of staff members. Your code checks in at the front desk. Each layer processes it differently. Some staff members are obsessed with syntax and structure. Others focus on semantics and intent. Others track relationships between distant parts of the code.</p><p>This structure is called a transformer, and it&#8217;s genuinely clever. The key insight is something called attention. When processing a particular token in your code, the transformer doesn&#8217;t just look at the immediate neighbors. It can look at any other token and ask, &#8220;Is this relevant to what I&#8217;m thinking about right now?&#8221; Then it calculates relevance scores and weights them accordingly.</p><p>So when Claude Code reads a function call deep in your file, it can simultaneously look backward to the function definition, sideways to similar functions elsewhere, and forward to where the return value gets used. It does this through self-attention mechanisms, which is just math-speak for &#8220;the transformer automatically figured out what matters to look at without being told.&#8221;</p><p>Multiple attention heads run in parallel, each learning to focus on different aspects. One might learn to track data flow. Another tracks control flow. Another tracks type information. Together they build a rich contextual representation of your code.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.diamant-ai.com/subscribe?"><span>Subscribe now</span></a></p><h2><strong>How Claude Code Plans What To Do</strong></h2><p>Now here&#8217;s where it gets interesting for actually using the tool. 
When you ask Claude Code to &#8220;refactor this authentication module,&#8221; something specific happens.</p><p>First, Claude Code doesn&#8217;t immediately start editing. If you&#8217;re doing it right, it reads the code and then generates a plan. This plan is itself text prediction. The model has learned that when humans ask for refactoring, the best next words to generate are something like, &#8220;First I&#8217;ll identify the current authentication patterns. Then I&#8217;ll check for security issues. Then I&#8217;ll modularize the functions.&#8221;</p><p>The model generates this plan using the exact same attention mechanisms that read your code. It&#8217;s essentially drawing on everything it absorbed from the conversations and code repositories it trained on, leaning on examples of similar refactoring requests, and predicting what comes next.</p><p>Here&#8217;s the critical part, and why you should always ask Claude Code to plan before coding. The planning step forces the model to generate intermediate text that breaks the task into manageable chunks. These chunks become easier to execute correctly because they&#8217;re smaller, more specific, and more constrained.</p><h2><strong>How It Finds Things That Need Changing</strong></h2><p>This is where people get genuinely confused. They ask, &#8220;How does Claude Code know that function is inefficient?&#8221; The answer is probabilistic pattern matching against massive datasets.</p><p>The model behind Claude Code has seen enormous amounts of code during training. In that data, inefficient patterns appear more often near certain words, structures, and practices. Efficient patterns appear more often near different ones. When Claude Code reads code, it&#8217;s constantly running statistical comparisons. 
&#8220;Does this code structure cluster closer to inefficient patterns or efficient patterns in my training data?&#8221;</p><p>It&#8217;s the same mechanism that helps Claude Code find bugs. Buggy code patterns differ statistically from correct patterns. Security vulnerabilities have characteristic signatures. Dead code exhibits specific structural properties.</p><p>None of this is real understanding. It&#8217;s sophisticated probability. But here&#8217;s what matters for using Claude Code effectively: that sophistication is genuinely high. The model was trained on millions of real codebases. The patterns are real. The predictions work.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/p/stop-thinking-claude-code-is-magic?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.diamant-ai.com/p/stop-thinking-claude-code-is-magic?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><h2><strong>Why Context Window Size Matters So Much</strong></h2><p>Everything Claude Code does depends on fitting relevant information into its context window. Think of the context window as working memory. The model can only attend to tokens that fit inside this window.</p><p>Here&#8217;s where most people go wrong. They feed Claude Code their entire codebase and expect it to handle everything. But stuffing more into the context window doesn&#8217;t help the model understand better. It can actually make performance worse. One reason is the &#8220;lost in the middle&#8221; problem: information in the middle of a long context tends to get less weight than information at the beginning and end.</p><p>Smart Claude Code usage means being selective about context. 
You give Claude Code exactly the files that matter, structured in a way that maximizes relevance. You use MCP servers to retrieve information dynamically rather than dumping everything at once.</p><h2><strong>What Claude Code Actually Cannot Do</strong></h2><p>Understanding how Claude Code works also clarifies its limitations. It cannot genuinely understand business logic. It can pattern-match the code representing business logic and refactor the syntax, but it doesn&#8217;t know what your application does or why. This is why vague requests fail. You get generic suggestions. Be specific about constraints.</p><p>It also cannot reliably understand architectural decisions. It can refactor code to match existing patterns, but it cannot question whether those patterns are correct. You need humans for that.</p><p>Most importantly, Claude Code cannot verify its own work against requirements it doesn&#8217;t have access to. It can write tests. It can run them. But if your requirements are implicit or undocumented, Claude Code will write code that satisfies the wrong thing.</p><h2><strong>Using This Knowledge Effectively</strong></h2><p>Understanding that Claude Code works through pattern matching changes how you should interact with it. You provide better context by showing it similar patterns from your codebase first. You ask for plans before code. You give it specific constraints rather than abstract goals.</p><p>You treat Claude Code like a tool that has studied millions of lines of code and learned statistical relationships between patterns. Because that&#8217;s exactly what it is.</p><p>The magic isn&#8217;t intelligence. The magic is mathematics applied at scale to a truly enormous dataset. 
And once you understand that, you stop asking Claude Code for wisdom and start asking it for what it&#8217;s actually good at: accelerating patterns you can verify and improving code you can test.</p><p>That&#8217;s how the magic actually works.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading <strong>&#128142;DiamantAI!</strong> I share cutting-edge AI insights, tutorials, and breakthroughs. Subscribe for free to get new posts delivered straight to your inbox, and as a bonus, you&#8217;ll receive a <strong>33% discount coupon</strong> for both of my digital books, <em><a href="https://nirdiamant.gumroad.com/l/rag-made-simple">RAG Made Simple</a></em> and <em><a href="https://nirdiamant.gumroad.com/l/mtxrfk">Prompt Engineering: From Zero to Hero</a></em>. 
Enjoy!</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[You're Using Claude Code Wrong (And Wasting Hours Every Day)]]></title><description><![CDATA[Estimated reading time: 8 minutes]]></description><link>https://newsletter.diamant-ai.com/p/youre-using-claude-code-wrong-and</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/youre-using-claude-code-wrong-and</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Mon, 12 Jan 2026 15:47:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ONPA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af2b6b0-9a84-448f-868d-6d2429bc5373_1024x559.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Estimated reading time: 8 minutes</strong></p><div><hr></div><p><strong>Hiring Update:</strong> I'm looking for a sales partner to help connect GenAI companies with my tutorials and audience. 
If you have tech sales experience and know the AI infrastructure space, <a href="https://www.diamant-ai.com/hiring">check out the role here</a></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GnuW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb3d2d-dfa1-4ff2-bb9e-f0dfd60bedf6_1024x559.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GnuW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb3d2d-dfa1-4ff2-bb9e-f0dfd60bedf6_1024x559.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GnuW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb3d2d-dfa1-4ff2-bb9e-f0dfd60bedf6_1024x559.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GnuW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb3d2d-dfa1-4ff2-bb9e-f0dfd60bedf6_1024x559.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GnuW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb3d2d-dfa1-4ff2-bb9e-f0dfd60bedf6_1024x559.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GnuW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb3d2d-dfa1-4ff2-bb9e-f0dfd60bedf6_1024x559.jpeg" width="1024" height="559" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ddbb3d2d-dfa1-4ff2-bb9e-f0dfd60bedf6_1024x559.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:559,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:124850,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/184320534?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb3d2d-dfa1-4ff2-bb9e-f0dfd60bedf6_1024x559.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!GnuW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb3d2d-dfa1-4ff2-bb9e-f0dfd60bedf6_1024x559.jpeg 424w, https://substackcdn.com/image/fetch/$s_!GnuW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb3d2d-dfa1-4ff2-bb9e-f0dfd60bedf6_1024x559.jpeg 848w, https://substackcdn.com/image/fetch/$s_!GnuW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb3d2d-dfa1-4ff2-bb9e-f0dfd60bedf6_1024x559.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!GnuW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fddbb3d2d-dfa1-4ff2-bb9e-f0dfd60bedf6_1024x559.jpeg 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div><hr></div><p>If you&#8217;re still typing instructions into Claude Code like you&#8217;re asking ChatGPT for help, you&#8217;re missing the entire point. This isn&#8217;t another AI assistant that gives you code snippets to copy and paste. It&#8217;s a different species of tool entirely, and most developers are using maybe 20% of what it can actually do.</p><p>Think of it this way: you wouldn&#8217;t use a smartphone just to make phone calls, right? Yet that&#8217;s exactly what most people do with Claude Code. 
They treat it like a glorified autocomplete engine when it&#8217;s actually a complete development partner that lives in your terminal, understands your entire codebase, and can handle everything from architecture decisions to writing documentation.</p><p>The gap between casual users and power users isn&#8217;t about technical knowledge. It&#8217;s about understanding the workflow, knowing when to intervene, and setting up your environment so Claude delivers production-quality results consistently. This guide will show you how to cross that gap.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ONPA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af2b6b0-9a84-448f-868d-6d2429bc5373_1024x559.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ONPA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af2b6b0-9a84-448f-868d-6d2429bc5373_1024x559.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ONPA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af2b6b0-9a84-448f-868d-6d2429bc5373_1024x559.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ONPA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af2b6b0-9a84-448f-868d-6d2429bc5373_1024x559.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ONPA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af2b6b0-9a84-448f-868d-6d2429bc5373_1024x559.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ONPA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af2b6b0-9a84-448f-868d-6d2429bc5373_1024x559.jpeg" width="1024" height="559" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6af2b6b0-9a84-448f-868d-6d2429bc5373_1024x559.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:559,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:41177,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/184320534?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af2b6b0-9a84-448f-868d-6d2429bc5373_1024x559.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ONPA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af2b6b0-9a84-448f-868d-6d2429bc5373_1024x559.jpeg 424w, https://substackcdn.com/image/fetch/$s_!ONPA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af2b6b0-9a84-448f-868d-6d2429bc5373_1024x559.jpeg 848w, https://substackcdn.com/image/fetch/$s_!ONPA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af2b6b0-9a84-448f-868d-6d2429bc5373_1024x559.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!ONPA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6af2b6b0-9a84-448f-868d-6d2429bc5373_1024x559.jpeg 1456w" 
sizes="100vw"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.diamant-ai.com/subscribe?"><span>Subscribe now</span></a></p><h2>Your Development Partner Lives in the Terminal</h2><p>Picture working with a senior developer who never gets tired, can read thousands of files in seconds, and has instant access to the entire internet. That&#8217;s Claude Code. 
It connects Anthropic&#8217;s AI models directly to your project through the command line. You describe what you need in plain language, and it plans solutions, writes code across multiple files, runs tests, and implements features.</p><p>But here&#8217;s what makes it different from every other coding tool: it actually understands context. Not just syntax highlighting or function signatures. Real context. It reads your project structure, sees your existing patterns, runs your tools, and even fetches information from external sources when needed.</p><p>The catch is this: giving it instructions is a skill. A learnable skill, but a skill nonetheless. The difference between getting mediocre results and getting genuinely useful code comes down to how you communicate and how you structure your workflow.</p><h2>The One Rule That Changes Everything</h2><p>Here&#8217;s where most people go wrong immediately: they start coding right away. It&#8217;s like walking into a contractor&#8217;s office and saying &#8220;start building my house&#8221; without showing blueprints, discussing materials, or even agreeing on what kind of house you want.</p><p>The result? You&#8217;ll get a house. It might even have walls and a roof. But it probably won&#8217;t be what you imagined.</p><p>Always start in plan mode. Before giving any instructions, press Shift+Tab to cycle into plan mode. Tell Claude to explore your codebase first, but specifically tell it not to write anything yet. Let it read the relevant files, understand the architecture, and grasp the bigger picture.</p><p>Once it&#8217;s explored, ask for a proposal. Don&#8217;t ask for only the simplest solution or only the fastest one. Ask it to think through several options, starting with the most straightforward approach. Then discuss that plan like you would with a colleague. Question assumptions. Refine the approach. 
Push back if something seems off.</p><p>Only after you&#8217;re confident it understands the task should you tell it to start coding.</p><p>This feels slower at first. Your instinct will be to just dive in and start building. Resist that instinct. Five minutes of planning saves an hour of fixing broken implementations. Every single time.</p><h2>Precision Beats Brevity Every Time</h2><p>Vague instructions produce vague results. Say &#8220;fix the bug&#8221; and you might get a fix, or you might get a complete rewrite that breaks three other features. You can&#8217;t predict which.</p><p>Instead, be surgical with your instructions. Point to specific files. Reference exact functions. Mention line numbers if you have them. Compare these two approaches:</p><p>&#8220;Fix the authentication issue.&#8221;</p><p>versus</p><p>&#8220;In the login.js file in the auth folder, update the token validation function to handle expired tokens without crashing.&#8221;</p><p>The second version leaves far less room for misinterpretation. It tells Claude exactly where to look and what to do.</p><p>This precision applies to style and patterns too. If you want code that matches your existing codebase, say so explicitly. Point Claude to well-written examples in your project. It can mirror patterns beautifully, but only when you show it the pattern you want.</p><p>Think of it like directing a movie. 
You wouldn&#8217;t tell an actor &#8220;do something emotional.&#8221; You&#8217;d say &#8220;show hesitation, then determination, with a slight smile at the end.&#8221; Same energy here.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.diamant-ai.com/subscribe?"><span>Subscribe now</span></a></p><h2>Your Most Powerful Tool Is the Escape Key</h2><p>Claude works best as a collaborative partner, not an autonomous robot. The Escape key keeps you in control.</p><p>See Claude heading down the wrong path? Hit Escape immediately. This stops it mid-process while keeping all the context intact. You can redirect without losing the work already done. It&#8217;s like tapping someone on the shoulder mid-sentence and saying &#8220;wait, different approach.&#8221;</p><p>Double-tap Escape to jump backward through your conversation history. This lets you edit a previous prompt and explore an alternative direction. You can iterate on the same problem multiple times, trying different solutions until one clicks.</p><p>If Claude makes changes you don&#8217;t like, just tell it to undo them. It can revert files instantly. Combined with regular checkpoints, this means you can experiment fearlessly. The safety net is always there.</p><h2>Understanding the Different Modes</h2><p>Claude Code has multiple modes, and knowing when to use each one separates beginners from experts.</p><p>Plan mode is for thinking, not doing. Use it when starting new features or untangling complex problems. It will architect solutions without touching your files. This is your strategy phase.</p><p>Code mode is for building. Once you have a solid plan, switch to code mode and let it implement. But stay alert. 
Watch what it&#8217;s doing and be ready to course-correct.</p><p>Auto-accept mode removes the approval step for each change. It&#8217;s fantastic for straightforward tasks but dangerous for anything complex or important. For critical work, stay manual and review everything.</p><p>Bash mode lets you run terminal commands and feed the output directly into Claude&#8217;s context. This is debugging gold. Run your tests, capture the failures, and immediately ask Claude to fix them without copying error messages around.</p><p>Each mode has its place. The trick is recognizing which situation calls for which mode.</p><h2>Managing Context Before It Manages You</h2><p>Claude Code&#8217;s biggest weakness is context window limits. As sessions grow longer, it starts forgetting earlier information. Power users have strategies to handle this.</p><p>Use the /compact command regularly. It clears old execution results while keeping the important conversation history. Think of it like cleaning your desk: you keep the critical documents but toss the scrap paper.</p><p>For complex projects, create a CLAUDE.md file in your project root. This becomes permanent memory. Put your project overview, architecture decisions, coding standards, and common patterns there. Claude reads it automatically and uses it as context for every task. It&#8217;s like giving every session a primer on how your project works.</p><p>For massive tasks, use a checklist file. Create a markdown document with all the steps needed to complete the task. Tell Claude to use it as a scratchpad, checking off items as it progresses. This keeps the main conversation clean while giving Claude a progress tracker.</p><h2>Divide Complex Work with Subagents</h2><p>When facing a genuinely complex problem, break it apart and assign pieces to different subagents. Tell Claude to spin up a subagent for the backend API while the main agent handles the frontend. 
Or have one subagent research documentation while another writes implementation code.</p><p>You can even mention subagents directly with the @ symbol to guarantee they activate. You can also specify which model each subagent should use. Opus 4 handles complex planning and architecture. Haiku 3.5 knocks out simple, fast tasks.</p><p>This approach tackles problems in parallel and keeps context focused. Each subagent deals with one slice of the problem without getting overwhelmed by the full complexity. It&#8217;s like having multiple specialists working on different parts of a project simultaneously.</p><h2>Show, Don&#8217;t Tell</h2><p>Claude Code can interpret visual information. Drag screenshots directly into your terminal. Show it UI mockups, error messages, or architecture diagrams. It will understand the visual context and use it to guide implementation.</p><p>This is especially powerful for debugging interface issues. Instead of describing what&#8217;s wrong with your layout, just show a screenshot. For replicating designs, provide the mockup and let Claude figure out the implementation details.</p><p>Visual context often communicates more than words ever could. A single screenshot can replace three paragraphs of explanation. Use this liberally.</p><h2>Automate Everything, Then Automate the Automation</h2><p>Claude Code excels at repetitive tasks. But power users go further: they automate the automation itself.</p><p>Set up custom slash commands for tasks you repeat constantly. Create a command that loads your project context, runs your test suite, and generates documentation in one go.</p><p>Use hooks to trigger actions automatically. Run tests after every code change. Lint before commits. Update documentation when finishing features. These small automations compound into massive time savings.</p><p>For data processing pipelines, integrate Claude directly into your workflow. 
Pipe data in, let it transform or analyze the data, and pipe the output to the next step. This turns Claude into a powerful processing node in your toolchain.</p><h2>Extended Thinking for Complex Problems</h2><p>For genuinely difficult problems, ask for extended thinking by including trigger words like &#8220;think&#8221;, &#8220;think hard&#8221;, or &#8220;ultrathink&#8221; in your prompt. These are keywords in the prompt rather than slash commands, and they increase Claude&#8217;s reasoning budget, giving it more time to work through complicated challenges.</p><p>Yes, it takes longer. But the quality difference is dramatic for debugging, architecture planning, and design decisions. It&#8217;s the difference between asking for a quick answer and asking someone to really think a problem through.</p><p>The ultrathink keyword is particularly powerful. It requests the largest thinking budget, perfect for architectural decisions or bugs that have stumped you for hours. Use it sparingly, but when you need it, you really need it.</p><h2>Git Workflows That Keep You Safe</h2><p>Never work directly on your main branch with Claude Code. Always create a feature branch first. This gives you a safe sandbox to experiment in.</p><p>Even better, use Git worktrees. This lets you maintain multiple working directories for different branches, so you can have Claude working on several features in parallel without interference.</p><p>When Claude finishes a task, have it commit changes with a clear message explaining what was done. Then review the commit diff carefully before merging.
This workflow gives you the safety of version control while letting Claude work autonomously.</p><h2>Embed Your Standards in Documentation</h2><p>Instead of reminding Claude about coding standards in every conversation, embed them directly in documentation files. Create a QUALITY.md file with your coding standards, testing requirements, and review checklist, and reference it from your CLAUDE.md.</p><p>Once it is referenced there, Claude loads it with the rest of the project context and follows your standards without being reminded. It becomes part of the project context, like a senior developer who knows the house rules and follows them instinctively.</p><p>For teams, this keeps Claude Code sessions consistent. Everyone gets the same quality bar, regardless of who&#8217;s running the tool.</p><h2>The MCP Revolution</h2><p>Model Context Protocol servers extend Claude Code&#8217;s capabilities dramatically. Connect it to your Slack, Figma, Google Drive, or custom data sources. This transforms Claude from a code assistant into a genuine team member that can pull information from the tools you connect.</p><p>Need to check the latest design mockup? Claude fetches it from Figma. Need to understand a business requirement? It pulls it from Slack.</p><p>Set up MCP servers for your most-used tools. The initial setup takes time, but the payoff is enormous. Claude becomes far more capable when it can access your actual data sources.</p><h2>Debugging Strategy</h2><p>Claude Code is exceptional at debugging when you give it proper information. When you hit a bug, don&#8217;t just paste the error message. Use bash mode to run your tests and feed the full output to Claude.
Tell it to analyze the stack trace, read the relevant files, and propose a fix.</p><p>For intermittent bugs, run the failing code multiple times and give Claude all the outputs. It can spot patterns in failures that humans miss.</p><p>If bugs involve external services, use Claude to fetch relevant documentation or logs. It can correlate error messages with API documentation to pinpoint exactly what&#8217;s wrong.</p><h2>Self-Writing Documentation</h2><p>One of Claude Code&#8217;s most underrated features is documentation generation. After finishing a feature, tell Claude to update the README, API docs, and changelog. It has full context of what was just built, so it writes accurate, comprehensive documentation without requiring explanation.</p><p>This is especially powerful after refactors, where documentation typically gets forgotten. Set up a hook to automatically generate documentation after every feature merge. Your docs will stay synchronized with your code effortlessly.</p><h2>Managing Token Usage in Long Sessions</h2><p>Long Claude Code sessions can get expensive as context grows. Smart users manage this proactively.</p><p>Break large tasks into smaller chunks. Complete one chunk, commit it, then start a fresh session for the next chunk. This keeps context size manageable and costs reasonable.</p><p>Use prompt caching for information that doesn&#8217;t change often. Load your project overview and standards once, then reference them in subsequent sessions. This dramatically reduces token usage.</p><p>For repetitive tasks across many files, use a script to process them in batches rather than one giant session. This parallel approach is both faster and more cost-effective.</p><h2>The Checklist Method for Large Migrations</h2><p>For migrations, massive refactors, or fixing hundreds of lint errors, the checklist method is unbeatable.</p><p>Create a markdown file listing every task that needs completion. 
Tell Claude to use this as its working document, checking off items as it completes them and adding notes about any issues.</p><p>This approach does two crucial things. First, it gives Claude a clear roadmap, preventing it from getting lost in complexity. Second, it lets you track progress and see exactly what&#8217;s been done.</p><p>For truly large codebases, break the checklist into sections and tackle them in separate sessions. This keeps each session focused and productive.</p><h2>Accelerating Learning and Onboarding</h2><p>Claude Code is an incredible learning tool. New team members can ask it to explain the codebase, trace through execution flows, and understand architecture decisions.</p><p>Have newcomers ask Claude to map out the project structure and identify key components. Then they can ask specific questions about how things work. This accelerates onboarding from weeks to days.</p><p>For existing team members exploring unfamiliar parts of the codebase, Claude provides guided tours. Ask it to explain the authentication flow or the data pipeline, and it will trace through the code, explaining each piece clearly.</p><h2>Beyond the Code</h2><p>Claude Code can do much more than write software. Use it for research tasks, like reading documentation and creating summaries for future reference. It can analyze logs, process data files, and generate reports.</p><p>Need to understand a new API? Have Claude read the documentation and create a usage guide. Working with a large CSV file? Pipe it into Claude and ask for analysis.</p><p>These non-coding tasks often consume huge amounts of developer time. Claude can handle them while you focus on the creative problem-solving that actually requires human intelligence.</p><h2>Avoiding Common Traps</h2><p>Even experienced users make mistakes. Here are the most frequent ones and how to sidestep them.</p><p>Trusting auto-accept mode for complex tasks is dangerous. 
Auto-accept is convenient but risky for anything affecting core functionality. Always review changes manually for important work.</p><p>Letting sessions run too long accumulates context and makes everything slower and more expensive. Refresh your session regularly, especially after completing major milestones.</p><p>Not using version control is asking for trouble. Always use branches, and always review diffs before merging.</p><p>Being too vague leads to assumptions. Those assumptions might not match your intent. Take time to be precise.</p><p>Ignoring the plan phase might feel faster, but it leads to rework. The few minutes spent planning save hours of fixing wrong implementations.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading <strong>&#128142;DiamantAI!</strong> I share cutting-edge AI insights, tutorials, and breakthroughs. Subscribe for free to get new posts delivered straight to your inbox, and as a bonus, you&#8217;ll receive a <strong>33% discount coupon</strong> for both of my digital books, <em><a href="https://nirdiamant.gumroad.com/l/rag-made-simple">RAG Made Simple</a></em> and <em><a href="https://nirdiamant.gumroad.com/l/mtxrfk">Prompt Engineering: From Zero to Hero</a></em>. 
Enjoy!</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Why AI Agents Need to Check Their Own Work]]></title><description><![CDATA[Quick announcements before we dive in:]]></description><link>https://newsletter.diamant-ai.com/p/why-ai-agents-need-to-check-their</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/why-ai-agents-need-to-check-their</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Mon, 01 Dec 2025 00:01:34 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!ImDF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58265f29-f15a-45a4-be2c-13ae974b902b_1024x1536.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><strong>Quick announcements before we dive in:</strong></p><p>I&#8217;m heading to AWS re:Invent in Las Vegas this week! If you&#8217;re attending, I&#8217;d love to connect in person. Feel free to shoot me a message on <a href="https://www.linkedin.com/in/nir-diamant-ai/">LinkedIn</a> and let&#8217;s meet up.</p><p>Also, for Black Friday I&#8217;m offering special discounts on sponsorships and collaborations through December 6th. If you&#8217;re interested in partnering, check out the <a href="https://www.diamant-ai.com/sponsorship">current options</a> and reach out.</p><div><hr></div><p>Picture yourself baking a cake from a new recipe. You mix the ingredients, slide the pan into the oven, and set a timer. What happens next separates amateur bakers from experienced ones. An amateur walks away and waits for the timer. 
An experienced baker peeks through the oven door, tests with a toothpick, and adjusts the temperature if things look off. That simple act of checking and adjusting is a feedback loop. It&#8217;s how we make sure a process stays on track.</p><p>Now imagine an AI agent trying to schedule your meetings, research a complex topic, or plan your weekend trip. Without the ability to check its own work and adjust course, that agent is like the amateur baker who just sets a timer and hopes for the best. Sometimes it works. Often it doesn&#8217;t.</p><p>This is where semantic control loops change everything. They give AI agents the ability to peek through the oven door, so to speak. They transform aimless assistants into focused problem-solvers that can adapt when reality doesn&#8217;t match the plan. Let&#8217;s explore how this works and why it matters.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ImDF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58265f29-f15a-45a4-be2c-13ae974b902b_1024x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ImDF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58265f29-f15a-45a4-be2c-13ae974b902b_1024x1536.png 424w, 
https://substackcdn.com/image/fetch/$s_!ImDF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58265f29-f15a-45a4-be2c-13ae974b902b_1024x1536.png 848w, https://substackcdn.com/image/fetch/$s_!ImDF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58265f29-f15a-45a4-be2c-13ae974b902b_1024x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!ImDF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58265f29-f15a-45a4-be2c-13ae974b902b_1024x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ImDF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58265f29-f15a-45a4-be2c-13ae974b902b_1024x1536.png" width="1024" height="1536" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/58265f29-f15a-45a4-be2c-13ae974b902b_1024x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1536,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3440370,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/180320914?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58265f29-f15a-45a4-be2c-13ae974b902b_1024x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!ImDF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58265f29-f15a-45a4-be2c-13ae974b902b_1024x1536.png 424w, https://substackcdn.com/image/fetch/$s_!ImDF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58265f29-f15a-45a4-be2c-13ae974b902b_1024x1536.png 848w, https://substackcdn.com/image/fetch/$s_!ImDF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58265f29-f15a-45a4-be2c-13ae974b902b_1024x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!ImDF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F58265f29-f15a-45a4-be2c-13ae974b902b_1024x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<h2>Feedback Loops in Everyday Life</h2><p>Consider your home thermostat. You set it to 22&#176;C. The thermostat measures the actual temperature. Too cold? Heat turns on. Too warm? Heat turns off. This happens constantly, creating a continuous cycle of measuring, comparing, and adjusting.</p><p>This simple device embodies a powerful idea called a feedback loop. It senses the current state, compares it to the desired state, and acts to close the gap. The thermostat doesn&#8217;t understand why you want 22&#176;C or care about your comfort. It just knows how to keep adjusting until the numbers match.</p><p>Feedback loops appear everywhere once you start looking. Autopilot systems keep aircraft level. Cruise control maintains your car&#8217;s speed. Your inner ear helps you balance without falling. Each system follows the same pattern: sense, compare, adjust, repeat.</p><p>But here&#8217;s the thing. All these examples deal with numbers. Temperature. Speed. Altitude. Balance. What happens when the goal isn&#8217;t a number at all?</p><h2>Beyond Numbers</h2><p>Say you ask an AI assistant to schedule a meeting with Bob and Carol for next week. This task has layers of complexity that a simple thermostat could never handle. The AI needs to understand who Bob and Carol are, check multiple calendars, find a time that works for everyone, send invites, and confirm attendance.</p><p>Success isn&#8217;t hitting a target number. Success means everyone can actually attend. That requires understanding the meaning behind the task, not just matching signals.</p><p>A human assistant would instinctively notice conflicts and adjust.
They understand what &#8220;successfully scheduling a meeting&#8221; actually means beyond the mechanical steps. The question becomes: how can an AI develop that same intuition?</p><p>This is where feedback loops evolve from mechanical to meaningful.</p><h2>Understanding Meaning</h2><p>Semantic control loops describe feedback that goes beyond numbers. The loop checks whether actions are fulfilling the intended goal, not just matching a measurement.</p><p>For a thermostat, both the goal and the feedback are numbers. Current temperature versus target temperature. Simple math.</p><p>For an AI scheduling meetings, the goal involves understanding. Did everyone confirm? Are there conflicts? Does the timing make sense? The feedback isn&#8217;t a number. It&#8217;s about whether reality matches the intended outcome.</p><p>This means the agent must grasp what success looks like. A confirmed meeting isn&#8217;t just sent invites. It&#8217;s everyone saying yes. Any gap between intention and reality becomes a signal to adjust strategy. Maybe propose a different time. Send a reminder. Ask about availability first.</p><h2>Continuous Adaptation</h2><p>Modern AI agents operate in a continuous loop. Think about what to do. Act on that plan. Check the results. If the outcome isn&#8217;t right yet, loop back and try something different based on what was learned.</p><p>This differs fundamentally from older approaches. Early software followed rigid scripts. Step one, step two, step three. Like following a recipe with no room for adjustment. These open-loop systems had no mechanism to incorporate feedback during execution. If something unexpected happened, they either failed or got stuck.</p><p>Imagine a cleaning robot following a fixed script. It always takes the same path through your home, no matter what. New furniture in the way? It gets stuck. Unexpected spill? It drives right past. 
The robot executes its program without reflection.</p><p>Now imagine a smarter robot with a semantic feedback loop. It senses obstacles and adapts its path. It notices the spill and pauses to clean it. The robot&#8217;s true goal isn&#8217;t following a specific path. It&#8217;s cleaning your home. The feedback loop keeps that goal in focus and allows flexibility in achieving it.</p><p>The first robot is like an actor reciting memorized lines. The second is like an improviser who stays true to the story while adapting to what happens on stage.</p><h2>Self-Correction Matters</h2><p>Without self-checking feedback, AI agents easily drift off course, get trapped in loops, or confidently produce answers that miss the point entirely.</p><p>Think about an AI researching a topic and writing a report. Without a feedback loop, it might grab the first information it finds, draft an answer, and call it done. The report exists, so mission accomplished, right?</p><p>With a semantic control loop, the agent reads its own draft. It realizes key points are missing or irrelevant information crept in. So it loops back. Gather more data. Reorganize the content. Check again. Keep refining until the report actually answers the question well.</p><p>This isn&#8217;t just working faster. It&#8217;s working smarter. The feedback guides each step, turning a mechanical process into something that resembles understanding.</p><h2>Planning a Mountain Trip</h2><p>Let&#8217;s walk through a real example. You ask an AI assistant to plan a weekend mountain trip.
This task has many moving pieces: choosing a destination, booking travel and lodging, checking weather, suggesting activities. You expect a coherent plan at the end.</p><p>The assistant starts by finding a popular mountain town and some hotels. Then it checks availability and discovers every hotel is fully booked. Without a feedback loop, the assistant might just present this broken plan. Hotels exist in the database, task complete.</p><p>But with a semantic control loop, the assistant notices the gap between intention and reality. The goal isn&#8217;t listing hotels. It&#8217;s creating a workable trip. So the assistant adapts. Maybe it tries a different town or different dates. Perhaps it looks at cabin rentals instead of hotels. It keeps proposing solutions and evaluating them, checking each time whether all the pieces fit together.</p><p>This cycle continues until everything aligns. Destination, lodging, transportation, weather-appropriate activities. You receive a full itinerary that actually works because the AI kept adjusting whenever reality didn&#8217;t match the goal.</p><h2>Four Key Benefits</h2><p>Semantic control loops make AI agents robust, reliable, and genuinely useful. When surprises happen, the agent adapts instead of breaking. It stays focused on what you actually want rather than veering into irrelevant territory. You can follow its reasoning step by step, making the process transparent and debuggable. Each feedback cycle teaches the agent something new, building competence over time.</p>]]></content:encoded></item><item><title><![CDATA[Context Engineering - How AI Turns Email Chaos into Searchable Intelligence]]></title><description><![CDATA[Your inbox holds thousands of conversations.]]></description><link>https://newsletter.diamant-ai.com/p/context-engineering-how-ai-turns</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/context-engineering-how-ai-turns</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Mon, 03 Nov 2025 13:02:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!nmQV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4660b1a1-6b97-46d7-98da-6a0c83a6d6ce_1880x974.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Your inbox holds thousands of conversations. Buried somewhere in those threads is the contract your client mentioned last week, the deployment timeline your team debated last month, and the budget decision that happened across three separate email chains. When you need to find something, you don&#8217;t just search for keywords. You remember context: who said it, when it happened, what else was going on at the time.</p><p>This is what separates finding an email from understanding your email.
Traditional search gives you messages containing certain words. Intelligence gives you meaning, relationships, and answers synthesized from the full story of your communications.</p><p>Most email tools are fancy filing cabinets. You can search for keywords, filter by sender or date, and organize with labels. But they can&#8217;t read context across threads, understand what decisions were made, or connect a customer complaint from January with the product fix discussed in March. Your email contains valuable intelligence, but until recently, no system could extract it.</p><p>A new generation of tools treats your inbox as a living knowledge base. These systems read full conversations like humans do, understand relationships between threads, extract decisions and tasks, and answer questions by pulling together information from dozens of emails.</p><p>I recently joined <a href="https://www.igpt.ai/?utm_source=nir_diamant">iGPT</a> as a partner, and the technology behind it is fascinating. The system uses sophisticated context engineering to turn email and workplace data into structured intelligence. 
They&#8217;re currently accepting people to their <a href="https://www.igpt.ai/?utm_source=nir_diamant">waiting list</a> if you want to try it.</p><p>What follows is a technical deep dive into iGPT&#8217;s algorithm and how it actually works under the hood.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nmQV!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4660b1a1-6b97-46d7-98da-6a0c83a6d6ce_1880x974.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nmQV!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4660b1a1-6b97-46d7-98da-6a0c83a6d6ce_1880x974.png 424w, https://substackcdn.com/image/fetch/$s_!nmQV!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4660b1a1-6b97-46d7-98da-6a0c83a6d6ce_1880x974.png 848w, https://substackcdn.com/image/fetch/$s_!nmQV!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4660b1a1-6b97-46d7-98da-6a0c83a6d6ce_1880x974.png 1272w, https://substackcdn.com/image/fetch/$s_!nmQV!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4660b1a1-6b97-46d7-98da-6a0c83a6d6ce_1880x974.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nmQV!,w_2400,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4660b1a1-6b97-46d7-98da-6a0c83a6d6ce_1880x974.png" width="1200" height="621.4285714285714" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4660b1a1-6b97-46d7-98da-6a0c83a6d6ce_1880x974.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:false,&quot;imageSize&quot;:&quot;large&quot;,&quot;height&quot;:754,&quot;width&quot;:1456,&quot;resizeWidth&quot;:1200,&quot;bytes&quot;:58309,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/177798990?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4660b1a1-6b97-46d7-98da-6a0c83a6d6ce_1880x974.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:&quot;center&quot;,&quot;offset&quot;:false}" class="sizing-large" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nmQV!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4660b1a1-6b97-46d7-98da-6a0c83a6d6ce_1880x974.png 424w, https://substackcdn.com/image/fetch/$s_!nmQV!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4660b1a1-6b97-46d7-98da-6a0c83a6d6ce_1880x974.png 848w, https://substackcdn.com/image/fetch/$s_!nmQV!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4660b1a1-6b97-46d7-98da-6a0c83a6d6ce_1880x974.png 1272w, https://substackcdn.com/image/fetch/$s_!nmQV!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4660b1a1-6b97-46d7-98da-6a0c83a6d6ce_1880x974.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2>Syncing Millions of Messages Without Chaos</h2><p>Imagine a moving company transferring everything from your old apartment. They can&#8217;t just grab random boxes. They need a strategy: start with essentials you&#8217;ll need tonight, then work systematically through everything else, while staying alert for new items you&#8217;re still packing.</p><p>This is exactly what happens when you first connect your email account to iGPT. The system might need to sync millions of messages, and it has to be smart about priorities.</p><p>The sync runs in two directions simultaneously. It fetches existing emails starting from the newest and working backward, since recent conversations matter most. But while this historical sync happens, new emails keep arriving. 
These fresh messages jump to the front of the line, getting processed within seconds so the system stays current.</p><p>A typical email without attachments becomes searchable in about one second. Complex emails with PDF attachments that need text extraction might take 20 seconds, but this happens in the background without slowing anything down. The system works with all major email providers through their APIs or the universal IMAP protocol, pulling everything: email bodies, metadata, and every attachment.</p><h2>Extracting Clean Content from Messy Email</h2><p>Raw email is incredibly messy. HTML newsletters have complex formatting. Reply chains quote the same text five levels deep. Email signatures bloat every message. Threading conventions vary by email client.</p><p>The trickiest challenge is handling email threads. When you reply to an email, your client typically includes the entire previous conversation below your response. In a ten-message thread, the final email contains all nine previous messages nested inside it. A basic system would index each reply as a separate document, storing the same text again and again.</p><p>The solution is clever. Because sync starts with the newest emails, the system often sees a recent reply that quotes several older messages before encountering those originals. It processes what it has but marks the quoted sections. Then, as sync continues backward through time and reaches the original emails, it revisits the newer messages and strips out duplicates. This iterative cleaning continues until every message in the thread appears exactly once.</p><p>HTML emails get converted to clean Markdown format, preserving structure like headers and lists while removing styling clutter. Newsletters go through an algorithm that separates the actual article content from navigation menus and footers, similar to browser reading modes. 
The system ignores spam and trash automatically but keeps newsletters since they might contain work-relevant information.</p><div class="native-video-embed" data-component-name="VideoPlaceholder" data-attrs="{&quot;mediaUploadId&quot;:&quot;2d44fa1a-6e47-4030-a14a-7a3308b51d19&quot;,&quot;duration&quot;:null}"></div><h2>How Hybrid Search Finds What You Actually Mean</h2><p>You ask: &#8220;What did my team decide about the API redesign?&#8221; The keywords in this question are not enough to find the answer. &#8220;API redesign&#8221; might appear in subject lines, but the actual decision could be buried in a message saying &#8220;Let&#8217;s go with Option B&#8221; without ever repeating those exact terms.</p><p>Traditional keyword search falls short here. You need a system that understands meaning, not just matches words.</p><p>The approach uses hybrid search, combining three complementary methods. Full-text search finds emails that literally contain your search terms. This catches exact matches and runs incredibly fast.</p><p>Semantic search operates deeper. The system converts all emails into numerical representations that capture meaning. When you search, your question gets the same treatment, and the system finds emails that are semantically similar even when they use different words. A search for &#8220;API redesign decision&#8221; might surface an email discussing &#8220;endpoint architecture consensus&#8221; because the two phrases end up with similar representations.</p><p>Filter-based search handles structured queries about dates, senders, or metadata. &#8220;What did Sarah say about this last month&#8221; becomes filters that narrow results before semantic matching begins.</p><p>Each method produces candidates with confidence scores. The system merges these scores into a single ranking. An email matching both semantic meaning and keywords scores higher than one matching only semantically. 
Then comes reranking, where a specialized model reassesses all candidates in context of your specific question, boosting truly relevant results and filtering false positives.</p><h2>Assembling Context That Makes Sense</h2><p>Finding relevant emails is only half the work. Now the system must construct context that helps a language model actually answer your question.</p><p>Think of a museum curator building an exhibition. They have thousands of artifacts in storage but carefully select specific pieces, arrange them meaningfully, and provide labels that help visitors understand what they&#8217;re seeing.</p><p>The system reconstructs retrieved emails into a coherent narrative. It includes essential metadata: who sent each message, when, and the subject. It organizes messages chronologically when showing a thread&#8217;s evolution, or thematically when multiple threads discuss the same topic. For long threads, it identifies which parts matter most to your question rather than including everything.</p><p>This assembled context gets structured so the language model can cite sources properly. Each piece of information is tagged with its origin. When the model generates an answer, it points back to specific emails. You&#8217;re one click away from verifying any claim or diving deeper into the original conversation.</p><p>The system manages token limits dynamically. If retrieval pulls dozens of relevant emails that would exceed the language model&#8217;s capacity, it summarizes less critical portions while preserving the most relevant details in full. This balancing act considers technical limits, cost, and speed.</p><p>Achieving reliable citations required serious engineering effort. The system must track which parts of its answer came from which sources even as the language model synthesizes information from multiple emails. 
Getting this right means every claim can be verified instantly, transforming the experience from &#8220;the AI told me something&#8221; to &#8220;here&#8217;s what actually happened, with proof.&#8221;</p><h2>Privacy, Security, and Speed at Scale</h2><p>All email data is encrypted at rest with per-user and per-message encryption keys. The system never uses your data for model training. When you disconnect your account, everything is deleted within 24 hours, once the system has confirmed the disconnect wasn&#8217;t just a temporary authentication problem.</p><p>Despite all this processing (retrieval, ranking, context assembly, and generation), the system targets about three seconds from question to answer.</p><p>The initial retrieval step, which searches through millions of emails, completes in under 100 milliseconds. This speed comes from careful indexing and optimization. Recency-based caching keeps frequently accessed recent emails hot in memory. Parallel processing means multiple steps happen simultaneously rather than sequentially.</p><p>The system constantly evaluates performance, switching between different models and approaches when something faster or better becomes available. It can even run multiple embedding models in parallel during transitions so service never gets disrupted during upgrades.</p><p>Users with millions of emails see no performance degradation. The architecture scales because each step is optimized independently: fast retrieval, efficient reranking, smart context assembly, and careful token management.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading <strong>&#128142;DiamantAI!</strong> I share cutting-edge AI insights, tutorials, and breakthroughs. 
Subscribe for free to get new posts delivered straight to your inbox, and as a bonus, you&#8217;ll receive a <strong>33% discount coupon</strong> for both of my digital books, <em><a href="https://nirdiamant.gumroad.com/l/rag-made-simple">RAG Made Simple</a></em> and <em><a href="https://nirdiamant.gumroad.com/l/mtxrfk">Prompt Engineering: From Zero to Hero</a></em>. Enjoy!</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[How to Stop AI Hallucinations]]></title><description><![CDATA[10 Proven Techniques That Actually Work]]></description><link>https://newsletter.diamant-ai.com/p/how-to-stop-ai-hallucinations</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/how-to-stop-ai-hallucinations</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Tue, 07 Oct 2025 12:03:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!i5mS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ec8339b-be20-4bdc-9abd-113f247166ed_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Picture a confident storyteller who never admits uncertainty. Ask them about anything, and they&#8217;ll give you an answer that sounds completely plausible. The problem? Sometimes they&#8217;re just filling gaps with pure invention.</p><p>This is what happens when AI language models hallucinate. They generate text that sounds authoritative but has no connection to reality. An AI confidently invented fake legal cases for a lawyer, leading to courtroom disaster. A search chatbot made up telescope discoveries in front of the world. 
In customer service, medical advice, or legal assistance, these fabrications cause real harm.</p><p>The AI doesn&#8217;t lie with malice. It simply doesn&#8217;t know the difference between what it learned during training and what it&#8217;s creating on the spot to complete a pattern. Modern language models predict the next most likely word based on patterns. When they encounter gaps in knowledge, they don&#8217;t pause or admit uncertainty. They keep predicting words that sound right, creating fiction that feels like fact.</p><p>Fortunately, researchers and developers have discovered practical ways to keep AI grounded in truth. These strategies range from simple adjustments anyone can make to sophisticated training techniques. Let&#8217;s explore how to turn an imaginative storyteller into a reliable assistant.</p><div><hr></div><p><strong>Sponsored:</strong> Speaking of reliable AI, <a href="https://www.parlant.io/?utm_source=Nir&amp;utm_medium=Newletter">Parlant</a> is an AI agent framework designed to make your agents follow instructions consistently. Instead of wrestling with unpredictable behavior through complex prompts, Parlant lets you define behavioral guidelines in natural language that your agents actually follow. 
Whether you&#8217;re building customer service bots or domain-specific assistants, it helps you create predictable, rule-following agents without constant debugging.</p><div><hr></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!i5mS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ec8339b-be20-4bdc-9abd-113f247166ed_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!i5mS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ec8339b-be20-4bdc-9abd-113f247166ed_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!i5mS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ec8339b-be20-4bdc-9abd-113f247166ed_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!i5mS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ec8339b-be20-4bdc-9abd-113f247166ed_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!i5mS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ec8339b-be20-4bdc-9abd-113f247166ed_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!i5mS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ec8339b-be20-4bdc-9abd-113f247166ed_1536x1024.png" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7ec8339b-be20-4bdc-9abd-113f247166ed_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2487344,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/175470721?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ec8339b-be20-4bdc-9abd-113f247166ed_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!i5mS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ec8339b-be20-4bdc-9abd-113f247166ed_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!i5mS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ec8339b-be20-4bdc-9abd-113f247166ed_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!i5mS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ec8339b-be20-4bdc-9abd-113f247166ed_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!i5mS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7ec8339b-be20-4bdc-9abd-113f247166ed_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><h2>1. Choose Advanced Models</h2><p>Not all AI models are created equal. Newer, more advanced models typically hallucinate less because they&#8217;ve been trained on better datasets and refined with improved methods. Think of it like consulting a seasoned expert versus a novice. The expert is more likely to know the facts or admit when they don&#8217;t.</p><p>A model from 2024 will generally produce more accurate, consistent answers than its 2022 counterpart. The difference isn&#8217;t subtle. You can prevent many hallucinations simply by selecting a model known for factual accuracy. Always evaluate different models on your task. You might find a noticeable drop in fabricated answers by upgrading to one with better training.</p><h2>2. 
Write Clear Instructions</h2><p>AI systems are remarkably sensitive to how you phrase requests. The same model can behave completely differently depending on your guidance. Explicit instructions act like guidelines, narrowing behavior and setting expectations.</p><p>Tell the AI: &#8220;Answer only with verified information. If you&#8217;re not sure, say you don&#8217;t know.&#8221; This simple instruction can dramatically change behavior. Instead of cheerfully inventing an answer to fill silence, the model might admit uncertainty or ask for clarification. It&#8217;s like telling a student that saying &#8220;I don&#8217;t know&#8221; is better than guessing.</p><p>This doesn&#8217;t work perfectly every time. Language models can still drift from instructions. But explicit prompts about accuracy requirements give the AI less room to improvise incorrectly.</p><h2>3. Use Step-by-Step Reasoning</h2><p>Remember math class? Teachers insisted you show your work, not just the final answer. Working through steps reveals whether you truly understand the problem or just got lucky with a guess.</p><p>The same principle applies to AI. When models jump straight to answers without reasoning through problems, they often make logical leaps that lead to nonsense. The solution is chain-of-thought prompting: asking AI to think out loud.</p><p>Instead of demanding an immediate answer, guide the model: &#8220;Let&#8217;s solve this step by step.&#8221; The AI then breaks down the problem, explains intermediate thinking, and builds toward a conclusion. You can even build your own logic breakdown, prescribing the exact process the model should follow. For example: &#8220;First, identify the key variables. Second, check what information is missing. Third, calculate each component separately. Finally, combine the results.&#8221;</p><p>For more control, you can implement this logic in code as a state graph. Each node represents a reasoning step, and edges define the flow between steps. 
The AI executes one step at a time, and your code determines what happens next based on the output. This structured approach forces consistency and self-checking along the way. For tasks involving calculations, multi-step logic, or complex reasoning, this dramatically reduces errors.</p><h2>4. Provide Examples</h2><p>Show the AI examples of correct behavior through few-shot prompting. Include a few sample interactions demonstrating accurate, factual responses in your prompt, and the model will mimic that style. If your examples occasionally say &#8220;I don&#8217;t know,&#8221; the AI learns that admitting uncertainty is acceptable.</p><p>It&#8217;s like giving an apprentice solved problems as guides before asking them to tackle new ones. The model follows the patterns you demonstrate. Show it high-quality examples of not inventing information, and it becomes less likely to fabricate answers. Make your examples relevant to the task and demonstrate only the behavior you want to encourage.</p><h2>5. Ground with Real Data</h2><p>AI models work from memory. They generate text based on patterns learned during training, which ended at some fixed point in the past. They don&#8217;t know what happened yesterday, and their knowledge of even older events might be imperfect.</p><p>The most powerful solution is Retrieval-Augmented Generation. Your system fetches relevant information from external sources like databases, documentation, or web searches, then provides those details to the model as context. 
The AI bases its answer on supplied information rather than potentially faulty memory.</p><p>Think of this as switching from a closed-book exam to an open-book one. Imagine someone asks about your company&#8217;s return policy. Instead of having the AI guess based on vague training data, your system retrieves the actual policy document and feeds it into the prompt. It&#8217;s much harder to hallucinate a fake policy when the real one is sitting right there.</p><p>This dramatically improves accuracy. Customer service bots, legal assistants, and medical advisors increasingly use this strategy. The result is output that users can verify against source material.</p><h2>6. Lower the Temperature</h2><p>Language models have sampling parameters that control how adventurous their word choices become. The best known is temperature, which sets how much randomness goes into choosing each next word. High temperature gives less likely words a better chance, which encourages variety, dramatic flourishes, and exploration. Low temperature keeps the model close to its most likely continuation.</p><p>For tasks requiring accuracy, turning down the temperature helps. At lower settings, the model becomes more conservative and focused. It picks the most likely, straightforward next word rather than exploring fanciful possibilities. Responses may be plainer, but that&#8217;s usually preferable when truthfulness matters more than entertainment.</p><p>This isn&#8217;t about suppressing capabilities. It&#8217;s about matching the tool to the task. For creative writing or brainstorming, higher temperature works beautifully. For answering factual questions or generating documentation, dial it down.</p><h2>7. 
Implement Self-Checks</h2><p>After the AI generates an answer, ask it to verify: &#8220;Are you sure? Can you double-check that information?&#8221; You can even have it generate multiple independent answers to the same question and compare them. If all answers agree, confidence increases. If they diverge, that&#8217;s a warning sign.</p><p>This resembles having several people solve the same problem independently, then comparing solutions. Discrepancies reveal potential issues. Some systems automate this process, using the model&#8217;s own uncertainty or internal disagreement to flag suspicious outputs. It&#8217;s like proofreading an essay. A second read spots made-up facts or inconsistencies that the first draft contained.</p><h2>8. Add External Verification</h2><p>Instead of trusting the AI&#8217;s self-assessment, check facts against trusted databases. If the model outputs a specific statistic, your system can automatically verify it through an API or secondary source. When verification fails, flag the response, correct it, or prompt the model to try again.</p><p>This works like an editor checking a journalist&#8217;s citations before publication. In high-stakes domains like medicine or law, such guardrails become essential. They ensure questionable claims get caught and corrected rather than reaching users unchecked.</p><p>Rule-based frameworks can enforce boundaries too. Define what the AI is and isn&#8217;t allowed to do. Require source attribution for certain claims. Prevent responses on topics outside the model&#8217;s expertise. These constraints act as safety nets, intervening when the AI starts straying.</p><h2>9. Fine-Tune on Your Domain</h2><p>Sometimes the solution is making the model itself more knowledgeable. Fine-tuning takes a general-purpose language model and trains it further on curated data from your specific domain.</p><p>Building a medical chatbot? Fine-tune on verified medical literature and documentation. 
The model learns the jargon, correct facts, and appropriate style for that field. It becomes less likely to produce wild guesses because it has deeper, more accurate knowledge.</p><p>This is like sending someone to specialized school. A lawyer trained in contract law won&#8217;t confidently make up facts about surgery because they know their domain and its boundaries. Similarly, a fine-tuned model understands what it should know and where its expertise ends.</p><p>The process requires quality training data and computational resources, but the payoff is AI aligned with reality in your use case. Many specialized models exist for different domains. Even if you can&#8217;t fine-tune models yourself, using these pre-trained specialists reduces hallucinations.</p><h2>10. Use Human Feedback</h2><p>The most sophisticated approach involves Reinforcement Learning from Human Feedback (RLHF). Human reviewers rate or rank model outputs, and the model is then trained to favor the responses people judged better. It learns from this feedback like an apprentice learning from a mentor. You can implement simpler versions by letting users report incorrect answers. This long-term approach improves the system over time, while the other techniques catch immediate errors.</p><p>Preventing hallucinations isn&#8217;t about one magic technique. It&#8217;s about layering multiple strategies that work together. Each layer adds protection. Some hallucinations slip past prompting but get caught by verification. Others get prevented entirely by retrieval augmentation.</p><p>The stakes are real. AI systems increasingly handle tasks where accuracy matters deeply. Medical advice, legal guidance, customer support, and educational content all require truthfulness. By understanding how hallucinations happen and how to prevent them, we can build AI that people can actually trust. 
The technology keeps improving, but the fundamental principles remain: be clear about expectations, provide real information when possible, verify outputs, and keep learning from mistakes.</p>]]></content:encoded></item><item><title><![CDATA[This Simple Trick Makes AI agents Far More Reliable (By Making It Argue With Itself)]]></title><description><![CDATA[6-minute read]]></description><link>https://newsletter.diamant-ai.com/p/this-simple-trick-makes-ai-agents</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/this-simple-trick-makes-ai-agents</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Mon, 29 Sep 2025 16:00:46 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!5IdF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdba32792-ab0c-46bf-bf0d-37b7b43acbd0_1024x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>6-minute read</em></p><p>AI has gotten remarkably good at reasoning through problems step-by-step, searching the web for current information, and doing internal deliberation before responding. But researchers discovered something intriguing: even with all these improvements, AI systems can get dramatically better at finding correct answers by debating with copies of themselves.</p><p>Think about how you approach a really important decision. You might research the topic and think through the pros and cons. But for crucial choices, you probably also talk it through with trusted friends or colleagues. 
Each person brings different perspectives, catches things you missed, and helps you refine your thinking.</p><p>That&#8217;s exactly what multiagent debate does for AI systems.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5IdF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdba32792-ab0c-46bf-bf0d-37b7b43acbd0_1024x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5IdF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdba32792-ab0c-46bf-bf0d-37b7b43acbd0_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!5IdF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdba32792-ab0c-46bf-bf0d-37b7b43acbd0_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!5IdF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdba32792-ab0c-46bf-bf0d-37b7b43acbd0_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!5IdF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdba32792-ab0c-46bf-bf0d-37b7b43acbd0_1024x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!5IdF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdba32792-ab0c-46bf-bf0d-37b7b43acbd0_1024x1024.png" width="1024" height="1024" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dba32792-ab0c-46bf-bf0d-37b7b43acbd0_1024x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1024,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1914956,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/174847951?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdba32792-ab0c-46bf-bf0d-37b7b43acbd0_1024x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5IdF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdba32792-ab0c-46bf-bf0d-37b7b43acbd0_1024x1024.png 424w, https://substackcdn.com/image/fetch/$s_!5IdF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdba32792-ab0c-46bf-bf0d-37b7b43acbd0_1024x1024.png 848w, https://substackcdn.com/image/fetch/$s_!5IdF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdba32792-ab0c-46bf-bf0d-37b7b43acbd0_1024x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!5IdF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdba32792-ab0c-46bf-bf0d-37b7b43acbd0_1024x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.diamant-ai.com/subscribe?"><span>Subscribe now</span></a></p><h2>Why Single Perspectives Have Limitations</h2><p>Today&#8217;s AI systems use chain-of-thought prompting to show their work step-by-step, advanced reasoning models that pause to think internally, and web search to ground responses in real information. These techniques work well, but they share one limitation: they&#8217;re fundamentally single-perspective approaches.</p><p>Consider a complex math problem where the AI needs to choose between several solution approaches. 
Chain-of-thought prompting helps the AI work through its chosen method carefully, but it might still pick the wrong approach from the start. Web search won&#8217;t help because the problem isn&#8217;t about missing facts.</p><p>This is where multiagent debate adds value. Multiple AI copies might initially choose different solution approaches. As they examine each other&#8217;s work, they can identify not just calculation errors but fundamental flaws in reasoning strategy.</p><h2>How Multiagent Debate Works</h2><p>The multiagent debate process starts after other reasoning techniques have already been applied. Each AI agent might use chain-of-thought reasoning or access search results. Then they compare their conclusions and reasoning processes.</p><p>The agents don&#8217;t just look at each other&#8217;s final answers. They examine each other&#8217;s complete reasoning chains, identify specific errors or gaps, and use those insights to improve their own work. If one agent makes a calculation error, another can point it out specifically. If one misinterprets information, another can offer a different reading.</p><p>AI systems readily incorporate improvements when presented with better evidence or reasoning, which makes this collaborative process particularly effective.</p><h2>How Disagreement Reveals Uncertainty</h2><p>When multiple AI copies produce different answers to the same question, that disagreement often signals genuine ambiguity or complexity in the problem. Traditional single-agent AI might confidently state one answer, even when the underlying question is genuinely uncertain.</p><p>For factual questions where agents initially disagree, the debate often eliminates the most questionable claims while preserving well-supported information. 
Facts that appear consistently across multiple reasoning chains are more likely to be accurate than isolated claims.</p><h2>The Three-Phase Enhancement Process</h2><p>Multiagent debate follows a structured pattern that maximizes learning while maintaining efficiency. The process works as an overlay on existing AI capabilities rather than replacing them.</p><p>In the independent reasoning phase, each agent tackles the problem using whatever methods work best: chain-of-thought, web search, specialized tools, or advanced reasoning techniques. This encourages diverse initial approaches and helps prevent premature convergence.</p><p>During the cross-examination phase, agents review each other&#8217;s complete reasoning processes, not just conclusions. They look for logical gaps, factual errors, better solution approaches, and missed considerations. This isn&#8217;t passive review but active analysis and criticism.</p><p>The revision phase allows agents to update their work based on insights gained from examining other responses. They might correct errors, adopt better reasoning strategies, or synthesize the strongest elements from multiple approaches.</p><h2>Performance Improvements Across Domains</h2><p>Testing shows that multiagent debate consistently improves performance across different domains, even when baseline AI systems already use advanced reasoning techniques. Mathematical problems, factual questions, and strategic reasoning tasks all showed meaningful accuracy gains when debate was added.</p><p>Debate also reduced hallucinations and confidently incorrect statements. 
The collaborative process helped identify and eliminate questionable claims that individual agents might have stated with false confidence, leading to more reliable final answers.</p><p>Perhaps most impressively, researchers found cases where all agents initially provided incorrect answers but converged on the correct solution through debate. The collective reasoning process can overcome individual errors in ways that other enhancement techniques cannot.</p><h2>Best Use Cases for Debate</h2><p>Multiagent debate makes the most sense for high-stakes decisions where accuracy is crucial and computational cost is secondary. Medical diagnosis systems could use debate to catch overlooked symptoms or alternative diagnoses. Financial analysis benefits from multiple perspectives on market data and risk assessment. Legal research could employ debate to ensure comprehensive case analysis.</p><p>The technique also works well for complex reasoning tasks where even advanced AI might miss subtle logical flaws. Scientific hypothesis evaluation, strategic planning, and policy analysis all involve multi-faceted reasoning where debate adds value.</p><h2>Computational Costs vs Benefits</h2><p>Multiagent debate requires running multiple AI instances through several rounds of interaction. A single question effectively becomes multiple questions, so cost scales roughly with the number of agents multiplied by the number of debate rounds.</p><p>Organizations can implement debate selectively, using it for their most important queries while maintaining faster single-agent responses for everyday tasks. The technique becomes more cost-effective as AI computation gets cheaper and more accessible.</p><h2>What This Means for AI Development</h2><p>Multiagent debate addresses a limitation that individual enhancement methods can&#8217;t solve alone: the need for genuinely independent perspectives on complex problems. 
Even the most advanced reasoning model is still fundamentally one mind working through a problem.</p><p>This suggests that future AI reliability improvements might come from orchestrating multiple AI minds to work together effectively. As these systems become more powerful, techniques for collaborative reasoning could be as important as advancing individual capabilities.</p>]]></content:encoded></item><item><title><![CDATA[How to Choose Your AI Agent Framework ]]></title><description><![CDATA[A Builder's Guide]]></description><link>https://newsletter.diamant-ai.com/p/how-to-choose-your-ai-agent-framework</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/how-to-choose-your-ai-agent-framework</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Sun, 07 Sep 2025 14:09:24 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!F0Uk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e8ada3-9c1a-45b1-b7bf-09a438d5bfe8_1444x961.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>About 10 min read</em></p><p>Imagine walking into a hardware store to buy tools for your first big project. The shelves stretch endlessly, each tool promising to make your job easier. Some look simple and familiar. Others seem complex but powerful. How do you choose?</p><p>This is exactly what building AI agents feels like today. The landscape of frameworks has exploded, each claiming to be the perfect solution. But here's the thing: they're all right, just for different jobs. Picking the wrong framework is like trying to build a house with only a screwdriver. You might eventually succeed, but you'll waste time and energy along the way.</p><p>AI agents are more than chatbots. They're digital workers that can plan, use tools, remember things, and solve problems on their own. 
Think of them as skilled assistants who can research topics, schedule meetings, write reports, or analyze data without constant supervision. The framework you choose determines how easily you can build and manage these digital workers.</p><p>Let's explore the major players in this space and understand what makes each one special. By the end, you'll know exactly which tool fits your project.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nnPs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbe4c1f-233a-460f-9156-44bac2cd0393_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nnPs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbe4c1f-233a-460f-9156-44bac2cd0393_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!nnPs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbe4c1f-233a-460f-9156-44bac2cd0393_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!nnPs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbe4c1f-233a-460f-9156-44bac2cd0393_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!nnPs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbe4c1f-233a-460f-9156-44bac2cd0393_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nnPs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbe4c1f-233a-460f-9156-44bac2cd0393_1536x1024.png" width="1456" 
height="971" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2dbe4c1f-233a-460f-9156-44bac2cd0393_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3078963,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/173012394?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbe4c1f-233a-460f-9156-44bac2cd0393_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nnPs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbe4c1f-233a-460f-9156-44bac2cd0393_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!nnPs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbe4c1f-233a-460f-9156-44bac2cd0393_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!nnPs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbe4c1f-233a-460f-9156-44bac2cd0393_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!nnPs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2dbe4c1f-233a-460f-9156-44bac2cd0393_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture>
</div></a></figure></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://newsletter.diamant-ai.com/subscribe?"><span>Subscribe now</span></a></p><h2>LangGraph: Graph-Based Control</h2><p>LangGraph represents a fundamental shift in how we think about agent workflows. Instead of linear chains where one action follows another, LangGraph models agent behavior as a graph. Information flows between nodes, loops back when needed, and branches based on conditions. 
It's like the difference between a shopping list and a flowchart that adapts to what's actually in the store.</p><p>What makes LangGraph powerful is its approach to state management. Inspired by Google's Pregel system for processing massive graphs, LangGraph treats your agent's workflow as a series of "super-steps" in which nodes can execute in parallel and pass messages to each other. Every step can read and modify a shared state, with the framework tracking all changes automatically.</p><p>The checkpointing system is particularly sophisticated. With a checkpointer configured, LangGraph saves your agent's state at every step, allowing you to pause, resume, or even rewind workflows. If something goes wrong, you can examine exactly what the state was at any point. For production systems where reliability matters, this is invaluable. In effect, you can time-travel through your agent's execution history.</p><p>LangGraph works equally well in Python and TypeScript, with complete feature parity between languages. But here's the key: LangGraph deliberately stays low-level. It provides powerful primitives rather than pre-built solutions. You get fine-grained control over how state updates combine, how errors propagate, and how different parts of your system coordinate. This means more power but also more responsibility.</p><p>The framework excels at complex coordination patterns. Building a customer service system that escalates to humans based on sentiment? LangGraph's conditional edges make it natural. Creating an AI researcher that refines its approach based on findings? The cyclic graph structure handles it elegantly. Need to coordinate multiple agents working on subtasks? The message-passing architecture was built for this.</p><p>Choose LangGraph when you need production-grade reliability and precise control over agent behavior. 
It's ideal for teams building sophisticated systems where predictability and debuggability matter more than quick prototyping.</p><h2>Google AI SDK: Enterprise Multimodal Power</h2><p>Google brings something different to the AI agent landscape: an entire ecosystem designed for enterprise-scale multimodal agents. The Google AI SDK isn't just another way to access language models. It's a comprehensive platform that treats images, audio, video, and text as first-class citizens.</p><p>The multimodal capabilities are native, not bolted on. While other frameworks process different media types separately and stitch results together, Google's Gemini models understand everything simultaneously. Your agent can watch a video while reading related documents and listening to audio commentary, forming one unified understanding. This isn't multiple models cooperating; it's one model that truly sees, reads, and hears.</p><p>Google's Agent Development Kit takes a structural approach to multi-agent systems. Instead of having agents communicate through natural language (which can be unreliable), it provides purpose-built coordinators. Sequential agents hand off work in order. Parallel agents divide tasks automatically. Loop agents iterate until conditions are met. The coordination happens through architecture, not prompting.</p><p>The enterprise integration runs deeper than any competitor's. If your company uses Google Workspace, agents naturally access documents, spreadsheets, and calendars. Running on Google Cloud? They query BigQuery, read from Cloud Storage, and trigger Cloud Functions without complex authentication. The framework includes over 100 pre-built enterprise connectors, plus integration with Apigee's ecosystem of 800,000+ managed APIs.</p><p>What's particularly innovative is the reasoning transparency. Google's latest models can expose their thinking process in a structured way, showing not just what they decided but why. 
This "thought signature" persists across conversations, helping you understand and debug agent behavior. For regulated industries requiring explainable AI, this transparency is essential.</p><p>The framework also addresses enterprise data concerns uniquely. Agents can work with data residing in AlloyDB, BigQuery, or NetApp without copying it elsewhere. This "data residency" approach means sensitive information never leaves approved systems, addressing compliance requirements that stop many AI projects before they start.</p><p>Choose Google AI SDK when you need true multimodal processing, enterprise-grade integration, or when you're already invested in Google Cloud. It's particularly strong for companies requiring their AI agents to work within existing infrastructure and compliance boundaries.</p><h2>Multi-Agent Orchestration: CrewAI and AG2</h2><p>Some problems are too big for one AI to handle alone. Enter CrewAI and AG2, frameworks designed for AI teamwork. Instead of one super-capable agent, they let you create specialized agents that collaborate.</p><p>CrewAI treats AI agents like a film crew. You have a director, a cinematographer, a script writer, each with specific expertise. When you ask for a movie review, the researcher agent finds information, the critic agent analyzes it, and the writer agent crafts the final review. They pass information back and forth, building on each other's work.</p><p>AG2 takes a similar but slightly different approach. It's more like a group chat where experts discuss a problem. One agent proposes a solution, another critiques it, a third suggests improvements. They keep talking until they reach a good answer. This framework emerged from Microsoft's research and focuses on flexible, message-based communication between agents.</p><p>These multi-agent frameworks excel at complex tasks that benefit from different perspectives. Writing a business plan? 
Have one agent handle market research, another financial projections, and a third competitive analysis. The agents work in parallel, making the process faster and more thorough than a single agent switching between tasks.</p><p>The challenge is coordination. Like managing a real team, you need clear roles and communication rules. Too many agents can create chaos. Too few might miss important angles. You become less of a programmer and more of a manager, designing workflows and interactions.</p><p>Pick CrewAI when you want a higher-level approach with built-in coordination patterns. Choose AG2 if you prefer more control over how agents communicate and want to experiment with novel interaction patterns.</p><h2>Pydantic AI: Deep Python Integration</h2><p>Pydantic AI brings something unique to AI development: the same type safety and validation that makes modern Python reliable. Yes, many frameworks now offer structured outputs through OpenAI or other providers. But Pydantic AI goes much deeper than just getting JSON in the right shape.</p><p>The framework's core innovation is its dependency injection system. Your AI agents often need to access databases, call APIs, or use configuration settings. Pydantic AI lets you inject these dependencies with full type safety, similar to how FastAPI handles web requests. The AI agent gets exactly what it needs, when it needs it, with static type checking that catches many errors before runtime.</p><p>What sets it apart is automatic self-correction. When the AI produces invalid output, Pydantic AI doesn't just fail. 
It automatically explains to the AI what went wrong and asks it to try again. The framework can pass validation errors back to the model with specific instructions on how to fix them. In effect, your agent corrects its own mistakes within the same run.</p><p>The streaming validation is particularly sophisticated. While the AI is still generating its response, Pydantic AI validates partial outputs. It can detect that a response will be invalid before it's complete, saving processing time and allowing for early correction. This creates tighter feedback loops and more efficient processing.</p><p>For Python developers, the framework feels natural. If you've used FastAPI, SQLAlchemy, or Django, the patterns are immediately familiar. Type hints aren't just documentation; they actively shape how your agent behaves. Your IDE understands everything, autocomplete works perfectly, and refactoring is safe. The validation happens at multiple levels: during development, at runtime, and even during the AI's generation process.</p><p>The error handling goes beyond simple validation. Pydantic AI provides a hierarchy of exception types, preserves error context throughout the execution, and can automatically convert technical validation errors into natural language feedback the AI can understand and act upon.</p><p>Choose Pydantic AI when you need deep Python ecosystem integration, sophisticated error handling, and want AI development to feel like regular software engineering. It's perfect for teams that value type safety and need their AI agents to integrate seamlessly with existing Python codebases.</p><h2>Performance First: Agno</h2><p>Agno (formerly Phidata) is the sports car of AI frameworks: lean, fast, and focused on performance. While others added features, Agno obsessed over speed and efficiency.</p><p>The difference shows in practice. The same agent that takes seconds and gigabytes of memory in other frameworks runs faster and lighter in Agno. 
This matters when you're running multiple agents, working with limited resources, or need quick responses.</p><p>Despite being lightweight, Agno doesn't skimp on features. It handles tools, memory, and knowledge bases elegantly. Adding a custom tool takes minimal setup. Connecting to documents for context is straightforward. It even supports images and audio, not just text.</p><p>The philosophy is different too. Where other frameworks add abstraction layers, Agno stays close to regular Python code. You write functions, Agno makes them available to AI. You define workflows, Agno runs them efficiently. There's less magic, more mechanics.</p><p>Agno appeals to developers who found other frameworks too heavy or slow. It's ideal for production systems where performance matters, or when you want AI capabilities without learning a whole new system. If you're adding AI to an existing application and need it to be fast and unobtrusive, Agno delivers.</p><h2>TypeScript Native: Mastra</h2><p>Mastra brings AI agents to JavaScript and TypeScript developers, but that's not what makes it special. LangGraph also supports TypeScript. The difference is philosophy: where LangGraph gives you powerful primitives to build with, Mastra gives you a complete, opinionated toolkit where the hard decisions are already made.</p><p>Built by the team behind Gatsby.js, Mastra understands what developers actually need to ship products. The framework includes a visual playground for watching agents think in real-time, debugging workflows step by step, and testing without deploying. It's like having specialized developer tools built specifically for AI agents.</p><p>The integration system sets Mastra apart. Point it at any API, and it generates a type-safe package complete with TypeScript types, parameter validation, and automatic tool conversion. Connect to Stripe? Your agent can now process payments. Add GitHub? It can manage repositories. 
These aren't just HTTP calls; they're fully typed, validated integrations that feel native to your application.</p><p>What's unique is the batteries-included approach. Observability comes built-in with OpenTelemetry, not as an afterthought. Deployment to Vercel, Cloudflare, or Netlify takes one command. Every agent automatically gets OpenAPI documentation and a Swagger interface. Where other frameworks make you choose and integrate separate tools, Mastra made the choices for you based on production experience.</p><p>The workflow system feels natural to JavaScript developers. Chain steps with familiar patterns, branch based on conditions, run parallel tasks. It uses battle-tested state machine concepts under the hood, but you don't need to know that. The abstractions match how JavaScript developers already think about async operations.</p><p>Mastra includes development niceties that others miss. It integrates with modern IDEs through MCP (Model Context Protocol), providing real-time documentation and suggestions. The framework auto-generates TypeScript types for everything, making refactoring safe and autocomplete helpful.</p><p>Choose Mastra when you want to ship AI features fast without building infrastructure. It's perfect for startups needing AI capabilities yesterday, or enterprise teams valuing developer productivity over infinite customization. The framework is opinionated in the best way: the opinions come from building and running production systems.</p><h2>No-Code Workflows: n8n</h2><p>n8n breaks the pattern entirely. Instead of writing code, you draw workflows. 
It's like the difference between building furniture from scratch and snapping together modular pieces.</p><p>n8n started as a workflow automation tool, connecting hundreds of services through a visual interface. When AI agents arrived, n8n added them as another component you can drop into workflows. Now you can build complex automations that include AI decision-making without writing code.</p><p>The visual approach democratizes AI agents. A marketing manager can build a workflow that monitors social media, uses AI to identify important mentions, and sends alerts. An operations analyst can create a system that reads reports, extracts insights with AI, and updates dashboards. No programming required.</p><p>This accessibility comes with trade-offs. Complex agent behaviors or finely tuned prompts are harder to implement visually than in code. But for many use cases, especially those involving integration between services, n8n's approach can be faster and easier to maintain than traditional development.</p><p>n8n excels when AI is part of a larger automation, not the entire solution. If you need to connect AI to your existing tools and workflows, especially if you're not a programmer, n8n provides the fastest path to results.</p><h2>Choosing Your Framework</h2><p>Each framework evolved to solve different problems. There's no universal best choice, only the right choice for your situation.</p><p>Consider your constraints first. What programming language does your team know? How important is performance versus features? Do you need production-grade deployment or just a prototype? Will one AI agent suffice, or do you need a team of agents?</p><p>Think about your users too. Who will maintain this system? How critical is reliability? What happens if the AI produces unexpected output? The answers guide you toward frameworks that match your needs.</p><p>Remember that these aren't permanent decisions. 
Many teams prototype with one framework and rebuild with another once they understand their requirements. The ecosystem is young and evolving rapidly. New frameworks appear regularly, and existing ones add features constantly.</p><p>The explosion of AI agent frameworks mirrors the early days of web frameworks or mobile development. Eventually, patterns will emerge and consolidate. But right now, we're in the experimental phase where different approaches compete and innovate.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!F0Uk!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e8ada3-9c1a-45b1-b7bf-09a438d5bfe8_1444x961.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!F0Uk!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e8ada3-9c1a-45b1-b7bf-09a438d5bfe8_1444x961.png 424w, https://substackcdn.com/image/fetch/$s_!F0Uk!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e8ada3-9c1a-45b1-b7bf-09a438d5bfe8_1444x961.png 848w, https://substackcdn.com/image/fetch/$s_!F0Uk!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e8ada3-9c1a-45b1-b7bf-09a438d5bfe8_1444x961.png 1272w, https://substackcdn.com/image/fetch/$s_!F0Uk!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e8ada3-9c1a-45b1-b7bf-09a438d5bfe8_1444x961.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!F0Uk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e8ada3-9c1a-45b1-b7bf-09a438d5bfe8_1444x961.png" width="1444" height="961" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/36e8ada3-9c1a-45b1-b7bf-09a438d5bfe8_1444x961.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:961,&quot;width&quot;:1444,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:126648,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/173012394?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e8ada3-9c1a-45b1-b7bf-09a438d5bfe8_1444x961.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!F0Uk!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e8ada3-9c1a-45b1-b7bf-09a438d5bfe8_1444x961.png 424w, https://substackcdn.com/image/fetch/$s_!F0Uk!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e8ada3-9c1a-45b1-b7bf-09a438d5bfe8_1444x961.png 848w, https://substackcdn.com/image/fetch/$s_!F0Uk!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e8ada3-9c1a-45b1-b7bf-09a438d5bfe8_1444x961.png 1272w, https://substackcdn.com/image/fetch/$s_!F0Uk!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e8ada3-9c1a-45b1-b7bf-09a438d5bfe8_1444x961.png 1456w" 
sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://newsletter.diamant-ai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading <strong>&#128142;DiamantAI!</strong> I share cutting-edge AI insights, tutorials, and breakthroughs. 
Subscribe for free to get new posts delivered straight to your inbox, and as a bonus, you&#8217;ll receive a <strong>33% discount coupon</strong> for both of my digital books, <em><a href="https://nirdiamant.gumroad.com/l/rag-made-simple">RAG Made Simple</a></em> and <em><a href="https://nirdiamant.gumroad.com/l/mtxrfk">Prompt Engineering: From Zero to Hero</a></em>. Enjoy!</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><p></p>]]></content:encoded></item><item><title><![CDATA[The AI Arms Race Is Over. Smart Engineering Won]]></title><description><![CDATA[How Smart Engineering Is Replacing Brute Force Scaling]]></description><link>https://newsletter.diamant-ai.com/p/ais-scaling-era-just-ended-whats</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/ais-scaling-era-just-ended-whats</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Mon, 11 Aug 2025 13:20:58 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Y4wT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeb6fce8-d0f5-45ba-b2e5-283140b9e9c7_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>4-5 minute read</em></p><p>The release of GPT-5 got me thinking about where AI is heading. While it's an improvement, the jump isn't as dramatic as previous generations. 
This pattern is appearing across the industry, signaling that simply building bigger models is no longer delivering the breakthroughs we're used to.</p><p>I'm writing this because we're entering the most exciting phase of AI development yet - one that will require completely new approaches beyond just scaling up.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y4wT!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeb6fce8-d0f5-45ba-b2e5-283140b9e9c7_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y4wT!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeb6fce8-d0f5-45ba-b2e5-283140b9e9c7_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Y4wT!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeb6fce8-d0f5-45ba-b2e5-283140b9e9c7_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Y4wT!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeb6fce8-d0f5-45ba-b2e5-283140b9e9c7_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Y4wT!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeb6fce8-d0f5-45ba-b2e5-283140b9e9c7_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y4wT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeb6fce8-d0f5-45ba-b2e5-283140b9e9c7_1536x1024.png" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/eeb6fce8-d0f5-45ba-b2e5-283140b9e9c7_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3175631,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/170681089?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeb6fce8-d0f5-45ba-b2e5-283140b9e9c7_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y4wT!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeb6fce8-d0f5-45ba-b2e5-283140b9e9c7_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!Y4wT!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeb6fce8-d0f5-45ba-b2e5-283140b9e9c7_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!Y4wT!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeb6fce8-d0f5-45ba-b2e5-283140b9e9c7_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!Y4wT!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Feeb6fce8-d0f5-45ba-b2e5-283140b9e9c7_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<h2>The Scaling Method Is Failing</h2><p>For ten years, the recipe for AI breakthroughs was simple: make models bigger and train them longer. GPT-3 amazed us by writing human-like essays. GPT-4 solved test questions and understood pictures. Each jump felt massive.</p><p>But that's changing. GPT-4 was much better than GPT-3, but newer models show much smaller improvements. Other AI companies report the same pattern. 
Adding more parameters and data no longer creates the dramatic leaps we're used to.</p><p>This doesn't mean AI progress has stopped - it means we've hit the limits of our current approach. Even the biggest advocates of scaling now admit we need completely new ideas to reach the next level.</p><h2>Smart Engineering - Maximizing Current AI</h2><p>If we can't just make models bigger forever, how can we make current AI work better? The good news is that we can make today's AI much more useful with clever engineering.</p><p>For example, instead of trying to make one model remember everything, we can connect it to databases or the internet (retrieval-augmented generation, or RAG). This way, it can look up current information when it needs it. We can also teach AI to break down hard problems into smaller steps, just like humans do (often called chain-of-thought prompting). This often gives better answers than trying to solve everything at once.</p><p>Engineers are also making AI handle different types of information at the same time, like text and pictures together (multimodal models), and increasing how much information a model can work with at once (longer context windows). These aren't completely new technologies - they're smarter ways to use what we already have.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://github.com/NirDiamant/agents-towards-production&quot;,&quot;text&quot;:&quot;Learn to build smarter AI agents&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://github.com/NirDiamant/agents-towards-production"><span>Learn to build smarter AI agents</span></a></p><h2>Data and Computing Limits</h2><p>The scaling approach is hitting two concrete walls. First, we've used most of the high-quality text on the internet. What remains is low-quality or repetitive. Training AI on AI-generated content can create feedback loops that make models worse.</p><p>Second, the computing costs are exploding. Making a model slightly better now requires exponentially more processing power and electricity. 
This quickly becomes too expensive and environmentally unsustainable.</p><h2>New AI Architectures Needed</h2><p>The type of AI design we use now (called "Transformers") has worked very well. But it also has some basic problems. Models like GPT work by guessing what word comes next in a sentence. This makes them very good at copying patterns from their training data, but it doesn't mean they truly understand what the words mean.</p><p>No matter how big we make these models, they might still fail at tasks that need real reasoning or understanding. This is why many researchers think that just making the same type of AI bigger won't give us human-like intelligence.</p><p>To break through this barrier, we probably need completely new ways to build AI. Some ideas include:</p><ul><li><p>AI that learns by interacting with the real world (not just reading text)</p></li><li><p>AI systems with special parts for memory and reasoning</p></li><li><p>AI that can truly understand cause and effect</p></li></ul><p>These new ideas are still being tested, but they might be the key to the next big jump in AI ability.</p><h2>Building AI That Self-Corrects</h2><p>Another important area is making AI reason better and double-check its own answers. Today's AI can solve complex problems, but it often needs us to tell it how to think step by step.</p><p>For example, if we ask an AI to "think step by step," it will show us its reasoning process and usually give a better answer. This shows that AI can reason, but it doesn't always do it unless we specifically ask.</p><p>Researchers have also found that having one AI check another AI's work can catch mistakes and improve results. The goal is to give AI an "inner voice" that can notice when something might be wrong.</p><p>In the future, we want AI that can say "Wait, that answer doesn't look right, let me try again." 
If we can build AI that checks and improves its own thinking, it will be much more reliable and work more like human problem-solving.</p><h2>AGI - Hype vs Reality</h2><p>Many people think that just making current AI bigger will eventually create artificial general intelligence (AGI) - AI that can do anything a human can do. But this probably isn't true.</p><p>Real general intelligence likely needs abilities that current AI doesn't have, such as:</p><ul><li><p>Learning completely new tasks by itself</p></li><li><p>Setting its own goals</p></li><li><p>Understanding the physical world like humans do</p></li></ul><p>Current AI models don't really do these things. So while each new model might be somewhat better, it won't suddenly become a thinking machine with human-like common sense.</p><p>Getting to AGI will probably require major scientific breakthroughs and careful work to make sure it's safe. It's not something that will happen very soon just by making models bigger.</p><h2>The New Era of AI Innovation</h2><p>The scaling slowdown isn't a problem - it's an opportunity. When one approach reaches its limits, researchers diversify and innovate. We're now seeing investment in multiple promising directions: better architectures, self-correcting systems, reasoning capabilities, and novel training methods.</p><p>Future AI progress will be more varied and sophisticated than simply making bigger models. The path to human-like AI is still being built, and we're moving forward on multiple fronts simultaneously.</p>
]]></content:encoded></item><item><title><![CDATA[Why Reasoning Models Are Broken in Production (And How to Fix Them)]]></title><description><![CDATA[How to Cut AI Costs 60% While Boosting Quality in Production]]></description><link>https://newsletter.diamant-ai.com/p/why-reasoning-models-are-broken-in</link><guid isPermaLink="false">https://newsletter.diamant-ai.com/p/why-reasoning-models-are-broken-in</guid><dc:creator><![CDATA[Nir Diamant]]></dc:creator><pubDate>Sun, 03 Aug 2025 13:31:33 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!pF5S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa85e0392-51e1-4482-b992-ffb9adcb340a_1536x1024.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>Estimated reading time: 8 minutes</em></p><p><strong>Before we dive in:</strong> Quick update on the <a href="https://github.com/NirDiamant/agents-towards-production">Agents Towards Production repo</a> - we've hit 9K stars and 30+ tutorials in just one month, covering everything developers need to build production-ready agents. The community response has been incredible. 
Now I'm expanding partnerships with companies building AI infrastructure - vector databases, embeddings APIs, real-time search, orchestration layers, observability platforms, GPU hosting, security tools and many more. We're creating real, hands-on tutorials together (not marketing fluff) that show how to integrate their tools as modular components developers can pick and choose from when building agents. If you know teams who'd value authentic developer adoption through quality educational content, <a href="https://www.diamant-ai.com/">connect me</a>. The goal remains the same: give developers a complete toolbox for production agents.</p><div><hr></div><p>Picture this: you're running a hospital emergency room. A patient walks in with chest pain. Do you immediately call in the heart surgeon, or do you first have a nurse do a quick assessment? The nurse can handle most cases perfectly well and costs a fraction of what the surgeon charges. But when someone truly needs that specialized expertise, you want the best available.</p><p>This is exactly the challenge facing anyone deploying reasoning models in production today. These new AI systems can think through complex problems step by step, often taking considerably longer but delivering dramatically better results on hard questions. 
Traditional models, by contrast, respond in seconds but struggle with multi-step logic.</p><p>The production challenge lies in knowing when an expensive reasoning cycle is worth the cost and delay.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pF5S!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa85e0392-51e1-4482-b992-ffb9adcb340a_1536x1024.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pF5S!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa85e0392-51e1-4482-b992-ffb9adcb340a_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!pF5S!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa85e0392-51e1-4482-b992-ffb9adcb340a_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!pF5S!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa85e0392-51e1-4482-b992-ffb9adcb340a_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!pF5S!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa85e0392-51e1-4482-b992-ffb9adcb340a_1536x1024.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pF5S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa85e0392-51e1-4482-b992-ffb9adcb340a_1536x1024.png" width="1456" height="971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a85e0392-51e1-4482-b992-ffb9adcb340a_1536x1024.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:971,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3371474,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://diamantai.substack.com/i/169998420?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa85e0392-51e1-4482-b992-ffb9adcb340a_1536x1024.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pF5S!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa85e0392-51e1-4482-b992-ffb9adcb340a_1536x1024.png 424w, https://substackcdn.com/image/fetch/$s_!pF5S!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa85e0392-51e1-4482-b992-ffb9adcb340a_1536x1024.png 848w, https://substackcdn.com/image/fetch/$s_!pF5S!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa85e0392-51e1-4482-b992-ffb9adcb340a_1536x1024.png 1272w, https://substackcdn.com/image/fetch/$s_!pF5S!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa85e0392-51e1-4482-b992-ffb9adcb340a_1536x1024.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div>
<h2><strong>Traditional vs Reasoning Models</strong></h2><p>Traditional AI models read your question and immediately start generating their answer. Sometimes they get it right through sheer knowledge. Other times, especially on multi-step problems, they stumble.</p><p>Reasoning models are different. They can pause, think through the problem, try different approaches, and build up to their final answer. 
The result is often dramatically more accurate on complex questions that require multiple steps of logic.</p><p>Consider asking both types: "Write a sentence about a dragon slayer, then create an acronym from the first letter of each word." A traditional model might write the sentence and then forget about the second part entirely. A reasoning model is more likely to carry out each step in turn, making sure to actually extract those first letters and combine them.</p><p>But that thinking time costs money and tests users' patience. The reasoning model might take significantly longer and cost up to five times more per query. For a simple query like "What's the capital of France?" you're paying premium prices for unnecessary deliberation.</p><h2><strong>Production Routing Strategy</strong></h2><p>The production solution isn't to pick one model or the other. It's to build an intelligent dispatcher that routes each request to the appropriate model based on complexity.</p><p>Your routing layer looks at incoming questions and makes split-second decisions about complexity. Simple factual questions go to the baseline model. Multi-step reasoning problems, complex analysis requests, or anything involving detailed problem-solving gets routed to the reasoning specialist.</p><p>The production benefit is clear: most users get lightning-fast responses from the efficient model. Only truly complex questions incur the delay and cost of deep reasoning. The result is dramatically better quality on hard questions while keeping costs reasonable and most responses fast.</p><h2><strong>Complexity Detection in Production</strong></h2><p>How does an AI system recognize a hard question when it sees one? Several clues can tip it off.</p><p><strong>Length and Structure</strong>: A question with multiple parts ("Do this calculation, then explain why, then suggest three alternatives") is almost certainly complex. 
Questions with words like "analyze," "compare," "step-by-step," or "prove" signal multi-step thinking ahead.</p><p><strong>Domain Signals</strong>: Math word problems, requests to write and debug something, questions asking for detailed analysis, or anything requiring synthesis from multiple sources typically need the reasoning specialist.</p><p><strong>Uncertainty Indicators</strong>: Sometimes the system tries the quick model first. If that model seems uncertain in its response or expresses low confidence, the system can automatically escalate to the reasoning specialist for a second opinion.</p><p><strong>Retrieval Complexity</strong>: When your system searches for information to answer a question, the results themselves provide clues. If the search returns conflicting information or no clear answer emerges, the question likely needs more sophisticated handling.</p><p>Some systems even ask the quick model to rate its own confidence. If it says something like "I'm not entirely sure about this," that triggers an automatic escalation to the more powerful model.</p><h2><strong>Reasoning Model Cost Economics</strong></h2><p>The cost difference between standard and reasoning models is substantial in production. Reasoning models often cost 3-5 times more per query than standard models. They also consume more "thinking tokens" as they work through problems internally. A query that might cost a few cents with a standard model could cost 15-20 cents with a reasoning model.</p><p>But production routing makes this economical. If you can accurately identify which 20% of queries truly need the premium treatment, you can serve them with the reasoning model while handling the other 80% efficiently. 
Companies report achieving near reasoning-model quality at roughly half the cost of using reasoning models exclusively.</p><p>The production key is getting routing accuracy high enough that few complex queries slip through to get poor answers from the baseline model.</p><h2><strong>Production Architecture Design</strong></h2><p>The production architecture consists of three main components: the router, the baseline model service, and the reasoning model service.</p><p>The baseline model handles most production traffic efficiently. It's powered by a capable but economical model that can handle straightforward queries, basic explanations, simple calculations, and routine requests. Response times stay quick, and costs remain minimal.</p><p>The reasoning model handles complex cases in production. This is where the reasoning specialist operates, equipped with the ability to break down complex problems, use multiple tools if needed, and think through multi-step solutions. It takes much longer but delivers much higher quality on difficult queries.</p><p>The router component sits between users and these two services in your production system. It analyzes each incoming query using complexity signals and makes a routing decision quickly, forwarding the request to the appropriate model.</p><p>The key production insight is that both models share the same interface from the user's perspective. Someone asking a question doesn't need to know which model answered it. 
They just get an appropriate response in a reasonable time.</p><h2><strong>When Reasoning Models Aren't Worth It</strong></h2><p>But reasoning models aren't the right production choice for every problem. Sometimes, the smartest approach is avoiding them entirely.</p><p><strong>Ultra-Fast Applications</strong>: If you're building autocomplete or real-time suggestions, even modest delays are too slow. Users expect instant responses, so you'd stick with the fastest models available or simpler algorithmic approaches.</p><p><strong>High-Volume, Low-Margin Services</strong>: If you handle millions of queries daily and earn very little per interaction, even modest per-query costs can destroy profitability. Better training or retrieval systems for a single efficient model might be more cost-effective.</p><p><strong>Deterministic Problems</strong>: If someone asks for the 10th number in the Fibonacci sequence, just calculate it directly rather than asking an AI to reason through it. Many problems that look like they need AI reasoning actually have simpler, more reliable solutions.</p><p><strong>High-Stakes Decisions</strong>: Sometimes the stakes are too high for reasoning models. In medical, legal, or safety-critical applications, you might want more predictable, auditable decision-making processes rather than AI reasoning chains that are harder to verify.</p><p>The production rule: use reasoning models to augment human intelligence on genuinely complex problems, not to replace well-understood processes or handle simple tasks inefficiently.</p><h2><strong>Production Safety and Monitoring</strong></h2><p>Deploying reasoning models in production requires careful attention to what could go wrong. These models are powerful but not infallible, and production environments demand reliability.</p><p><strong>Timeout Protection</strong>: Reasoning models can sometimes get stuck in long thought loops, especially on very complex or poorly formed queries. 
Production systems need hard timeouts that prevent users from waiting indefinitely. If a reasoning process takes too long, fall back to a simpler answer or a polite "let me get back to you" message.</p><p><strong>Output Validation</strong>: Because reasoning models generate longer, more complex responses, they have more opportunities to include problematic content. Production systems often include checks to ensure responses meet expected formats and don't contain inappropriate material.</p><p><strong>Tool Usage Limits</strong>: Many reasoning models can use tools like web search, calculators, or databases. Each tool access point needs proper security controls in production. Unlimited access to external systems poses the same risks as giving unrestricted permissions to any automated process.</p><p><strong>Cost Monitoring</strong>: With variable costs per query, production monitoring becomes crucial. Systems need alerts if costs suddenly spike, which might indicate a problem with the routing logic or an influx of unexpectedly complex queries.</p><p><strong>Performance Tracking</strong>: Production deployments require monitoring response times, error rates, and user satisfaction scores. Reasoning models can occasionally produce correct but overly verbose answers, so tracking user engagement helps calibrate the system.</p><p>The production goal is building systems that can harness the power of advanced reasoning while maintaining predictable, safe operation at scale.</p><h2><strong>Production User Experience</strong></h2><p>When implemented well, this two-tier approach creates an AI system that feels both fast and intelligent.</p><p>Most interactions feel snappy because straightforward queries get handled immediately by the efficient model. When someone poses a genuinely complex problem, the system smoothly shifts into deeper thinking mode. The user might see a message like "Let me think through this carefully..." 
followed by a much more thoughtful, accurate response.</p><p>Production transparency helps too. Users understand that complex questions take more time, and the key is ensuring the wait delivers notably better answers.</p><p>Some production systems show progress indicators during reasoning processes, similar to "Analyzing documents..." or "Checking multiple sources..." This helps users understand the process and builds confidence in the system.</p><h2><strong>Production Evolution Trends</strong></h2><p>The reasoning era is still in its early production days. Current routing systems are relatively simple, but they're evolving rapidly toward more sophisticated decision-making in production environments.</p><p>Future production systems might maintain user profiles to learn individual complexity preferences. Someone who frequently asks technical questions might have their threshold adjusted to route more borderline cases to reasoning models. A user who typically wants quick answers might have the opposite bias.</p><p>We're seeing production experiments with multi-tier cascades rather than just two options. Instead of "baseline" and "reasoning," production systems might have "instant," "quick," "thoughtful," and "expert" levels, each optimized for different complexity ranges and cost constraints.</p><p>The reasoning models themselves continue improving for production use. Today's reasoning models might be tomorrow's baseline models in terms of capability, though likely not in terms of speed. As this happens, production routing decisions will need constant recalibration.</p><p>Production teams are also exploring dynamic pricing models where complex queries cost more, helping offset the higher computational costs while maintaining service accessibility for simple requests.</p><h2><strong>Production Implementation Guide</strong></h2><p>If you're considering deploying reasoning models in production, start small and measure everything. 
Begin with a clear test set of queries where you know which answers are correct. Run both model types on these queries to establish baseline performance differences.</p><p>Focus on getting the routing logic right before optimizing for speed or cost. A router that's 90% accurate at identifying complex queries will deliver most of the benefits. Trying to push that to 99% might not be worth the additional complexity.</p><p>Monitor user satisfaction alongside technical metrics. Sometimes a reasoning model produces a technically correct but overly verbose answer when a simple response would have been better. User feedback helps calibrate these trade-offs.</p><p>Budget for iteration. Your first routing thresholds won't be perfect, and user behavior might change over time. Plan to revisit and retune the system regularly based on real usage patterns.</p><p>Most importantly, remember that the production goal isn't to use the most advanced AI possible on every query. It's to deliver the right level of intelligence for each specific need, as efficiently as possible.</p><p>The reasoning era isn't about replacing human thinking with AI thinking. It's about creating production AI systems smart enough to know when they need to think harder. In that sense, it represents not just more powerful AI, but more thoughtful AI deployed at scale. And perhaps that's the most important production advancement of all.<br></p>]]></content:encoded></item></channel></rss>