Full Podcast Episode of the Intelli-Gents x.com/predotdev/stat…
The Artificial Intelli-Gents Ep. 7: We Beat Claude Opus with a Smaller Model We did it: We dropped an entire model tier and still finished ahead. By proving that orchestration beats brute reasoning, our predev harness paired with Sonnet 4.6 (56.2%) officially beat Claude Code
Have you spent the last few days thinking about how to move from writing prompts to designing loops? To truly achieve long-running agents, you need a long-term execution graph. Here is how we do it:
Last month, we had a user run our coding agent for 60 hours straight. To sustain that scale, you have to solve two foundational challenges: memory compaction and predictive planning.
In last week's podcast, Arjun (@ArjunRajJain) and Adam (@adampredev) break down how they built the predev harness capable of handling tens of thousands of messages in a single session.
Recursive Memory Compaction: Context stuffing or pollution occurs naturally if the agent simply keeps going, which warmth-seeks and inevitably causes drift. To maintain coherence over a 60-hour window, the agent shouldn't have to break the loop to summarize its history. It requires a memory layout that manages context dynamically without losing the core intent.
The Predictive Planning Layer: To build a long-horizon agent loop, you must approximate the series of events before execution begins. By establishing a contract of milestones, user stories, and acceptance criteria up front, the harness translates a vague idea into a structural execution graph.
An additional benefit of knowing the full architectural scope ahead of time is efficient infrastructure management. Because the harness understands the roadmap before execution starts, the agent can configure its own isolated cloud sandboxes and route individual subtasks to the exact model tier that offers just the right performance for that specific action.
Oddly enough, the deeper you go into the long-term execution graph, the closer you get to traditional project management. It turns out you manage coding agents the exact same way you manage a human engineering team. You break down complex goals into distinct user stories, estimate them, and design the system so the agents report back on their progress, upcoming subtasks, and impediments. Sound familiar?
The real engineering enabler isn't the underlying model. The magic lies in building a harness that can decompress a stakeholder's high-level request, map out the execution graph, and dispatch specialized agents with a shared, evolving memory.
If you want to see for yourself how to run your coding agents for 60 hours on autopilot, give predev a spin and explore our native agent right in your browser. Link in the comments.
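To make the planning-layer idea above concrete, here is a minimal sketch of an execution graph built from user stories with up-front complexity estimates and acceptance criteria. It is not predev's implementation: the `Story` fields, the tier names (`small`, `medium`, `large`), and the complexity thresholds are all hypothetical, chosen only to illustrate ordering stories by dependency and routing each one to a model tier before execution starts.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical tiers and thresholds, for illustration only.
TIERS = [(3, "small"), (7, "medium")]
DEFAULT_TIER = "large"


@dataclass
class Story:
    name: str
    complexity: int        # up-front estimate, 1 (trivial) to 10 (hard)
    acceptance: list[str]  # acceptance criteria agreed before execution
    depends_on: list[str] = field(default_factory=list)


def pick_tier(complexity: int) -> str:
    """Return the cheapest tier whose threshold covers the estimate."""
    for limit, tier in TIERS:
        if complexity <= limit:
            return tier
    return DEFAULT_TIER


def plan(stories: list[Story]) -> list[tuple[str, str]]:
    """Order stories so dependencies run first and attach a tier to each."""
    by_name = {s.name: s for s in stories}
    graph = {s.name: set(s.depends_on) for s in stories}
    order = TopologicalSorter(graph).static_order()
    return [(name, pick_tier(by_name[name].complexity)) for name in order]


stories = [
    Story("schema", 2, ["migrations apply cleanly"]),
    Story("api", 6, ["all endpoints return 200"], depends_on=["schema"]),
    Story("auth", 9, ["tokens expire after 1h"], depends_on=["api"]),
]
schedule = plan(stories)
for name, tier in schedule:
    print(f"{name}: {tier}")
```

Because the whole graph exists before any story runs, the per-story tier (and, by extension, a rough cost estimate) is known up front rather than discovered mid-run.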
Try our self-driving coding agent: try.pre.dev/CCqFYUp
"It vibes on my machine." We’ve been here before.
The acceleration of AI coding agents has curiously resurfaced some familiar and frustrating challenges in professional software development. If your engineers are walking around the office with their laptops open, babysitting their agents running loops inside a local terminal, you are heading toward a very predictable wall.
By executing agents locally, engineering teams are re-introducing classic infrastructure liabilities:
- Context Pollution & Agent Drift: Local history and environments alter the agent’s behavior. The prompt templates and guardrails that run cleanly on one machine fail to replicate across the team, making harmonization and discipline impossible.
- Local Harness Customizations: When developers hack local settings to force an agent to cooperate in a certain way, you lose a standardized, deterministic execution framework, something you may have invested in heavily when onboarding coding agents for your organization.
- Resource Exhaustion: Next-gen local harnesses eat memory like crazy. Running bloated agent loops on a local device inevitably leads to memory leaks and system crashes during long-duration tasks. Beefier machines and more RAM are not a permanent fix.
The solution to agent drift isn’t any different from how the industry solved the classic "it works on my machine" era a decade ago. You move the runtime off the local box.
We built predev to be entirely browser-native and cloud-sandboxed. No local setup, no local dependencies. Because the runtime lives in isolated cloud infrastructure, the operational dynamic changes:
- Persistent Execution: You can close your laptop without your agents going to sleep or dropping a thread. You can even check on your agents with your phone.
- Isolated Sandboxing: Separating the agent from your local network and local file system isn't just a baseline security requirement. It allows the agent to run blind, reproducible verification loops to ensure acceptance criteria are met without environmental bias.
- Predictive Resource Allocation: Because predev maps out the architecture and user stories before writing a single line of code, the system understands its upcoming constraints. The agent can autonomously configure its own isolated cloud resources, spinning up the exact sandbox environment and model tiers required for the specific task ahead.
Software orchestration belongs in a stable, standardized cloud runtime. Not crammed into a local terminal.
Below is a clip from our latest podcast episode, where Adam (@adampredev) and Arjun (@ArjunRajJain) reflect on the architecture of agent sandboxing and why local execution is a scaling liability.
@_itsjustshubh @adampredev Yes agreed, though the vast majority of MVPs never even make it to $1M ARR
@fabiolauria92 @adampredev Yes this is exactly right. The hard part though is calibrating for your exact stage, because building for 50k active connections when you only have 3 users is also overkill. You need a model that evolves with scale
@adampredev @ArjunRajJain Full Episode. Intelli-Gents: We Beat Claude Opus with a Smaller Model x.com/predotdev/stat…
On the latest episode of the Artificial Intelli-Gents Podcast, Adam (@adampredev) and Arjun (@ArjunRajJain) discuss the growing demand for forward-deployed engineers as AI-native startups rapidly ship new products and post tremendous ARR growth. Our partnership with Pangea (@pangea_ai) allows founders building with predev to access top-tier, hands-on tech talent to help navigate their GTM growth, all while orchestrating their product roadmap through our architecture-first coding agent.
Going from $0 to $1M ARR is easier, and faster, than ever. What happens to your MVP as you try to scale from $1M to $10M ARR is where the reality check hits. Coding agents are incredible for getting a prototype in front of users and driving early revenue. But viably translating
We believe the single biggest point of failure is before any code is written. This is solved with the planning layer - something that no other company has touched on adequately. Claude dynamic workflows might be a type of decomposition strategy similar to RLM, but that is still the execution layer. If you can estimate complexity and costs up front -> then you are plugging the biggest token leaks before they start.
How do you compete in the AI coding game against giants like Anthropic and OpenAI? You don't outspend them. You out-engineer their harnesses.
"The harness matters. Especially when it comes to cost efficiency and token usage. If you can get both an intelligence gain and a cost gain, then that’s really how you're chasing the edge in this coding game."
We break down how we just broke the model cost continuum.
1. On off-the-shelf SDKs: "We hand-built this harness because we tried the Claude Code SDK and the Open Code SDK. They both had gaping leaks."
2. On picking a benchmark that actually mirrors real engineering: "You have two types of benchmarks. Some just introduce isolated patches, but they aren't really multi-agent turn. We picked Terminal Bench because it tests multi-file, multi-turn execution right in the terminal."
3. On the results that broke the cost curve: "Our Haiku scored higher than Sonnet 4.5 on the leaderboard. We did about 16% better than Claude Code on Haiku. That’s a three times cost difference, and we somehow jumped that."
4. On the core thesis of next-gen software engineering: "The harness matters. Especially when it comes to cost efficiency and token usage. If you can get both an intelligence gain and a cost gain, then that’s really how you're chasing the edge in this coding game."
If you have been using Claude Code professionally, take a minute to read this. We beat Opus with Sonnet by using the predev harness. Here is what it means for agentic coding: Orchestration beats brute reasoning. A smaller model running on our architecture just beat Claude Opus
🃏
Coding with AI feels a lot like poker: I have to wager tokens with high variance on whether the solution is legitimate, while balancing the possibility that the opponent (the agent) is bluffing me. I have imperfect information because I can't account for every single bash command
@elena_builds Exactly. There is a reason everyone is rushing to optimize the harness. It's where efficiency and token ROI gains are greatest.
Full Episode: x.com/predotdev/stat…
instead of a tokenmaxxing leader board for devs we should have a leader board for enterprises wdyt?
Uber spent its entire annual AI budget in one quarter. The creator of Openclaw burns $1M a month on Codex tokens. What is the true ROI on AI token spend? Here is how we measure and increase it. When Fortune 100 CFOs question tokenmaxxing and Microsoft cancels Claude
Can a custom software harness make a low-tier model outperform a premium frontier model that costs three times as much?
Most enterprise teams are facing skyrocketing token bills because they think raw capital is the only path to intelligence. On the latest episode of the Artificial Intelli-Gents Podcast, predev co-founders Adam and Arjun break down exactly why the industry has hit a model cost wall.
Instead of waiting for updates from foundational labs, Adam Elkassas (@adampredev) and Arjun Raj Jain (@ArjunRajJain) hand-built a native cloud harness from scratch to fix the severe memory leaks and structural shortcuts found in standard SDKs. They ran their system against Terminal Bench, the most rigorous multi-file coding benchmark in the industry.
The results broke the standard model cost continuum: Running on a low-cost tier like Claude Sonnet, predev’s native harness outperformed Anthropic’s own Claude Code running on premium Opus 4.5.
How do you manufacture that kind of lopsided intelligence gain while cutting token costs by two-thirds? Here is a breakdown of the episode and how they out-architected foundational labs valued at hundreds of billions:
- Why Terminal-Bench: The truth about why standard benchmarks fail to test true, multi-turn agent capability in the wild.
- Breaking the Cost Continuum: The exact mechanics of how predev helps users maximize intelligence per token without breaking the bank.
- What Makes Our Harness Unique: Moving past simple for-loops into long-horizon planning layers and long-term execution graphs.
- Building Custom Browser Agents: How Arjun built a specialized browser agent layer that runs at 3x the speed and 1/3 the cost of market alternatives.
- Implementing Production RLM: The blueprint behind being the first team to truly implement Recursive Language Models to achieve unlimited execution depth.
- The Reality of Enterprise AI: Why raw agents fail out of the box in production, and how forward-deployed engineers scale MVPs to enterprise security standards.
- Upcoming Releases: A sneak peek into predev's next-gen CLI release, local-to-cloud syncing, and multi-session isolated sandboxes.
While frontier labs pour tens of billions into physical data centers, the real software alpha is being captured at the orchestration layer. If you want to see how architectural execution beats raw compute capital, this episode is your blueprint. Full episode in the quoted post below.
The Artificial Intelli-Gents Ep. 7: We Beat Claude Opus with a Smaller Model We did it: We dropped an entire model tier and still finished ahead. By proving that orchestration beats brute reasoning, our predev harness paired with Sonnet 4.6 (56.2%) officially beat Claude Code
Harness “engineering” is a combination of research and engineering. It takes both
The Artificial Intelli-Gents Ep. 7: We Beat Claude Opus with a Smaller Model
We did it: We dropped an entire model tier and still finished ahead. By proving that orchestration beats brute reasoning, our predev harness paired with Sonnet 4.6 (56.2%) officially beat Claude Code running Opus 4.5 (53.9%) on the grueling Terminal-Bench 2.0. The best part? We achieved higher accuracy while significantly cutting our per-task token bill.
Also in episode 7 of The Artificial Intelli-Gents, we dive into the current state of RLM, why "harness engineering" is the next frontier, and how to leverage forward-deployed engineers.
Timestamps:
2:28 - Why Terminal-Bench
9:33 - How we did it & what the results mean
18:05 - What makes our harness unique
27:00 - Building our own Browser Agents
34:35 - Implementing Recursive Language Models
41:20 - The future of benchmarks
51:00 - Forward Deployed Engineers & Dev Shops
1:04:00 - The harness of harnesses
1:17:37 - Upcoming Releases
@adampredev @ArjunRajJain
@adampredev @ArjunRajJain lets go Intelli-Gents are back!
Another state of the union for AI coding live and uncut from the founders of predev
Episode 7 of the Intelli-Gents Podcast out now! with @ArjunRajJain @predotdev We speak about topics ranging from TBench2.0, harness engineering, coding agent self-improvement, RLM, and more! youtube.com/watch?v=GB36g2…
Founder live sessions continue
@ArjunRajJain and I recorded a 90-minute episode of the Intelli-Gents. We are releasing it on YouTube on Monday. We touch on topics such as coding benchmarks, RLM, self-improvement, harness engineering, and more!