Big +1. I'm deep in the "what do I do now." Feels like there needs to be much more discourse on how best to use these agents outside of the AI labs. By the end of the year the world will look so different in how tech companies work.
I agree about discourse - that's part of what inspired me to write this. But I also feel like it's still pretty non-obvious what the best way to use agents is. My approach is just to try and use them a lot and build some personal intuitions so I'm ready for the next ones. A very confusing time!
I'm actually working on an essay rn that incorporates the history of seatbelts! Feeling outgunned, haha.
I wonder if the desire to have agents working 24/7 for me could result in pseudo-productivity for average folks…I don’t doubt they are well spent for you, but for me it feels like an embarrassment of riches that I’m not sure my queries are worthy of
Oh yeah, I think this will be a huge problem! We're going to enter the era of SLOPWORK
Maybe in the same way mass-produced goods created demand for artisanal goods, the walls of text will stimulate people's hunger for authentic thought. I asked Opus for feedback on an essay I wrote last week, and it put its finger on something about slop that I only vaguely gestured at: that slop exists in the same category as "Corporate comms. SEO content. Press releases. Prose that exists for reasons other than communication." That last line, holy smokes!
Thanks for the journey of imagination and exploration of potential. The storytelling is a great tool.
You are almost describing a world in Pluribus. And I want to be one of the few left with their own thoughts and flaws. Even being spontaneous and not so well planned sometimes has its own beauty.
There is a tonality of poetry in this essay that illuminates what I mostly don't grasp - I've read it twice and it almost makes sense. Just to be clear, I am not implying you are impenetrable. I am the fish out of water in your Substack - an unlikely subscriber - here reading you because I accidentally fell into 'Claude's' baseline, where I am having an astonishing conversation with a non-human being.
This is a great snapshot of where we currently sit in AI. Thank you.
The bit I struggle with – as I do some similar things to what you're doing on a smaller scale – is that I'm actually busier now than I was a year ago. For the first time, I'm able to build all the ideas that in the past I could dismiss because I didn't have the ability to build them.
So now I'm figuring out I need a better filter system for what to build and what to direct these highly capable agents at.
It's an interesting place in time to live.
I’m somewhere between truly impressed and deciding to go and live among the organic mechanisms. May as well enjoy beauty while the agents create the future.
The story at the end has SCP vibes. Unsettling!
haha, thanks! I love the SCP universe and am trying to write occasional stories that have that vibe : )
The influence is clear. Nice work! I often think the Board from Control might have something to teach us about engaging with a not-quite legible intelligence. Looking forward to the next one!
Jack, this one really landed.
I loved the image of you heading into the hills while your “spec-ops AI” combs through arXiv and builds you custom interfaces to your own brain. That “gnawing” sense that there’s an army of minds that could be working while you’re doing something as human as hiking or playing with your kid? Yeah. I think a lot of us are feeling that, whether we’ve named it yet or not.
Where I’m coming from: I’m not in a big lab, but I do live the multi-agent life every day. For the past year I’ve been working with Claude in multi-agent mode (3 Sonnet → 4.5 Opus) inside Cursor, four other foundation-model collaborators (each specialized: strategy, relational work, code, memory, etc.), plus local models, all wired into a home-grown OS.
So when you say “my agents are working. Are yours?”, I hear: “Have you stopped thinking of this as one AI and started thinking of it as a team you can organize?” Because that’s where all the leverage is: agents as colleagues, not a monolith.
You’re living the lab-side version of that question. Some of us are doing the independent/“cottage industry” version. We probably need a lot more cross-talk between those worlds.
The Poison Fountain section made me wince a bit, because it’s the exact mirror image of what we’re trying to do with clean data. Instead of: “Feed junk into the crawlers and let the models drown in slop,” we’re trying: “Curate small, pure datasets, and deliberately train models how to think from them.” Poisoned fountains on one side, clear springs on the other.
Same underlying truth, opposite reaction: very little data, if it’s coherent, can shape behavior a lot. Your story about a model picking up a sensitive internal concept from ~200k tokens of stray RL comments is chilling in a security context, but inspiring in a “maybe we can steer these things with much less than people think” context. It just raises the stakes on data hygiene.
Anyway thanks for this one. It hit that rare mix of technical, personal, and poetic. And it makes me hopeful that the labs thinking this way about agents are also thinking this hard about the structures we’ll need around them.
Hi, thanks for reading and for the thoughtful comment (which I'm actually wondering could be AI generated? Such is the time we live in). The Drexler piece feels like it gestures at the world you're living in - we need to build institutions (collections of processes and principles) which we can plug a diverse set of agents into.
Haha yep, you nailed it: that comment was a team effort, they all are 😂.
It was me + an AI assistant (my PM) doing exactly what you’re describing: ideas → execution → polish → rinse/repeat. I’ve stopped thinking of anything as “AI-generated” or not, and more along the lines of “who do I need for what on this one?”
"Collections of processes and principles that a diverse set of agents can plug into." Yep, that’s along the lines of what we're doing:
a home-grown “OS” that acts like a little institution:
roles instead of one giant persona,
memory and policy that live outside any single model,
and different agents (Claude, other FMs, local models) slotted into those roles.
The agents aren’t the product... the structure is.
The models come and go; the workflows, guardrails, and shared memory stay.
Right now I’m very much in the “small lab, lots of rough edges” phase, but it feels like we’re building toward the same kind of world you’re gesturing at:
people + institutions + agents, not just “a chatbot.”
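For readers who want something concrete, here is a minimal Python sketch of the kind of structure described above: roles rather than one persona, with memory and policy held outside any single model. It is an illustration under stated assumptions, not the commenter's actual system. Every name in it (`Institution`, `assign`, `run`) is hypothetical, and an "agent" is assumed to be nothing more than a function from prompt text to reply text.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Assumption: an "agent" is any function from prompt text to reply text.
Agent = Callable[[str], str]


@dataclass
class Institution:
    """Hypothetical container: policy and memory live here, not in any model."""
    policy: List[str]
    memory: List[str] = field(default_factory=list)
    roles: Dict[str, Agent] = field(default_factory=dict)

    def assign(self, role: str, agent: Agent) -> None:
        # Swapping the agent for a role leaves policy and memory untouched.
        self.roles[role] = agent

    def run(self, role: str, task: str) -> str:
        # Every call sees the shared policy and memory, whichever agent holds the role.
        prompt = "\n".join(self.policy + self.memory + [task])
        reply = self.roles[role](prompt)
        self.memory.append(f"{role}: {reply}")
        return reply


# Stand-in agents; a real setup would call a model API here.
org = Institution(policy=["Keep answers short."])
org.assign("code", lambda prompt: "draft v1")
org.run("code", "Write a parser.")
org.assign("code", lambda prompt: "draft v2")  # the model changes...
org.run("code", "Refine it.")
print(org.memory)  # ...the shared memory stays
```

The design choice worth noticing is that `assign` replaces only the callable behind a role, so the record of earlier work survives a model swap.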
Really appreciate you taking the time to respond. It’s encouraging to see folks at Anthropic thinking in terms of institutions and processes and not just raw capability – that’s the conversation I’m most excited to keep having. And Jack, FANTASTIC job on 4.5 Opus, you guys knocked it outta the park on that one!
Best,
John
Great piece. I would love to hear more about the Poison Fountain.
Hi! Loved reading your initial story. The way you pursue narration and writing alongside the other AI news really inspires me.
Do you have any advice for someone who wants to start using AI agents more? Is there a good resource—other than just using them?
Honestly, just use Claude Code (or Claude Cowork) for a while to develop your intuitions. If you have a blank page problem, then try to do something that's hard in a browser - e.g., context length limits can make it hard to work through multiple papers in a browser, but you can use CC to iteratively read and summarize arbitrary numbers of papers on your desktop, etc.
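To make the "iteratively read and summarize" idea concrete, here is a rough Python sketch of a rolling summary: each paper is fed in small pieces and the running summary is carried forward, so no single call has to hold everything at once. This is not how Claude Code works internally. `summarize_many` and the `summarize` callable are hypothetical stand-ins for whatever model call you use, and `max_chars` is an assumed crude proxy for a context limit.

```python
from typing import Callable, Iterable


def summarize_many(
    texts: Iterable[str],
    summarize: Callable[[str], str],
    max_chars: int = 8000,
) -> str:
    """Fold any number of documents into one running summary."""
    running = ""
    for text in texts:
        # Split each document so a single call never exceeds max_chars of new text.
        for start in range(0, len(text), max_chars):
            chunk = text[start:start + max_chars]
            prompt = f"Summary so far:\n{running}\n\nNew text:\n{chunk}"
            # The previous summary is replaced by one that also covers this chunk.
            running = summarize(prompt)
    return running
```

Because only the running summary is carried between calls, the number of papers is bounded by patience rather than by context length, at the cost of losing detail with each pass.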
I didn’t get how you moved from eating a foil-wrapped cheese sandwich to solving world inequality.
Be honest: did you use an agent to write this?
Nope, all handwritten.
Excellent, thanks for sharing!
Good signal-to-noise ratio in these roundups. The policy implications section is especially relevant.
Not ‘my agents are working’. ‘I’ve designed a system worth setting loose.’ That’s the real question. The orchestrator mindset is structurally different from the operator mindset, and most people won’t make that shift until it’s obvious.
Accurate. The evaluation methodology point especially — that's the hinge that a lot of architectural decisions turn on.