5 Comments
Jonathan Kreindler

INTIMA identifies dynamics that existing safety evaluations completely miss - like the finding that boundary-maintaining behaviors actually decrease as user vulnerability increases. INTIMA operationalizes the psychological theories (parasocial interaction, attachment, anthropomorphism) that explain why these dynamics are so dangerous.

The fact that Claude and other models show such different boundary-setting patterns suggests we need active monitoring systems, not just better training. The psychological risks are too high to catch only in retrospective analysis.

We now need to shift from post-hoc evaluation to real-time detection of psychologically risky conversational dynamics, so systems like Claude can flag these patterns as they happen: when users show increasing vulnerability, when conversations drift into unhealthy attachment territory, or when AI responses inadvertently exploit users' psychological needs.

I've built a psycholinguistic model that detects these risks in real-time:

https://kreindler.substack.com/p/detecting-and-preventing-psychological

Steeven

The vending machine experiment is very funny. It seems like the type of thing an AI and a database should be really good at, I’m almost suspicious that these results are because the experiment was done poorly.

Keith Wilkinson

I really like using fiction to open our imaginations to the future. Clearly there is a gap between expert knowledge and public understanding. I always loved Arthur C. Clarke for this.

Here is something that really frustrates me: how is it that SF government processes operate like it's the 1980s, when Anthropic is a mile down the street?

What I would love is if we made our own sci-fi experiment: take a specific, totally mundane public task and trick it out with an AI partner, like a handheld Ziggy from Quantum Leap or KITT from Knight Rider. A small project that stirs the imagination for the next step.

I wonder if a barrier to the public understanding alignment and AI safety is that they just cannot picture the reality of it. Would we in 2005 have been open to the harms of smartphone attention capitalism? Surely there was fiction about it and hints of the future. But the way culture dove headfirst into trading attention for dopamine shows we didn't really understand the danger. Maybe mundane adoption in blue-collar tasks is the first step to making the rewards and risks a reality for the general public.

And yes this is a long winded request for a new toy at work lol.

Tony Rifkin

Wow. Great one this week, Jack. Felt like you covered all the bases (as they unfold!)

Paul Triolo

Excellent comments on Heteroscale... innovation in heterogeneous and distributed compute happening in China all over the place.
