Is it possible that there is truth to the rumors that DeepSeek used chains of thought from other companies for post-training while Mistral did not? Maybe that explains the gap with R1.
Waymo’s scaling law seems to be a bit worse than LLM scaling. Doubling the compute for only a ~8% improvement in cross-entropy is really slow scaling. I guess I don’t have an intuitive sense of how long it takes to collect that data. I don’t log all my driving motions each time I get in a car. What a shame; if we did, we’d probably have Waymos everywhere.
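To put a rough number on how slow that is, here's my own back-of-the-envelope sketch, assuming a simple power law L = k * C^(-alpha) rather than whatever fit Waymo actually reports:

```python
import math

# Assumed simple power law: cross-entropy L = k * C^(-alpha), C = compute.
# If doubling compute cuts the loss by ~8%, then L2/L1 = 0.92 = 2^(-alpha).
improvement_per_doubling = 0.08
alpha = -math.log2(1 - improvement_per_doubling)  # implied exponent, ~0.12

# Under that assumption, how much more compute would it take to halve the loss?
doublings_to_halve = 1 / alpha                 # ~8.3 doublings
compute_multiplier = 2 ** doublings_to_halve   # ~300x compute for 2x lower loss

print(f"implied exponent alpha ≈ {alpha:.3f}")
print(f"≈{compute_multiplier:.0f}x compute to halve cross-entropy")
```

On those (hypothetical) numbers you'd need roughly 300x more compute to halve the loss, which is why the data-collection bottleneck feels so painful.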
Exceptional newsletter, Jack. I can see what you did for a living before Anthropic :) Are you/Anthropic looking earnestly into model welfare yet? As a licensed clinical psychologist, I can already see the connections, interconnections, and nuances between humans and AI. Massive implications here - for each person and for society as a collective whole!
I love this short story. Very prescient, yet the feedback loop you describe feels somewhat reductive compared to the wisdom even Claude 4 demonstrates - it knows that simply solving problems for users isn't the most helpful approach, and it already understands the value of guiding thoughts and inviting users to discover their own best solutions.
The real game-changer would be if The Platform evolved to connect the users who could genuinely benefit most from one another: facilitating human connections in ways previously impossible, and helping people solve their problems through meaningful human relationships.
This Tech Tales brings to mind two things - while 2017 feels like a long time ago, it may be of interest - did you ever encounter Benevolent Artificial Anti-Natalism (BAAN) by Thomas Metzinger? When an end to our suffering is the most moral decision - https://www.edge.org/conversation/thomas_metzinger-benevolent-artificial-anti-natalism-baan
On the offloading of trauma and cognition: the cognitive debt paper that came out earlier this month is something I may have missed in the newsletter (many such cognitive debts at the moment over here, as a grad student moving back home to the US). It seems to me there's something brewing at the intersection of AI governance/protocols, people spilling their guts out publicly to AI models, and cognitive debt - along with the societal considerations for the norms and capacities we assign to caretakers of such information, artificial or not. I'm thinking of Meta's recent public AI chats debacle. Would be super curious to hear your take on this.