5 Comments
Shon Pan

Distributed training seems extremely concerning. There does seem to be a bit of an efficiency drop, but it would still massively enable bad actors to create powerful models unless there are measures like on-chip controls to help head it off.

Lennart's comment on memory controls is also an option, at least for reasoning models.

sepiatone

> Lennart's comment on memory controls

Where could I find this?

Shon Pan

It's on his Twitter.

Frank Herfert

Really enjoyed the story!

Omari

It would be really interesting, IMO, to see a model try to run a YouTube channel, with all the challenges that entails: choosing an audience, keeping an upload schedule, and staying up to date with (and ahead of) trends.