Distributed training seems extremely concerning. There does seem to be a bit of an efficiency drop, but it would make it much easier for bad actors to create powerful models unless there are safeguards like on-chip controls to head it off.
Lennart's comment on memory controls is also an option, at least for reasoning models.
> Lennart's comment on memory controls
Where could I find this?
It's on his Twitter.
Really enjoyed the story!
It would be really interesting IMO to see a model try to run a YouTube channel, with all of the challenges that entails, e.g. choosing an audience and an upload schedule, as well as keeping up with and staying ahead of trends.