Coder Radio

589: Blame the Tools using the Tools

2124.44 - 2145.754 Mark Zuckerberg

So that leads to this dynamic where Llama 3, we could train on 10,000 to 20,000 GPUs. Llama 4, we could train on more than 100,000 GPUs. Llama 5, we can plan to scale even further. And there's just an interesting question of how far that goes. It's totally possible.
