Lex Fridman Podcast

#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

15152.193 - 15169.987 Nathan Lambert

And so they decided, oh, well, I'll just make the vocabulary large too, even though it makes no sense to do so on such a small model, because that fits on their hardware. So Gemma doesn't run as efficiently on a GPU as a Llama does, right? But vice versa, Llama doesn't run as efficiently on a TPU as a Gemma does. And so there's certain aspects of hardware-software co-design.
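To make the vocabulary remark concrete, here is a minimal arithmetic sketch of how much of a small model's parameter budget a single embedding table can take up. The vocabulary sizes, hidden width, and total parameter count are hypothetical round numbers chosen only for illustration; they are not the actual Gemma or Llama configurations, and holding the total fixed while the vocabulary changes is a simplification.

```python
def embedding_share(vocab_size: int, d_model: int, total_params: int) -> float:
    """Fraction of total parameters held by one vocab_size x d_model embedding table."""
    return vocab_size * d_model / total_params


# Hypothetical values, not taken from any published model card.
D_MODEL = 2_048
TOTAL_PARAMS = 2_000_000_000

small_vocab = embedding_share(vocab_size=32_000, d_model=D_MODEL, total_params=TOTAL_PARAMS)
large_vocab = embedding_share(vocab_size=256_000, d_model=D_MODEL, total_params=TOTAL_PARAMS)

print(f"small vocabulary: {small_vocab:.1%} of parameters in the embedding table")
print(f"large vocabulary: {large_vocab:.1%} of parameters in the embedding table")
```

Under these assumed numbers, the larger vocabulary moves the embedding table from a few percent of the model to roughly a quarter of it, which is the sense in which a large vocabulary is costly on a small model. Whether that cost is worth paying depends on the target hardware, which is the co-design point being made.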
