
Lex Fridman Podcast

#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

2258.408 - 2271.52 Nathan Lambert

And so versus, again, the Llama models, where all 70 billion or all 405 billion parameters must be activated, you've dramatically reduced your compute cost when you're doing training and inference with this mixture-of-experts architecture.
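The savings come from routing each token to only a few experts instead of running it through every parameter, as a dense model like Llama does. A minimal sketch of top-k expert routing, with hypothetical tiny dimensions (this is illustrative only, not DeepSeek's actual architecture or code):

```python
import numpy as np

# Illustrative mixture-of-experts routing. With E experts and top-k routing,
# only k experts' parameters are activated per token, so per-token compute
# is roughly k/E of a dense layer with the same total parameter count.
# All dimensions here are hypothetical toy values.

rng = np.random.default_rng(0)

d_model, d_ff = 8, 16      # toy model and feed-forward dimensions
num_experts, top_k = 4, 1  # route each token to 1 of 4 experts

# Each expert is a simple 2-layer MLP: (d_model x d_ff) then (d_ff x d_model).
experts = [
    (rng.standard_normal((d_model, d_ff)), rng.standard_normal((d_ff, d_model)))
    for _ in range(num_experts)
]
gate = rng.standard_normal((d_model, num_experts))  # router weights


def moe_forward(x):
    """Route a single token vector x through its top-k experts only."""
    scores = x @ gate
    chosen = np.argsort(scores)[-top_k:]           # indices of top-k experts
    # Softmax over only the chosen experts' scores.
    w = np.exp(scores[chosen] - scores[chosen].max())
    w = w / w.sum()
    out = np.zeros_like(x)
    for weight, idx in zip(w, chosen):
        w1, w2 = experts[idx]
        out += weight * (np.maximum(x @ w1, 0) @ w2)  # ReLU MLP expert
    return out, chosen


token = rng.standard_normal(d_model)
out, chosen = moe_forward(token)

params_per_expert = d_model * d_ff + d_ff * d_model
total_params = num_experts * params_per_expert    # what a dense model would run
active_params = top_k * params_per_expert         # what MoE actually runs
print(f"total expert params: {total_params}, active per token: {active_params}")
```

The point is the ratio in the last two lines: total parameters scale with the number of experts, but per-token compute scales only with the number of experts actually chosen.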
