
Lex Fridman Podcast

#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

2258.408 - 2271.52 Nathan Lambert

And so versus, again, the Llama models, where all 70 billion or all 405 billion parameters must be activated, you've dramatically reduced your compute cost when you're doing training and inference with this mixture-of-experts architecture.
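The savings come from routing each token to only a few experts instead of running it through every parameter, as a dense model like Llama does. A minimal sketch of top-k expert routing, with hypothetical tiny dimensions (this is illustrative only, not DeepSeek's actual architecture or code):

```python
import numpy as np

# Illustrative mixture-of-experts routing. With E experts and top-k routing,
# only k experts' parameters are activated per token, so per-token compute
# is roughly k/E of a dense layer with the same total parameter count.
# All dimensions here are hypothetical toy values.

rng = np.random.default_rng(0)

d_model, d_ff = 8, 16      # toy model and feed-forward dimensions
num_experts, top_k = 4, 1  # route each token to 1 of 4 experts

# Each expert is a simple 2-layer MLP: (d_model x d_ff) then (d_ff x d_model).
experts = [
    (rng.standard_normal((d_model, d_ff)), rng.standard_normal((d_ff, d_model)))
    for _ in range(num_experts)
]
gate = rng.standard_normal((d_model, num_experts))  # router weights


def moe_forward(x):
    """Route a single token vector x through its top-k experts only."""
    scores = x @ gate
    chosen = np.argsort(scores)[-top_k:]           # indices of top-k experts
    # Softmax over only the chosen experts' scores.
    w = np.exp(scores[chosen] - scores[chosen].max())
    w = w / w.sum()
    out = np.zeros_like(x)
    for weight, idx in zip(w, chosen):
        w1, w2 = experts[idx]
        out += weight * (np.maximum(x @ w1, 0) @ w2)  # ReLU MLP expert
    return out, chosen


token = rng.standard_normal(d_model)
out, chosen = moe_forward(token)

params_per_expert = d_model * d_ff + d_ff * d_model
total_params = num_experts * params_per_expert    # what a dense model would run
active_params = top_k * params_per_expert         # what MoE actually runs
print(f"total expert params: {total_params}, active per token: {active_params}")
```

The point is the ratio in the last two lines: total parameters scale with the number of experts, but per-token compute scales only with the number of experts actually chosen.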
