Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel
Going back to the efficiency and complexity point, right? It's 32 versus 4, right, for Mixtral and other MoE models that have been publicly released. So this sparsity ratio is extremely high. And what Nathan was getting at there was, when you have such a different level of sparsity, you can't just have every GPU hold the entire model, right? The model's too big.
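To make the arithmetic concrete, here's a minimal sketch (the conversation itself has no code, so treat this as illustration only) of the ratio being compared: total routed experts divided by experts activated per token. The expert counts come from the published DeepSeek-V3 and Mixtral 8x7B configurations.

```python
def sparsity_ratio(total_experts: int, active_experts: int) -> float:
    """Ratio of routed experts to experts activated per token."""
    return total_experts / active_experts

# DeepSeek-V3: 256 routed experts, 8 activated per token.
print(sparsity_ratio(256, 8))  # 32.0

# Mixtral 8x7B: 8 experts, 2 activated per token.
print(sparsity_ratio(8, 2))    # 4.0
```

At a 32x ratio, the vast majority of the model's parameters sit idle for any given token, which is why the weights get sharded across many GPUs rather than replicated on each one, the point being made here about not fitting the whole model per GPU.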