Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Nathan Lambert
But DeepSeek's implementation is so complex, right? Especially with their mixture of experts. People have done mixture of experts, but they're generally eight or 16 experts, right? And they activate two. So, you know, one of the terms that we like to use is sparsity factor, or usage, right? So you might have, you know, one fourth of your model activate, right?
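To make the arithmetic concrete, here is a minimal sketch of the "activate two out of eight" idea. It is not DeepSeek's implementation or any library's API: the helper names `active_fraction` and `top_k_route` and the router logits are hypothetical, chosen only for illustration. The fraction also counts expert parameters only; attention layers and any shared parameters, which run for every token, are ignored.

```python
import math
from fractions import Fraction


def active_fraction(num_experts: int, experts_per_token: int) -> Fraction:
    """Fraction of experts used per token (hypothetical helper)."""
    if not 0 < experts_per_token <= num_experts:
        raise ValueError("experts_per_token must be between 1 and num_experts")
    return Fraction(experts_per_token, num_experts)


def top_k_route(router_logits: list, k: int) -> dict:
    """Keep the k highest-scoring experts and softmax-normalize their weights.

    Returns a mapping from expert index to mixing weight. This is a generic
    top-k router sketch, assumed for illustration.
    """
    ranked = sorted(range(len(router_logits)),
                    key=lambda i: router_logits[i], reverse=True)
    chosen = ranked[:k]
    peak = max(router_logits[i] for i in chosen)  # subtract max for stability
    scores = {i: math.exp(router_logits[i] - peak) for i in chosen}
    total = sum(scores.values())
    return {i: s / total for i, s in scores.items()}


# Numbers from the discussion: 8 experts, 2 active per token.
fraction = active_fraction(8, 2)  # one fourth of the experts

# Made-up router scores for a single token, one score per expert.
weights = top_k_route([0.1, 2.0, -1.0, 0.5, 1.5, 0.0, -0.3, 0.2], k=2)

print(f"active fraction: {fraction}")
print(f"selected experts and weights: {weights}")
```

With 16 experts and two active, the same helper gives one eighth, so a sparser model runs a smaller share of its expert parameters per token.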