Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Nathan Lambert
And the model will learn which expert to route to for different tasks. And so this is a humongous innovation in terms of, hey, I can continue to grow the total embedding space of parameters. And so DeepSeek's model is, you know, 600-something billion parameters. Relative to Llama 405B, that's 405 billion parameters; relative to Llama 70B, that's 70 billion parameters, right?
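[The routing idea described here can be sketched in a few lines. This is a toy illustration of learned top-k expert routing in a mixture-of-experts layer; all names, shapes, and the top_k value are illustrative assumptions, not DeepSeek's actual implementation.]

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Route one token through the top-k experts of a mixture-of-experts layer.

    x       : (d,) token embedding
    gate_w  : (d, n_experts) learned router weights
    experts : list of (d, d) expert weight matrices
    top_k   : how many experts this token is sent to
    """
    logits = x @ gate_w                # router score per expert
    top = np.argsort(logits)[-top_k:]  # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()           # softmax over the chosen experts only
    # Only the selected experts run, so compute per token stays small
    # even as total parameter count grows with the number of experts.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
d, n_experts = 8, 4
x = rng.normal(size=d)
gate_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]
y = moe_forward(x, gate_w, experts)
print(y.shape)  # → (8,)
```

[The point the speakers are making follows from this structure: total parameters scale with the number of experts, but each token only activates `top_k` of them.]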