Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Nathan Lambert
Actually, over the last couple of days, we've seen a lot of people distill DeepSeek's model into Llama models, because the DeepSeek models are kind of complicated to run inference on: they're a mixture of experts, they're 600-plus billion parameters, and all this. And people distill them into the Llama models because...