The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch
20VC: Deepseek Special: Is Deepseek a Weapon of the CCP | How Should OpenAI and the US Government Respond | Why $500BN for Stargate is Not Enough | The Future of Inference, NVIDIA and Foundation Models with Jonathan Ross @ Groq
Jonathan Ross
However, because it's sparse, because it's a mixture of experts, they're not doing as much computation. And part of the cleverness was figuring out how they could have so many experts, so that it could be so sparse and they could skip so many of the parameters.
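To make the point about sparsity concrete, here is a minimal sketch of top-k mixture-of-experts routing in plain Python. It is an illustration only, not the model's actual implementation: the function names (`top_k_route`, `moe_forward`, `active_fraction`), the dense-matrix experts, and the sizes used (8 experts, 2 active per token) are all assumptions chosen for readability. The idea it shows is that a router scores every expert, only the k best-scoring experts run, and the parameters of the remaining experts are skipped for that token.

```python
import math
import random


def top_k_route(scores, k):
    """Return the indices of the k highest router scores, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def apply_expert(matrix, x):
    """Hypothetical expert: a single dense matrix-vector product."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in matrix]


def moe_forward(x, router, experts, k):
    """Run only the k experts the router prefers for input x.

    router  : one weight vector per expert (score = dot product with x)
    experts : one square matrix per expert
    Returns (output vector, indices of the experts that actually ran).
    """
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in router]
    chosen = top_k_route(scores, k)
    # Gate weights are normalised over the chosen experts only.
    gates = softmax([scores[i] for i in chosen])
    out = [0.0] * len(x)
    for gate, idx in zip(gates, chosen):
        y = apply_expert(experts[idx], x)  # unchosen experts are never touched
        out = [o + gate * yi for o, yi in zip(out, y)]
    return out, chosen


def active_fraction(num_experts, k):
    """Share of expert parameters used per token (equal-sized experts)."""
    return k / num_experts


# Illustrative sizes only; these are not any real model's configuration.
rng = random.Random(0)
dim, num_experts, k = 4, 8, 2
router = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_experts)]
experts = [
    [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)]
    for _ in range(num_experts)
]
x = [rng.uniform(-1, 1) for _ in range(dim)]
out, chosen = moe_forward(x, router, experts, k)
print("experts used:", chosen)
print("fraction of expert parameters active:", active_fraction(num_experts, k))
```

In this toy setup, raising the number of experts while holding k fixed lowers the fraction of expert parameters touched per token, which is the trade-off the quote describes: more experts, more sparsity, more parameters skipped.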