Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Nathan Lambert
And what this means is, when you look at the common open models that most people have been able to interact with, right? Think Llama. Llama is a dense model, i.e., every single parameter or neuron is activated as you're going through the model for every single token you generate, right? Now, with a mixture-of-experts model, you don't do that, right?
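To make the dense versus mixture-of-experts contrast concrete, here is a minimal sketch in plain Python. It is not how Llama or any particular model is implemented. The names `moe_forward`, `router`, `experts`, and the sizes `D_MODEL`, `N_EXPERTS`, and `TOP_K` are all hypothetical, and top-k softmax routing is assumed as one common choice. Setting `top_k` to the number of experts stands in for the dense case, where every parameter is touched for every token. That is a simplification, since a real dense layer has no router.

```python
import math
import random


def matvec(weights, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def param_count(matrix):
    return len(matrix) * len(matrix[0])


def moe_forward(experts, router, x, top_k):
    """Route one token through the top_k highest-scoring experts.

    Returns the output vector, the chosen expert indices, and the number
    of expert parameters that were actually used for this token.
    """
    scores = matvec(router, x)  # one routing score per expert
    ranked = sorted(range(len(experts)), key=lambda i: scores[i], reverse=True)
    chosen = ranked[:top_k]

    # Softmax over the chosen experts only (assumed gating scheme).
    peak = max(scores[i] for i in chosen)
    exps = {i: math.exp(scores[i] - peak) for i in chosen}
    total = sum(exps.values())

    out = [0.0] * len(experts[0])
    for i in chosen:
        gate = exps[i] / total
        y = matvec(experts[i], x)
        out = [o + gate * yi for o, yi in zip(out, y)]

    active = sum(param_count(experts[i]) for i in chosen)
    return out, chosen, active


rng = random.Random(0)
D_MODEL, N_EXPERTS, TOP_K = 4, 8, 2  # toy sizes, chosen for illustration

experts = [random_matrix(D_MODEL, D_MODEL, rng) for _ in range(N_EXPERTS)]
router = random_matrix(N_EXPERTS, D_MODEL, rng)
token = [rng.uniform(-1.0, 1.0) for _ in range(D_MODEL)]

total_params = sum(param_count(e) for e in experts)

# "Dense-like": every expert, and so every parameter, is used for the token.
dense_out, dense_chosen, dense_active = moe_forward(experts, router, token, N_EXPERTS)

# Mixture of experts: only TOP_K experts are used for the token.
sparse_out, sparse_chosen, sparse_active = moe_forward(experts, router, token, TOP_K)

print(f"total expert parameters: {total_params}")
print(f"active per token (dense-like): {dense_active}")
print(f"active per token (top-{TOP_K} routing): {sparse_active}")
```

With these toy sizes, the routed version touches 2 of 8 experts per token, so a quarter of the expert parameters, while the total parameter count stays the same.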