Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Nathan Lambert
Because now if you think about the parameter count as the sort of total embedding space for all of this knowledge that you're compressing down during training, When you're embedding this data in, instead of having to activate every single parameter every single time you're training or running inference, now you can just activate a subset.
0
💬
0
Comments
Log in to comment.
There are no comments yet.