Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Nathan Lambert
and I generate a token, and I append that one token I generated, and its KV, to the KV cache, and then I do it again, right? And so therefore, this is a non-parallel operation. And this is one where you have to, you know, in the case of pre-fill or prompt, you pull the whole model in and you calculate 20,000 tokens at once, right?
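The contrast described here, a prompt processed in one pass versus tokens generated one at a time against a growing KV cache, can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any particular model or serving stack does it: `DIM`, `VOCAB`, `embed`, `kv_for`, `attend`, `pick_token`, `prefill`, `decode_step`, and `generate` are all hypothetical names, and the "model" is a deterministic stand-in with no learned weights.

```python
import math

DIM = 4     # hypothetical toy hidden size
VOCAB = 16  # hypothetical toy vocabulary size


def embed(token):
    # Deterministic stand-in for a learned embedding table (assumption).
    return [math.sin(token * (i + 1)) for i in range(DIM)]


def kv_for(token):
    # Stand-in for the key and value projections of a single token.
    x = embed(token)
    key = [2.0 * a for a in x]
    value = [a + 1.0 for a in x]
    return key, value


def attend(query, cache):
    # Softmax attention of one query over every cached (key, value) pair.
    scores = [sum(q * k for q, k in zip(query, key)) for key, _ in cache]
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [
        sum(w * value[i] for w, (_, value) in zip(weights, cache)) / total
        for i in range(DIM)
    ]


def pick_token(vector):
    # Toy replacement for the output projection plus sampling.
    return int(abs(sum(vector)) * 1000) % VOCAB


def prefill(prompt):
    # Every prompt token is known up front, so each K/V entry is independent
    # of the others and a real system can compute them all in parallel.
    cache = [kv_for(t) for t in prompt]
    first = pick_token(attend(embed(prompt[-1]), cache))
    return cache, first


def decode_step(cache, token):
    # Append the K/V of the one token just generated, then attend over the
    # whole cache to produce the next token.
    cache.append(kv_for(token))
    return pick_token(attend(embed(token), cache))


def generate(prompt, n_new):
    # Assumes n_new >= 1. Each step needs the previous step's token,
    # so this loop cannot be parallelized across steps.
    cache, token = prefill(prompt)
    generated = [token]
    while len(generated) < n_new:
        token = decode_step(cache, token)
        generated.append(token)
    return generated, cache


if __name__ == "__main__":
    tokens, cache = generate([3, 1, 4, 1, 5], n_new=4)
    print(tokens, len(cache))
```

In this sketch the cache holds one entry per prompt token after `prefill`, and grows by exactly one entry per `decode_step`, which is the append-and-repeat loop described above.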