Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel
But serving long context is extremely memory constrained, especially when you're making a lot of predictions. I actually don't know why output tokens are more expensive than input tokens, but I think essentially with output tokens, you have to do more computation because you have to sample from the model.
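To make the memory point concrete, here is a minimal back-of-the-envelope sketch (not from the episode; the layer count, head dimensions, parameter count, and context lengths are all illustrative assumptions). It shows how the KV cache grows linearly with context length, and how input tokens can be processed in one parallel prefill pass while each sampled output token needs its own sequential forward pass.

```python
# Illustrative numbers only -- not the architecture of any specific model.

def kv_cache_bytes(seq_len, n_layers=60, n_kv_heads=8, head_dim=128, bytes_per_elem=2):
    """Memory held per sequence for the attention keys and values across all layers."""
    # Factor of 2 accounts for storing both K and V.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * seq_len

def prefill_flops(prompt_len, params=70e9):
    """Input (prompt) tokens: ~2*params FLOPs per token, but the whole prompt
    is processed together in one parallel, matmul-friendly pass."""
    return 2 * params * prompt_len

def decode_flops(new_tokens, params=70e9):
    """Output tokens: each sampled token requires its own forward pass, so the
    work is serialized and tends to be memory-bandwidth bound."""
    return sum(2 * params for _ in range(new_tokens))

if __name__ == "__main__":
    ctx = 128_000  # long-context prompt length (illustrative)
    print(f"KV cache for a {ctx:,}-token prompt: "
          f"{kv_cache_bytes(ctx) / 1e9:.1f} GB per sequence")
    print(f"Prefill FLOPs (one parallel pass over the prompt): {prefill_flops(ctx):.3e}")
    print(f"Decode FLOPs for 1,000 output tokens (1,000 sequential passes): "
          f"{decode_flops(1_000):.3e}")
```

The total FLOPs for decoding are comparable per token to prefill, but because decoding runs one token at a time while holding the full KV cache in memory, long-context serving ends up constrained by memory capacity and bandwidth rather than raw compute.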