Lex Fridman Podcast

#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

15140.557 - 15152.153 Nathan Lambert

But then it's actually eight billion parameters because the vocabulary is so large. And the reason they made the vocabulary so large is that the TPU's matrix multiply unit is massive, because that's what they've sort of optimized for.
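
As a rough illustration of the arithmetic behind this remark, a token embedding table holds one vector per vocabulary entry, so its parameter count is the vocabulary size times the hidden dimension. The sketch below uses placeholder numbers that are assumptions for illustration only; the episode does not state them, and the function name `embedding_params` is hypothetical.

```python
def embedding_params(vocab_size: int, hidden_dim: int, tied: bool = True) -> int:
    """Count parameters in the token embedding table.

    If the input embedding and output projection are not tied (shared),
    the vocabulary-sized matrix is counted twice.
    """
    table = vocab_size * hidden_dim
    return table if tied else 2 * table


# Hypothetical sizes, chosen only to show the scaling (not from the episode).
HIDDEN_DIM = 3_072
SMALL_VOCAB = 32_000
LARGE_VOCAB = 256_000

small = embedding_params(SMALL_VOCAB, HIDDEN_DIM)
large = embedding_params(LARGE_VOCAB, HIDDEN_DIM)

print(f"small vocab: {small:,} embedding parameters")
print(f"large vocab: {large:,} embedding parameters")
print(f"extra parameters from the larger vocabulary: {large - small:,}")
```

Under these assumed sizes, the larger vocabulary alone adds several hundred million parameters, which is the kind of effect that can push a model's total above its nominal size.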
