Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Nathan Lambert
But then when you're exchanging weights, if you're not able to overlap communication and compute perfectly, there may be a time period where your GPUs are just idle: you're exchanging weights and you're like, hey, the model's updating. So you're exchanging the gradients, you do the model update, and then you start training again. So the power goes down and then back up, and it's super spiky.
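A rough sketch of the loop Nathan is describing, assuming a PyTorch data-parallel setup with torch.distributed already initialized (e.g. over NCCL); the function and variable names are illustrative, not from the episode:

```python
import torch
import torch.distributed as dist

def train_step_without_overlap(model, optimizer, loss_fn, batch):
    # 1) Local compute: forward and backward keep the GPU busy.
    optimizer.zero_grad()
    loss = loss_fn(model(batch["x"]), batch["y"])
    loss.backward()

    # 2) Gradient exchange: every rank all-reduces its gradients.
    #    If this is not overlapped with the backward pass, the GPU sits
    #    mostly idle waiting on the network, and power draw drops.
    world_size = dist.get_world_size()
    for p in model.parameters():
        if p.grad is not None:
            dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
            p.grad /= world_size

    # 3) Model update, then the next step's compute starts and power
    #    spikes back up.
    optimizer.step()
    return loss.item()
```

In practice, frameworks such as PyTorch's DistributedDataParallel bucket gradients and launch these all-reduces while the backward pass is still running, which hides much of the communication; when that overlap isn't perfect, the idle-then-busy pattern described here shows up as spiky power draw across the cluster.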