Lex Fridman Podcast

#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

17439.005 - 17458.578 Dylan Patel

And this is something that the DeepSeek paper talked about as well: with a bigger model, it's easier to elicit powerful capabilities with this RL training. And then they distill it down from that big model to the small model. And with this model we released today, we saw the same thing. We're at AI2. We don't have a ton of compute. We can't train 405B models all the time.
