Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel
And this is something that the DeepSeek paper talked about as well: with a bigger model, it's easier to elicit powerful capabilities with this RL training, and then they distill it down from that big model to the small model. And with this model we released today, we saw the same thing. We're at AI2; we don't have a ton of compute. We can't train 405B models all the time.