Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel
And this is the sort of thing that just points to them innovating. And I'm sure all the labs that are training big MoEs are looking at this sort of thing, which is getting away from the auxiliary loss. Some of them might already use it, but you just keep accumulating gains. And we'll talk about... the philosophy of training and how you organize these organizations.