Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel
They train the model to do this specifically: it has a section, which is reasoning, and then it generates a special token, which is probably hidden from the user most of the time, that says, okay, I'm starting to answer. So the model is trained to do this two-stage process on its own. If you use a similar model from, say, OpenAI, OpenAI's user interface is...
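To make the two-stage output described here concrete, the following is a minimal sketch of how a client might separate the reasoning section from the final answer. The marker string `<|begin_answer|>` and the helper names `split_output` and `render_for_user` are hypothetical, chosen for illustration only; the transcript does not name the actual token, and real models and interfaces may handle this differently.

```python
from typing import NamedTuple

# Hypothetical marker. The transcript only says a "special token" exists;
# the real token string for any given model is an assumption here.
ANSWER_TOKEN = "<|begin_answer|>"


class ModelOutput(NamedTuple):
    reasoning: str  # text generated before the special token
    answer: str     # text generated after the special token


def split_output(raw: str, answer_token: str = ANSWER_TOKEN) -> ModelOutput:
    """Split raw generated text into its reasoning and answer stages."""
    reasoning, sep, answer = raw.partition(answer_token)
    if not sep:
        # No marker found: treat the whole text as the answer.
        return ModelOutput(reasoning="", answer=raw.strip())
    return ModelOutput(reasoning=reasoning.strip(), answer=answer.strip())


def render_for_user(raw: str, show_reasoning: bool = False) -> str:
    """Return what an interface might display; the marker itself is never shown."""
    out = split_output(raw)
    if show_reasoning and out.reasoning:
        return f"[reasoning]\n{out.reasoning}\n\n[answer]\n{out.answer}"
    return out.answer


if __name__ == "__main__":
    raw = "2 + 2 is 4, so double is 8." + ANSWER_TOKEN + " The result is 8."
    print(render_for_user(raw))
    print(render_for_user(raw, show_reasoning=True))
```

An interface that hides the reasoning would call `render_for_user(raw)`, while one that exposes it would pass `show_reasoning=True`; in both cases the special token stays hidden from the user.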