Lex Fridman Podcast

#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

1883.869 - 1901.025 Dylan Patel

They train the model to do this specifically where they have a section, which is reasoning, and then it generates a special token, which is probably hidden from the user most of the time, which says, okay, I'm starting to answer. So the model is trained to do this two-stage process on its own. If you use a similar model in, say, OpenAI, OpenAI's user interface is...
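As a rough illustration of the two-stage output described here, the sketch below splits raw model text into a reasoning section and an answer at a delimiter token. The marker string `<|answer|>` and the helper `split_reasoning` are hypothetical names chosen for this example; the transcript does not name the actual token, and real models use their own.

```python
# Hypothetical marker: stands in for the model's "I'm starting to answer" token.
ANSWER_TOKEN = "<|answer|>"


def split_reasoning(raw_output: str, marker: str = ANSWER_TOKEN) -> tuple[str, str]:
    """Split raw model text into (reasoning, answer) at the first marker."""
    reasoning, found, answer = raw_output.partition(marker)
    if not found:
        # No marker present: treat the whole text as the answer.
        return "", raw_output.strip()
    return reasoning.strip(), answer.strip()


raw = "The user asks for 2 + 2. Adding gives 4.<|answer|>2 + 2 = 4"
reasoning, answer = split_reasoning(raw)
print(answer)  # a UI could show only this part and hide the reasoning
```

A user interface could then display only `answer` and either hide `reasoning` or show it in a collapsed panel.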
