Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel
We use this in our like Tulu series of models and you can elicit the same behaviors where you say like weight and so much on, but it's so late in the training process that this kind of reasoning expression is much lighter. Yeah. So there's essentially a gradation, and just how much of this RL training you put into it determines how the output looks.
0
💬
0
Comments
Log in to comment.
There are no comments yet.