Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel
This reasoning model has a lot of overlapping training steps to DeepSeq v3, and it's confusing that you have a base model called v3 that you do something to to get a chat model, and then you do some different things to get a reasoning model. I think a lot of the AI industry is going through this challenge of communications right now where OpenAI makes fun of their own naming schemes.
0
💬
0
Comments
Log in to comment.
There are no comments yet.