Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel
Yeah, so DeepSeek-V3 is a new mixture-of-experts transformer language model from DeepSeek, which is based in China. They have some new specifics in the model that we'll get into. Largely, this is an open-weight model, and it's an instruction model like what you would use in ChatGPT. They also released what is called the base model, which is the model before these techniques of post-training.