Lex Fridman Podcast

#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

823.669 - 849.783 Dylan Patel

Yeah, so DeepSeek V3 is a new mixture-of-experts Transformer language model from DeepSeek, which is based in China. They have some new specifics in the model that we'll get into. Largely, this is an open-weight model, and it's an instruction model like what you would use in ChatGPT. They also released what is called the base model, which is the model before these post-training techniques are applied.
