Coder Radio

605: The Democrats Behind DeepSeek

531.893 - 550.579 Chris

So then version three comes out yesterday, and it builds on top of all of that stuff. Those were some of the big breakthroughs. But then version three comes out with an even better approach to load balancing, which further reduces communication overhead, and with multi-token prediction in training, which made it a lot cheaper to train.
