Coder Radio
605: The Democrats Behind DeepSeek
Chris
So then version three comes out yesterday, and it builds on top of all of that stuff. Those were some of the big breakthroughs. But then version three comes out with an even better approach to load balancing, which further reduces communication overhead, plus multi-token prediction in training, and that made it a lot cheaper to train.