Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel
Yeah, especially in the DeepSeek V3 paper, which is their pre-training paper. They were very clear that they are doing interventions on the technical stack that go at many different levels. For example, to get highly efficient training, they're making modifications at or below the CUDA layer for NVIDIA chips.
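[Editor's note: a minimal sketch of what "at or below the CUDA layer" can mean in practice, namely embedding PTX (NVIDIA's low-level virtual assembly) inside a CUDA C++ kernel. The kernel and its behavior here are purely illustrative and are not taken from DeepSeek's actual code, which is not shown in the episode.]

#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel: adds 1.0f to each element, but issues the add via
// inline PTX instead of the CUDA C++ '+' operator, i.e. it drops one level
// below the usual CUDA programming surface.
__global__ void add_one_ptx(float* data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float x = data[i];
        float y;
        // Inline PTX: single-precision add written directly in assembly.
        asm volatile("add.f32 %0, %1, %2;" : "=f"(y) : "f"(x), "f"(1.0f));
        data[i] = y;
    }
}

int main() {
    const int n = 8;
    float host[n] = {0, 1, 2, 3, 4, 5, 6, 7};
    float* dev;
    cudaMalloc(&dev, n * sizeof(float));
    cudaMemcpy(dev, host, n * sizeof(float), cudaMemcpyHostToDevice);
    add_one_ptx<<<1, n>>>(dev, n);
    cudaMemcpy(host, dev, n * sizeof(float), cudaMemcpyDeviceToHost);
    for (int i = 0; i < n; ++i) printf("%g ", host[i]);
    printf("\n");
    cudaFree(dev);
    return 0;
}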