Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel
And the same goes for what Dylan mentioned with multi-head latent attention. It's all about reducing memory usage during inference, and the same thing during training, by using some fancy low-rank approximation math.
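To make the low-rank idea concrete, here is a minimal sketch of how caching a small latent vector per token, instead of full keys and values, can cut memory at inference time. It is a toy illustration under stated assumptions, not DeepSeek's implementation: the sizes, the matrix names (`w_down`, `w_up_k`, `w_up_v`), and the helper functions are all hypothetical, and it leaves out details of the real design such as how positional information is handled.

```python
import random

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def rand_matrix(rows, cols, rng):
    """Build a rows x cols matrix of random floats (stand-in for learned weights)."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]

# Hypothetical toy sizes, chosen only for illustration.
d_model, n_heads, head_dim, d_latent, seq_len = 16, 4, 4, 3, 5

rng = random.Random(0)
w_down = rand_matrix(d_model, d_latent, rng)             # compress hidden state
w_up_k = rand_matrix(d_latent, n_heads * head_dim, rng)  # expand latent to keys
w_up_v = rand_matrix(d_latent, n_heads * head_dim, rng)  # expand latent to values

hidden = rand_matrix(seq_len, d_model, rng)  # one row per token

# Only this small latent is cached per token.
latent_cache = matmul(hidden, w_down)

# Keys and values are rebuilt from the latent when attention needs them.
keys = matmul(latent_cache, w_up_k)
values = matmul(latent_cache, w_up_v)

# Floats stored per sequence: full K and V cache versus the latent cache.
full_cache_floats = seq_len * 2 * n_heads * head_dim
latent_cache_floats = seq_len * d_latent

print(full_cache_floats, latent_cache_floats)  # prints: 160 15
```

Because the keys and values are products of a `d_latent`-wide matrix and an up-projection, their rank is at most `d_latent`. That is the low-rank approximation: the model trades some expressiveness in the keys and values for a much smaller cache.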