Lex Fridman Podcast

#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

2184.685 - 2198.093 Nathan Lambert

It's nowhere close to what a brain architecture is, but different portions of the model activate, right? You'll have a set number of experts in the model and a set number that are activated each time. And this dramatically reduces both your training and inference costs.

💬 0

Comments

There are no comments yet.

Back to full episode

Lex Fridman Podcast

#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters

Comments

Login Required