Lex Fridman Podcast
#459 – DeepSeek, China, OpenAI, NVIDIA, xAI, TSMC, Stargate, and AI Megaclusters
Dylan Patel
So they're doing this batch processing where not all of the prompts are exactly the same, which is really complex handling. And then as context length gets longer, there's this, I think they call it critical batch size, where your ability to serve, meaning how much you can parallelize your inference, plummets because of this long context.
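The dynamic Dylan describes here can be made concrete with a back-of-the-envelope sketch: each concurrent request holds a KV cache that grows linearly with its context length, so the number of requests you can batch together on one accelerator falls as contexts get longer. All numbers below (GPU memory, weight size, layer and head counts) are hypothetical assumptions for illustration, not figures from the episode.

```python
# Illustrative sketch: KV-cache memory limits the servable batch size
# as context length grows. All hardware/model numbers are assumptions.

GPU_MEMORY_GB = 80   # assumed accelerator memory
WEIGHTS_GB = 40      # assumed memory consumed by model weights

# Assumed per-token KV-cache footprint:
# 2 tensors (K and V) * 2 bytes (fp16) * 32 layers * 8 KV heads * head dim 128
KV_BYTES_PER_TOKEN = 2 * 2 * 32 * 8 * 128  # = 131072 bytes (~128 KiB/token)

def max_batch_size(context_len: int) -> int:
    """Largest number of concurrent requests whose KV caches fit in free memory."""
    free_bytes = (GPU_MEMORY_GB - WEIGHTS_GB) * 1024**3
    kv_bytes_per_request = context_len * KV_BYTES_PER_TOKEN
    return free_bytes // kv_bytes_per_request

for ctx in (1_000, 8_000, 32_000, 128_000):
    print(f"context {ctx:>7} tokens -> max batch ~{max_batch_size(ctx)}")
```

Under these assumed numbers, going from a 1K-token context to a 128K-token context shrinks the servable batch by roughly two orders of magnitude, which is the "plummeting" parallelism being described.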