Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

4380.668 - 4397.863 Arvid Lunnemark

Yeah. So, basically, the size of your KV cache is the size of your prompts multiplied by the number of prompts being processed in parallel. So you could increase either of those dimensions, right? The batch size or the size of your prompts, without degrading the latency of generating tokens.
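
To make the arithmetic concrete, here is a rough sketch of how KV cache size scales with those two dimensions. It assumes a standard transformer decoder that stores one key and one value vector per token, per layer, per KV head. The function name `kv_cache_bytes` and the model dimensions in the example are illustrative assumptions, not something stated in the conversation.

```python
def kv_cache_bytes(
    batch_size: int,          # number of prompts processed in parallel
    seq_len: int,             # tokens per prompt (assumed equal across the batch)
    n_layers: int,            # hypothetical model depth
    n_kv_heads: int,          # hypothetical number of key/value heads
    head_dim: int,            # hypothetical dimension of each head
    bytes_per_elem: int = 2,  # 2 bytes per element, e.g. fp16 or bf16
) -> int:
    """Approximate KV cache size in bytes for one batch."""
    # The factor 2 accounts for storing both a key and a value per token.
    per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem
    return batch_size * seq_len * per_token


# Illustrative (made-up) model: 32 layers, 8 KV heads, head dimension 128.
base = kv_cache_bytes(batch_size=4, seq_len=2048,
                      n_layers=32, n_kv_heads=8, head_dim=128)
print(base / 2**30)  # 1.0 (GiB)

# Doubling either the batch size or the prompt length doubles the cache.
double_batch = kv_cache_bytes(8, 2048, 32, 8, 128)
double_prompt = kv_cache_bytes(4, 4096, 32, 8, 128)
print(double_batch == double_prompt == 2 * base)  # True
```

Because the cache grows linearly in both the batch size and the prompt length, anything that shrinks the per-token cost leaves room to grow either dimension within the same memory budget.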
