Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI
Aman Sanger
With grouped-query attention, you instead preserve all the query heads, and there are fewer heads for the keys and values, but you're not reducing it to just one. But anyway, the whole point here is you're just reducing the size of your KV cache.
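To make the size argument concrete, here is a rough sketch of how the KV cache scales with the number of key/value heads. The model shape (32 layers, 32 query heads, head dimension 128, 4096 tokens, 2 bytes per element) is a made-up example, not something stated in the conversation, and the helper names `kv_cache_bytes` and `kv_head_for_query_head` are hypothetical.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len,
                   batch_size=1, bytes_per_elem=2):
    """Bytes needed to cache keys and values for every layer and position."""
    # Factor of 2: one cached tensor for keys, one for values.
    return (2 * num_layers * num_kv_heads * head_dim
            * seq_len * batch_size * bytes_per_elem)


def kv_head_for_query_head(query_head, num_query_heads, num_kv_heads):
    """Index of the shared key/value head that a given query head reads from."""
    assert num_query_heads % num_kv_heads == 0
    group_size = num_query_heads // num_kv_heads
    return query_head // group_size


# Hypothetical model shape, chosen only for illustration.
NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN = 32, 32, 128, 4096

mha = kv_cache_bytes(NUM_LAYERS, NUM_QUERY_HEADS, HEAD_DIM, SEQ_LEN)  # one KV head per query head
gqa = kv_cache_bytes(NUM_LAYERS, 8, HEAD_DIM, SEQ_LEN)                # 8 shared KV heads
mqa = kv_cache_bytes(NUM_LAYERS, 1, HEAD_DIM, SEQ_LEN)                # a single KV head

for name, size in [("multi-head", mha), ("grouped-query", gqa), ("multi-query", mqa)]:
    print(f"{name:>14}: {size / 2**20:.0f} MiB")
```

Under these assumed numbers the query heads are untouched in all three cases; only the number of cached key/value heads changes, so the grouped-query cache is 4x smaller than the multi-head one and the multi-query cache is 32x smaller.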