Lex Fridman Podcast

#447 – Cursor Team: Future of Programming with AI

4268.613 - 4295.882 Sualeh Asif

there are fewer of them, but maybe the theory is that you actually want each of the keys and values to be different. So one way to reduce the size is to keep one big shared vector for all the keys and values, and then have smaller vectors for every single token, so that you only store the smaller thing. It's some sort of low-rank reduction.
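
The idea described here can be sketched roughly as follows. This is an illustrative toy, not the speaker's or any library's implementation: the class `LowRankKVCache`, its methods, and the sizes are hypothetical names chosen for the example. It assumes the "big shared" part is a pair of projection matrices stored once, and that each token contributes only a small latent vector from which its key and value are rebuilt when needed. In a real model those latents would presumably come from a learned down-projection of the hidden state; here they are passed in directly.

```python
import random


def matvec(latent, shared):
    """Multiply a latent row vector (length r) by a shared r x d matrix."""
    cols = len(shared[0])
    return [
        sum(latent[i] * shared[i][j] for i in range(len(latent)))
        for j in range(cols)
    ]


class LowRankKVCache:
    """Toy cache: one shared pair of matrices plus a small vector per token.

    All names here are hypothetical; this only illustrates the storage idea.
    """

    def __init__(self, d_model, d_latent, seed=0):
        rng = random.Random(seed)
        self.d_model = d_model
        self.d_latent = d_latent
        # Shared across every token, so stored exactly once.
        # Random values stand in for learned weights (an assumption).
        self.up_k = [[rng.gauss(0, 1) for _ in range(d_model)]
                     for _ in range(d_latent)]
        self.up_v = [[rng.gauss(0, 1) for _ in range(d_model)]
                     for _ in range(d_latent)]
        # The only per-token state: one small latent vector each.
        self.latents = []

    def append(self, latent):
        """Cache the small latent vector for one new token."""
        if len(latent) != self.d_latent:
            raise ValueError("latent has the wrong size")
        self.latents.append(list(latent))

    def keys(self):
        """Rebuild full-size keys on the fly from the cached latents."""
        return [matvec(c, self.up_k) for c in self.latents]

    def values(self):
        """Rebuild full-size values on the fly from the cached latents."""
        return [matvec(c, self.up_v) for c in self.latents]

    def floats_stored_per_token(self):
        return self.d_latent


def full_cache_floats_per_token(d_model):
    """An uncompressed cache stores one key and one value per token."""
    return 2 * d_model
```

With `d_model=8` and `d_latent=2`, the toy cache keeps 2 numbers per token instead of 16, at the cost of an extra multiplication whenever keys and values are read back. The reconstructed keys and values are limited to rank `d_latent`, which is the "low-rank" part of the trade-off.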
