Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI
Aman Sanger
The interesting thing here is that this now has no effect on time to first token, the pre-fill speed. What it matters for is generating tokens. And why is that? Because when you're generating tokens, instead of... being bottlenecked by doing these super-parallelizable matrix multiplies across all your tokens.
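To make the pre-fill versus generation contrast concrete, here is a rough sketch, not something stated in the conversation. It estimates the arithmetic intensity (FLOPs per byte moved) of one weight matrix multiply when many prompt tokens are processed at once, compared with a single token during generation. The function name, the 4096-wide layer, the 2048-token prompt, and the 2-byte values are all hypothetical choices for illustration, and the byte count ignores caching effects.

```python
def matmul_arithmetic_intensity(num_tokens, d_in, d_out, bytes_per_value=2):
    """FLOPs per byte for a (num_tokens x d_in) @ (d_in x d_out) multiply.

    Simplified model (an assumption): each input, weight, and output value
    is moved to or from memory exactly once.
    """
    flops = 2 * num_tokens * d_in * d_out  # one multiply and one add per term
    bytes_moved = bytes_per_value * (
        num_tokens * d_in      # read activations
        + d_in * d_out         # read weights
        + num_tokens * d_out   # write outputs
    )
    return flops / bytes_moved


# Pre-fill: all prompt tokens go through the layer in one parallel multiply.
prefill_intensity = matmul_arithmetic_intensity(num_tokens=2048, d_in=4096, d_out=4096)

# Generation: one new token at a time, so the same weights are read
# for far less compute.
decode_intensity = matmul_arithmetic_intensity(num_tokens=1, d_in=4096, d_out=4096)

print(f"pre-fill: {prefill_intensity:.1f} FLOPs/byte")
print(f"decode:   {decode_intensity:.2f} FLOPs/byte")
```

Under these assumptions the pre-fill multiply does roughly a thousand times more work per byte moved than the single-token one, which is one way to see why the two phases tend to hit different bottlenecks.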