Lex Fridman Podcast

#447 – Cursor Team: Future of Programming with AI

7322.792 - 7345.104 Aman Sanger

So what people do in all these papers is sample a bunch of outputs from the language model, then use the process reward models to grade all those generations, maybe alongside some other heuristics, and then use that to choose the best answer. The really interesting thing that people think might work, and that people want to work, is tree search with these process reward models.
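[Editor's illustration, not part of the spoken transcript.] The sample-then-grade procedure described above can be sketched roughly as follows. This is a minimal toy under stated assumptions: `best_of_n`, `aggregate_step_scores`, and `toy_step_scorer` are hypothetical names, the "process reward model" is a keyword-based stand-in rather than a trained model, and taking the minimum over step scores is only one possible way to combine them.

```python
from typing import Callable, List, Sequence

Steps = List[str]  # one candidate answer, written as a list of reasoning steps


def aggregate_step_scores(step_scores: Sequence[float]) -> float:
    """Collapse per-step scores into one number.

    Assumption: use the minimum, so one bad step sinks the whole candidate.
    """
    return min(step_scores) if step_scores else 0.0


def best_of_n(candidates: Sequence[Steps],
              step_scorer: Callable[[Steps], float]) -> Steps:
    """Grade every sampled candidate step by step and return the best one."""
    if not candidates:
        raise ValueError("need at least one candidate")

    def total(candidate: Steps) -> float:
        # Score each prefix, i.e. each step in the context of the steps before it.
        scores = [step_scorer(candidate[: i + 1]) for i in range(len(candidate))]
        return aggregate_step_scores(scores)

    return max(candidates, key=total)


def toy_step_scorer(prefix: Steps) -> float:
    """Hypothetical stand-in for a process reward model."""
    return 0.1 if "guess" in prefix[-1] else 0.9


# Stand-in for "sample a bunch of outputs from the language model".
candidates = [
    ["factor the equation", "guess a root", "report x = 3"],
    ["factor the equation", "solve each factor", "report x = 2"],
]
best = best_of_n(candidates, toy_step_scorer)
print(best[-1])
```

In a real setup the candidates would come from sampling a language model and the scorer would be a trained model; tree search would differ by using the step scores to decide which partial answers to extend, rather than only ranking finished ones.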
