Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI
Aman Sanger
So if you're doing some math problem, let's look at the final thing you've done, everything, and let's assign a grade to it: what's the reward for this outcome? Process reward models instead try to grade the chain of thought.
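As a rough illustration of the distinction drawn here, the sketch below contrasts grading only the final result with grading each step of the chain of thought. It is a toy, not how any real reward model is built: the function names (`outcome_reward`, `process_rewards`, `toy_grade_final`, `toy_grade_step`) and the arithmetic checker are hypothetical stand-ins for what would be learned models in practice.

```python
import operator
from typing import Callable, List

# Hypothetical toy graders; in practice these would be learned models.
_OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def toy_grade_step(previous_steps: List[str], step: str) -> float:
    """Score one step of the form 'a op b = c': 1.0 if the arithmetic holds, else 0.0."""
    lhs, rhs = step.split("=")
    a, op, b = lhs.split()
    return 1.0 if _OPS[op](int(a), int(b)) == int(rhs) else 0.0


def toy_grade_final(final_answer: str, expected: str) -> float:
    """Score only the final answer against a known expected value."""
    return 1.0 if final_answer.strip() == expected else 0.0


def outcome_reward(final_answer: str, grade_final: Callable[[str], float]) -> float:
    """Outcome-style grading: a single score for the finished attempt."""
    return grade_final(final_answer)


def process_rewards(
    steps: List[str], grade_step: Callable[[List[str], str], float]
) -> List[float]:
    """Process-style grading: one score per step, given the steps before it."""
    return [grade_step(steps[:i], step) for i, step in enumerate(steps)]


# A two-step attempt at (2 + 3) * 4 with a mistake in the second step.
steps = ["2 + 3 = 5", "5 * 4 = 21"]
final_answer = "21"

outcome = outcome_reward(final_answer, lambda ans: toy_grade_final(ans, "20"))
per_step = process_rewards(steps, toy_grade_step)

print(outcome)   # one number for the whole attempt
print(per_step)  # one number per step
```

The outcome score only says the attempt failed, while the per-step scores point at which step went wrong.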