Lex Fridman Podcast
#447 – Cursor Team: Future of Programming with AI
Aman Sanger
But no, it actually may work if the language model has a much easier time verifying some solution than it does generating it. Then you actually could perhaps get this kind of recursive loop. I don't think it's going to look exactly like that. The other thing you could do, and what we kind of do, is a little bit of a mix of RLAIF and RLHF, where usually the model is actually quite correct.