Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

7946.505 - 7966.463 Arvid Lundmark

Yeah, so RLHF is when the reward model you use is trained from some labels you've collected from humans giving feedback. I think this works if you have the ability to get a ton of human feedback for this kind of task that you care about.
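
To make the idea of a reward model "trained from labels collected from humans" concrete, here is a minimal sketch. The speaker does not say what form the feedback takes. The sketch assumes pairwise preferences (a labeler picks the better of two responses), which is one common setup, and fits a Bradley-Terry style logistic model. The linear reward, the two-feature vectors, the toy data, and the names `reward` and `train_reward_model` are all hypothetical and chosen only for illustration.

```python
import math

def reward(w, x):
    """Linear reward model: score = w . x (a stand-in for a neural network)."""
    return sum(wi * xi for wi, xi in zip(w, x))

def train_reward_model(pairs, dim, lr=0.1, epochs=200):
    """Fit w so that human-preferred items score higher.

    pairs: list of (preferred_features, rejected_features), one per human label.
    Uses the Bradley-Terry assumption: P(preferred beats rejected) = sigmoid(margin).
    """
    w = [0.0] * dim
    for _ in range(epochs):
        for good, bad in pairs:
            margin = reward(w, good) - reward(w, bad)
            p = 1.0 / (1.0 + math.exp(-margin))
            # Gradient ascent on the log-likelihood of the human label:
            # d/dw log sigmoid(margin) = (1 - p) * (good - bad)
            for i in range(dim):
                w[i] += lr * (1.0 - p) * (good[i] - bad[i])
    return w

# Hypothetical toy labels: each response is a 2-feature vector, and the
# labelers consistently preferred the response with the larger first feature.
pairs = [
    ([1.0, 0.2], [0.1, 0.3]),
    ([0.9, 0.8], [0.2, 0.9]),
    ([0.7, 0.1], [0.3, 0.0]),
]
w = train_reward_model(pairs, dim=2)
```

After fitting, `reward(w, x)` can score responses no human has labeled, and a reinforcement learning step would optimize against that score. The sketch also hints at the caveat in the quote: the model is only as good as the volume and coverage of the human labels behind it.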
