Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

7946.505 - 7966.463 Arvid Lundmark

Yeah, so RLHF is when the reward model you use is trained from some labels you've collected from humans giving feedback. I think this works if you have the ability to get a ton of human feedback for this kind of task that you care about.
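
As a rough illustration of "a reward model trained from labels collected from humans," here is a minimal sketch. It assumes the human feedback comes as pairwise preferences (one response chosen over another) and fits a linear reward with a Bradley-Terry style logistic loss. None of this is stated by the speaker: the pairwise setup, the linear model, the feature names (`is_correct`, `verbosity`), and the function names are all hypothetical.

```python
import math

def train_reward_model(preferences, n_features, lr=0.5, epochs=200):
    """Fit linear reward weights from (chosen, rejected) feature-vector pairs.

    Assumption: each human label says 'chosen' was preferred over 'rejected'.
    """
    w = [0.0] * n_features
    for _ in range(epochs):
        for chosen, rejected in preferences:
            diff = [c - r for c, r in zip(chosen, rejected)]
            margin = sum(wi * di for wi, di in zip(w, diff))
            # Modeled probability that the human prefers 'chosen' (Bradley-Terry).
            p = 1.0 / (1.0 + math.exp(-margin))
            # Gradient ascent on the log-likelihood of the human label.
            for i in range(n_features):
                w[i] += lr * (1.0 - p) * diff[i]
    return w

def reward(w, features):
    """Score a response with the learned linear reward."""
    return sum(wi * fi for wi, fi in zip(w, features))

# Hypothetical human labels; features are [is_correct, verbosity].
preferences = [
    ([1.0, 0.2], [0.0, 0.9]),
    ([1.0, 0.8], [0.0, 0.1]),
    ([1.0, 0.5], [0.0, 0.5]),
]
w = train_reward_model(preferences, n_features=2)
```

In a full RLHF setup, a learned reward like this would then be used as the training signal for the policy model. The sketch also makes the speaker's caveat concrete: the quality of `w` depends entirely on how many preference labels are available for the task.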
