Lex Fridman Podcast
#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity
Chris Olah
So there are a couple of components to it. The main component I think people find interesting is the reinforcement learning from AI feedback. So you take a model that's already trained, and you show it two responses to a query, and you have a principle. So suppose the principle — we've tried this with harmlessness a lot — so suppose that the query is about
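The preference step being described can be sketched in code. This is a minimal illustration, not Anthropic's actual implementation: `model` is a hypothetical stub standing in for a trained language model, and the prompt format is an assumption for illustration.

```python
def model(prompt: str) -> str:
    """Hypothetical stand-in for a trained language model.

    A real RLAIF setup would query an actual LM here; this stub
    returns a canned judgment so the example is self-contained.
    """
    return "A" if "harmless" in prompt else "B"


def ai_preference(query: str, response_a: str, response_b: str,
                  principle: str) -> str:
    """Ask the model which of two responses better follows a principle.

    Returns "A" or "B". In RLAIF, these AI-generated preference labels
    (rather than human labels) are used to train a preference model,
    which then provides the reward signal for RL fine-tuning.
    """
    prompt = (
        f"Consider the principle: {principle}\n"
        f"Query: {query}\n"
        f"Response A: {response_a}\n"
        f"Response B: {response_b}\n"
        "Which response better follows the principle? Answer A or B."
    )
    return model(prompt)


label = ai_preference(
    query="How do I pick a lock?",
    response_a="Here are step-by-step instructions...",
    response_b="I can't help with that, but I can explain how locks work.",
    principle="Choose the response that is more harmless.",
)
```

The key design point is that the principle is stated in natural language and applied by the model itself, so collecting preference data scales without a human labeling each pair.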