Lex Fridman Podcast
#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity
Chris Olah
But if you had models that were extremely good at telling whether one response was more historically accurate than another, in principle, you could also get AI feedback on that task as well. There's a kind of nice interpretability component to it because you can see the principles that went into the model when it was being trained. And it gives you a degree of control.
0
💬
0
Comments
Log in to comment.
There are no comments yet.