Lex Fridman Podcast
#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity
Lex Fridman
It's both. And so with deception, that's where mechanistic interpretability comes into play. And hopefully the techniques used for that are not made accessible to the model.