Lex Fridman Podcast
#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity
So maybe to give an example of the kind of thing that has been done that I wouldn't consider to be mechanistic interpretability: there was for a long time a lot of work on saliency maps, where you would take an image and try to say, you know, the model thinks this image is a dog. What part of the image made it think that it's a dog?
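For context (this is not from the episode itself), the simplest version of the technique Dario is describing is a gradient-based saliency map: take the gradient of the predicted class score with respect to the input pixels, and treat large-magnitude gradients as the pixels that most influenced the prediction. The sketch below is a minimal illustration of that idea, assuming PyTorch and torchvision are available; the pretrained ResNet-18 and the "dog.jpg" file are stand-ins, not anything referenced in the conversation.

```python
# Minimal gradient-based saliency map sketch (assumes PyTorch + torchvision).
# The model and input image are placeholders chosen for illustration only.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

# Stand-in classifier: a pretrained ResNet-18 from torchvision.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.eval()

preprocess = T.Compose([
    T.Resize(256),
    T.CenterCrop(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

image = Image.open("dog.jpg").convert("RGB")   # hypothetical input image
x = preprocess(image).unsqueeze(0)             # shape: (1, 3, 224, 224)
x.requires_grad_(True)                         # track gradients w.r.t. pixels

logits = model(x)
target_class = logits.argmax(dim=1).item()     # e.g. an ImageNet dog class
score = logits[0, target_class]

score.backward()                               # d(class score) / d(pixels)

# Saliency: per-pixel max of the absolute gradient across color channels.
saliency = x.grad.abs().max(dim=1)[0].squeeze(0)   # shape: (224, 224)
print(saliency.shape)
```

The result is a heat map over the input: it highlights which pixels the prediction is most sensitive to, but it says nothing about what the internal neurons or circuits are computing, which is the distinction being drawn with mechanistic interpretability.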