Lex Fridman Podcast

#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

17581.424 - 17601.822

The whole time you were trying to go and do this, gradient descent was actually behind the scenes, searching more efficiently than you could through the space of sparse models, learning whatever sparse model was most efficient, and then figuring out how to fold it down nicely to run conveniently on your GPU, which does, you know, nice dense matrix multiplies. And you just can't beat that.
