
Lex Fridman Podcast

#452 – Dario Amodei: Anthropic CEO on Claude, AGI & the Future of AI & Humanity

12175.599 - 12203.434 Chris Olah

One reason is just that you are training the model on exactly the task, with a lot of data that represents many different angles on what people prefer and disprefer in responses. I think there is a question of whether you are eliciting things from the pre-trained model or teaching it new things. And in principle, you can teach new things to models in post-training.
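The training-on-preferences setup described here is commonly formalized as learning from pairwise preference data, as in RLHF. The sketch below is a minimal, generic illustration of a Bradley-Terry-style pairwise preference loss; it is not Anthropic's pipeline, and the reward scores are hypothetical values standing in for a reward model's outputs on a preferred versus dispreferred response.

```python
import math

def pairwise_preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry pairwise loss: -log sigmoid(r_chosen - r_rejected).

    The loss is small when the model scores the human-preferred response
    higher than the dispreferred one, and large when it gets the order wrong.
    """
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Hypothetical reward scores for one preference pair.
print(pairwise_preference_loss(2.1, 0.4))  # small loss: preference respected
print(pairwise_preference_loss(0.4, 2.1))  # large loss: preference violated
```

Averaged over many such pairs covering "different angles" of preference, this kind of objective is what post-training optimizes, whether that ends up eliciting existing capabilities or genuinely teaching new behavior.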
