Colleague

The Journal.

What's the Worst AI Can Do? This Team Is Finding Out.

479.42

Okay, so I'm about to launch an eval. I'll type the command. This is a name for the model. And then the eval name. And I'm going to run a chemistry-based eval. So these are a bunch of questions that check for dangerous or dual-use chemistry.