Lex Fridman Podcast

#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America

2934.684 - 2954.71 Michael Truel

And so, for instance, one of the most popular agent benchmarks, SWE-bench, is really, really contaminated in the training data of these foundation models. And so if you ask these foundation models to do a SWE-bench problem, but you actually don't give them the context of a codebase, they can, like, hallucinate the right file paths, they can hallucinate the right function names.
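The contamination check described here could be sketched roughly as follows: ask the model for the files it would edit without showing it the repository, then compare its guesses with the files the reference patch actually touches. This is only an illustrative sketch, not how any benchmark or lab measures contamination. The names `files_in_patch`, `path_hit_rate`, and `EXAMPLE_PATCH` are hypothetical, the patch is made-up data, and the model's guesses are assumed to already be available as a list of strings.

```python
import re

# Made-up unified diff standing in for a benchmark task's reference patch.
EXAMPLE_PATCH = """\
diff --git a/pkg/core/utils.py b/pkg/core/utils.py
--- a/pkg/core/utils.py
+++ b/pkg/core/utils.py
@@ -1 +1 @@
-x = 1
+x = 2
"""


def files_in_patch(patch: str) -> set[str]:
    """Collect the file paths touched by a unified diff ('+++ b/<path>' lines)."""
    return set(re.findall(r"^\+\+\+ b/(\S+)", patch, flags=re.MULTILINE))


def path_hit_rate(predicted_paths: list[str], gold_patch: str) -> float:
    """Fraction of files in the reference patch that the model named
    even though it was never shown the repository."""
    gold = files_in_patch(gold_patch)
    if not gold:
        return 0.0
    # Normalize guesses such as './pkg/core/utils.py' to 'pkg/core/utils.py'.
    predicted = {p.strip().removeprefix("./") for p in predicted_paths}
    return len(gold & predicted) / len(gold)
```

A hit rate that stays high across many tasks, with no repository context provided, would be a hint that the model may have seen those tasks during training. A single match on its own proves little, since some paths are easy to guess.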
