Lex Fridman Podcast
#446 – Ed Barnhart: Maya, Aztec, Inca, and Lost Civilizations of South America
Michael Truel
And so, for instance, one of the most popular agent benchmarks, SweetBench, is really, really contaminated in the training data of these foundation models. And so if you ask these foundation models to do a sweet bench problem, but you actually don't give them the context of a code base, they can like hallucinate the right file pass, they can hallucinate the right function names.
0
💬
0
Comments
Log in to comment.
There are no comments yet.