The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch
20VC: AI Scaling Myths: More Compute is not the Answer | The Core Bottlenecks in AI Today: Data, Algorithms and Compute | The Future of Models: Open vs Closed, Small vs Large with Arvind Narayanan, Professor of Computer Science @ Princeton
Arvind Narayanan
A big part of it is this issue of vibes, right? You evaluate an LLM on these benchmarks, and it seems to perform really well on them, but then the vibes are off. In other words, you start using it and somehow it doesn't feel adequate. It makes a lot of mistakes in ways that are not captured in the benchmarks.