Leandro Fonvera
Podcast Appearances
So you can imagine it a little bit like GitHub, if you're familiar with GitHub, where people share code and everything's free.
So our job is not to make money. Our job is mostly to... to spend money. To spend money and build things that are very useful.
We've upped the exams a little bit, so now we're closer to PhD-level exams. And we can measure quite well how many of the questions a model gets right.
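As a rough illustration of what "how many of the questions a model gets right" means in practice, here is a minimal Python sketch. The exam questions, the `toy_model` stand-in, and the exact-match grading rule are all assumptions made for illustration; real benchmarks typically use more careful answer matching.

```python
from typing import Callable


def accuracy(questions: list[tuple[str, str]], model: Callable[[str], str]) -> float:
    """Fraction of questions answered correctly.

    Grading here is a simple case-insensitive exact match (an assumption).
    """
    if not questions:
        raise ValueError("need at least one question")
    correct = sum(
        model(prompt).strip().lower() == reference.strip().lower()
        for prompt, reference in questions
    )
    return correct / len(questions)


def toy_model(prompt: str) -> str:
    """Hypothetical stand-in for a real model: a fixed lookup table."""
    answers = {"2 + 2 = ?": "4", "Capital of France?": "paris"}
    return answers.get(prompt, "unknown")


# Hypothetical three-question exam as (prompt, reference answer) pairs.
exam = [
    ("2 + 2 = ?", "4"),
    ("Capital of France?", "Paris"),
    ("Speed of light in m/s?", "299792458"),
]
score = accuracy(exam, toy_model)
print(f"{score:.2%}")
```

Swapping `toy_model` for a function that calls an actual model would leave the scoring logic unchanged.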
Yeah, so those models are getting really good at solving certain kinds of questions. So, for example, these models can solve some math Olympiad questions.
Exactly. I'm also a physicist by training, and it takes practice to be good at those questions. Yeah.
Yeah. Capability-wise, we don't see any benchmarks showing that they have gaps in their knowledge.
So we test these models on something like exams. If those exams are already in the training data, naturally the models do much better.
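One common heuristic for spotting exam questions that may already be in the training data is to look for shared word n-grams. The Python sketch below is an assumption about how such a check could look, not the method the speaker's team uses; the function names, the sample texts, and the choice of `n = 5` are all hypothetical.

```python
def ngrams(text: str, n: int) -> set[tuple[str, ...]]:
    """All runs of n consecutive lowercase, whitespace-separated tokens."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def is_contaminated(question: str, training_docs: list[str], n: int = 5) -> bool:
    """True if the question shares at least one n-gram with any training document."""
    question_grams = ngrams(question, n)
    return any(question_grams & ngrams(doc, n) for doc in training_docs)


# Hypothetical training corpus containing one leaked exam question.
training_docs = [
    "a forum post that asks what is the derivative of x squared with respect to x today",
]

seen = is_contaminated(
    "What is the derivative of x squared with respect to x", training_docs
)
unseen = is_contaminated(
    "How many prime numbers are there below one hundred", training_docs
)
print(seen, unseen)
```

Questions flagged this way can be dropped from the exam before scoring, so the reported numbers reflect questions the model has not already seen.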
Exactly. Yeah. And we haven't seen any indication of that either.
Yeah. I mean, that's pretty much what we did.
So we're not like reverse engineering in the dark. We're actually more like following the recipe and translating their paper to code. And I think we're making good progress. So I think in a few weeks at the latest, we're going to have a pipeline that works, that people can use. And we're going to see if we get the same numbers.
Yeah, so I think that's something that we want to investigate a bit. So far, it's just a napkin calculation, but it's probably the right order of magnitude.
And I think one thing that people underappreciate is that an open model kind of levels the whole field, because everybody has access to the same level of knowledge. So everybody can immediately build on top of that.