Zico Kolter
Podcast Appearances
If you train a generative model on those exact same images, generate more synthetic data from that generative model, and then train on that synthetic data, you don't do that much better, but you do a little bit better. And that's just wild. What that means is that with our current algorithms, we are not yet maximally extracting the information from the data we have.
And there are way more deductions and inferences and other processes that we can apply to our current data to provide more value. And as models get bigger and better, they arguably can kind of do this themselves, either through synthetic data or through different mechanisms by which we train these models.
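The loop described here — fit a generative model to a fixed dataset, sample synthetic data from it, and retrain on the combination — can be sketched in a few lines. The example below is only an illustration of that idea, not the setup discussed in the interview: it stands in a Gaussian mixture for the generative model and scikit-learn's digits dataset for the images, so the model choices, sample counts, and any observed gain are assumptions.

```python
# Illustrative sketch: fit a simple generative model to fixed training data,
# sample synthetic examples from it, and retrain on real + synthetic data.
# All specifics (Gaussian mixture, digits dataset, sample sizes) are stand-ins.
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.mixture import GaussianMixture
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Baseline: train directly on the real data only.
baseline = LogisticRegression(max_iter=5000).fit(X_train, y_train)
print("real data only:", baseline.score(X_test, y_test))

# Fit one generative model per class, then sample synthetic examples from each.
X_syn, y_syn = [], []
for label in np.unique(y_train):
    gmm = GaussianMixture(n_components=3, covariance_type="diag", random_state=0)
    gmm.fit(X_train[y_train == label])
    samples, _ = gmm.sample(200)  # synthetic points for this class
    X_syn.append(samples)
    y_syn.append(np.full(len(samples), label))
X_aug = np.vstack([X_train] + X_syn)
y_aug = np.concatenate([y_train] + y_syn)

# Retrain on real + synthetic data; as described above, any gain tends to be small.
augmented = LogisticRegression(max_iter=5000).fit(X_aug, y_aug)
print("real + synthetic:", augmented.score(X_test, y_test))
```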
I don't really know, to be honest. I think this is a major open question right now in research. How do we extract the maximal information content from the data that we have? But again, as I said, I don't think we're close to even extracting all the data that's available.
When I look at this landscape, and we know that we aren't close to extracting the maximal information in the closure set of all the data that we have available... And we have not come close to processing all the data that's available to us. The idea that somehow this is a recipe for models plateauing in performance just doesn't jibe to me with the reality of what we see.
We have not yet reached an equilibrium point where we have a good sense of the steady state of model size, for what application, how it's being used, and whether it's being used as a general-purpose system or for a very specific reason. This is all still being figured out right now. What I will say is that I use these models very regularly for my daily work.
I work almost exclusively with the largest models that are available to me because it just works better. And when I don't have a given task that I'm doing over and over, when I want to have that generality, I want to work with the larger models that are available.
The notion of small language models and this kind of stuff, and again, I think this might very much be a possibility in the future, kind of comes after we reach this point of generality, right? So once we've done something enough and we realize, okay, there is still a small task we want to do many, many times, maybe before we would have used a custom-trained machine learning model for this. But the idea is that once you have a rote task that you're repeating again and again enough times, and you know a small model can do it, it probably does become valuable to specialize a small model for that task only.
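One common way to do that specialization, sketched below under assumed specifics, is a simple distillation: use the large, general model to label the rote task, train a small model on those labels, and then serve only the small model for that task. Everything concrete here (a random forest standing in for the large model, a shallow decision tree for the small one, the digits dataset) is illustrative, not the speaker's method.

```python
# Illustrative sketch of specializing a small model for one repeated task.
# The "large general model" is stood in by a big random forest and the
# specialized model by a small decision tree; dataset and sizes are assumptions.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Stand-in for the large general-purpose model you would reach for at first.
large = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_train, y_train)

# Once the task is rote and repeated often enough, distill it: train a small
# model on the large model's outputs for this task, then deploy the small model.
pseudo_labels = large.predict(X_train)
small = DecisionTreeClassifier(max_depth=10, random_state=0).fit(X_train, pseudo_labels)

print("large model accuracy:", large.score(X_test, y_test))
print("specialized small model accuracy:", small.score(X_test, y_test))
```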