Zico Kolter
Podcast Appearances
I think that it's been evolving so quickly between recent releases of open source models and continued progress in the closed source models. There was also this proliferation early on of a lot of open source models, none of which was clearly better than the others, and they just involved a lot of training for companies, for lack of a better word, to just demonstrate that they could do it too.
It's not clear that's a valuable thing. Why would you want to train your own language model from scratch if there are very good open source ones now? Will that continue? Maybe, maybe not. I think there will most likely be consolidation, but I'm not quite sure how it will play out.
I do think that there are a lot of companies right now thinking about training their own models and things like this, where it's just sort of the default that, of course, you would do this. But this won't be an economically viable thing to do in the future, and so it won't happen anymore.
I'm not really sure what the rationale is for saying that we've plateaued in the compute sense. Most scaling laws that I've seen certainly suggest it can keep going. It's more expensive. You could argue that just scaling may not be the most efficient way to achieve better results, and I actually think that's very likely true.
There are other, better ways, you could argue, to achieve the same level of improvement than compute, but compute still does seem to (a) be a major factor and (b) still improve things. It's more a calculus about the monetary trade-offs of how much models will cost at inference time, how much they cost to train, and all this kind of stuff.
These are, I would say, becoming much more practical concerns than concerns about the actual limits of scaling.
This question is, I think, super interesting. And those are not actually mutually exclusive, to be clear. One thing I'll also say is that the term AGI is thrown around a whole lot. I define AGI as a system that acts functionally equivalent to a close collaborator of yours over the course of about a year-long project.