The Twenty Minute VC (20VC): Venture Capital | Startup Funding | The Pitch
20VC: AI Scaling Myths: More Compute is not the Answer | The Core Bottlenecks in AI Today: Data, Algorithms and Compute | The Future of Models: Open vs Closed, Small vs Large with Arvind Narayanan, Professor of Computer Science @ Princeton
Wed, 28 Aug 2024
Arvind Narayanan is a professor of Computer Science at Princeton and the director of the Center for Information Technology Policy. He is a co-author of the book AI Snake Oil and a prominent critic of the AI scaling myths around the importance of just adding more compute. He is also the lead author of a textbook on the computer science of cryptocurrencies, which has been used in over 150 courses around the world, and an accompanying Coursera course that has had over 700,000 learners.
In Today's Episode with Arvind Narayanan We Discuss:
1. Compute, Data, Algorithms: What is the Bottleneck? Why does Arvind disagree with the commonly held notion that more compute will result in an equal and continuous level of model performance improvement? Will we continue to see players move into the compute layer in their need to internalise the margin? What does that mean for Nvidia? Why does Arvind not believe that data is the bottleneck? How does Arvind analyse the future of synthetic data? Where is it useful? Where is it not?
2. The Future of Models: Does Arvind agree that this is the fastest commoditization of a technology he has seen? How does Arvind analyse the future of the model landscape? Will we see a world of a few very large models or a world of many unbundled and verticalised models? Where does Arvind believe the most value will accrue in the model layer? Is it possible for smaller companies or university research institutions to even play in the model space given the intense cash needed to fund model development?
3. Education, Healthcare and Misinformation: When AI Goes Wrong: What are the single biggest dangers that AI poses to society today? To what extent does Arvind believe misinformation through generative AI is going to be a massive problem in democracies? How does Arvind analyse AI impacting the future of education? What does he believe everyone gets wrong about AI and education? Does Arvind agree that AI will be able to put a doctor in everyone's pocket? Where does he believe this theory is weak and falls down?
we're not going to have too many more cycles, possibly zero more cycles of a model that's almost an order of magnitude bigger in terms of the number of parameters than what came before and thereby more powerful. And I think a reason for that is data becoming a bottleneck. These models are already trained on essentially all of the data that companies can get their hands on.
So while data is becoming a bottleneck, I think more compute still helps, but maybe not as much as it used to.
This is 20VC with me, Harry Stebbings, and we're sitting down today with one of my favorite writers in AI. He's been a big proponent of the belief that, despite what many people think, increasing the amount of compute from this point is unlikely to significantly increase model performance moving forward.
I'm thrilled to welcome Arvind Narayanan, Professor of Computer Science at Princeton and the Director of the Center for Information Technology Policy. This is an incredible discussion that goes very deep on the bottlenecks in AI today, and you can watch it on YouTube by searching for 20VC.
But before we dive in today, all of you listening use tons of software every day. Sometimes it fills us with rage. You can't figure something out. The chatbot in the bottom right is useless. You keep getting bombarded with these useless pop-ups. And for those of you who build products, no one wants their product to feel like this. Thankfully, a company exists to help users without annoying them.
CommandBar. It does a couple of very helpful things. First, it's a chatbot that uses AI to give users extremely personalized responses and deflect tickets. But it goes beyond just text. It can also co-browse with the user and show them how to do things inside the UI. Magic.
But it can also detect when users would benefit from a proactive nudge, like a helpful hint, or an invitation to start a free trial. CommandBar is already used by world-class companies like Gusto, HashiCorp, Yotpo, and AngelList. If you're a product, CX, or marketing leader, check them out at commandbar.com slash harry.
And talking about incredible companies with CommandBar, I want to talk to you about a new venture fund making waves by taking a very different approach. It's a public venture fund anyone can invest in, not just institutions and accredited investors. The Fundrise Innovation Fund is democratizing venture capital, which could have big consequences for the industry.
The fund is already off to a good start with $100 million into some of the largest, most in-demand AI and data infrastructure companies. Companies like OpenAI, Anthropic, and Databricks. Check out the Innovation Fund's impressive list of investments for yourself by visiting fundrise.com slash 20VC.
Carefully consider the investment material before investing, including objectives, risks, charges and expenses. This and other information can be found in the Innovation Fund's prospectus at fundrise.com slash innovation. This is a paid sponsorship. And finally, let's talk about Squarespace. Squarespace is the all-in-one website platform for entrepreneurs to stand out and succeed online.
Whether you're just starting out or managing a growing brand, Squarespace makes it easy to create a beautiful website, engage with your audience, and sell anything from products to content, all in one place, all on your terms. What's blown me away is the Squarespace Blueprint AI and SEO tools.
It's like crafting your site with a guided system, ensuring it not only reflects your unique style, but also ranks well on search engines. Plus, their flexible payment options cater to every customer's needs, making transactions smooth and hassle-free. And the Squarespace AI? It's a content wizard helping you whip up text that truly resonates with your brand voice.
So if you're ready to get started, head to squarespace.com for a free trial. And when you're ready to launch, go to squarespace.com slash 20VC and use the code 20VC to save 10% off your first purchase of a website or domain.
You have now arrived at your destination. Arvind, I am so excited for this, dude. I was telling you just now, I am one of your biggest fans on the Substack newsletter. I can't wait for the book. So thank you so much for joining me today. Thank you. I really appreciate that.
I'm super excited for this conversation.
Now, I want to get pretty much straight into it, but for those that don't read the Substack, which they should do, can you just provide a 60-second intro, some context on why you're so well-versed to speak on the topics that we're discussing today?
Sure. So, I'm a professor of computer science, and I would say I do three things. One is technical AI research, and another is understanding the societal effects of AI, and the third is advising policymakers.
I would just love to start, because you have done a lot of work in the cryptocurrency field. Before we dive in deep on infrastructure: how does the AI hype today compare to Bitcoin hype? How is it the same and how is it different?
So I spent years of my time on this. I really believed that decentralization could have tremendous societal impacts. How is this going to make society better? It was not the money angle. But by around 2018, I had started to get really disillusioned. And that was because of a couple of main things.
One is, in a lot of cases where I had thought crypto or blockchain was going to be the solution, I realized that that was not the case. While there is potential for crypto to help the world's unbanked, the tech is not the real bottleneck there. And the other part of it was just a philosophical aspect of this community.
I do believe that many of our institutions are in need of reform or maybe decentralization, whatever it is. And that includes academia, by the way, so many reforms so badly needed. And in an ideal world, we would have this, you know, hard but important conversation about how you fix our institutions. But instead, these students have been sold on blockchain and they want to replace
these institutions with a script. And that just didn't seem like the right approach to me. So both from a technical perspective, and from a philosophical perspective, I really soured on it. While there are harms around AI, I think it has been a net positive for society. I can't say the same thing about Bitcoin. Are we in an AI hype cycle right now? I think that's possible.
Generative AI companies specifically made some serious mistakes in the last year or two about how they went about things. What mistakes did they make, Arvind? So when ChatGPT was released, people found, you know, a thousand new applications for it, right? Applications that OpenAI might not have anticipated. And that was great.
But I think developers, AI developers, took the wrong lesson from this. They thought that AI is so powerful and so special that you can just put these models out there and people will figure out what to do with them. They didn't think about actually building products, making things that people want, finding product market fit, and all those things that are so basic in tech.
But somehow, AI companies deluded themselves into thinking that the normal rules don't apply here.
I do want to ask, and we'll start with kind of the hardest question of all, but it's the most important, and you've written about this, and I loved your piece.
The kind of core question that everyone's asking right now is, does more compute equal an increased level of performance, or have we reached a point where it is misaligned and more compute will not create that significant spike in performance? Kevin Scott at Microsoft says, absolutely, we have a lot more room to run.
Why are you skeptical, and have we gotten to a stage of diminishing returns on compute?
So if we look at what's happened historically, the way in which compute has improved model performance is with companies building bigger models. In my view, at least, the biggest thing that changed between GPT-3.5 and GPT-4 was the size of the model. And it was also trained with more data, presumably, although they haven't made the details of that public, and with more compute and so forth.
So I think that's running out. We're not going to have too many more cycles, possibly zero more cycles of a model that's almost an order of magnitude bigger in terms of the number of parameters than what came before and thereby more powerful. And I think a reason for that is data becoming a bottleneck.
These models are already trained on essentially all of the data that companies can get their hands on. So while data is becoming a bottleneck, I think more compute still helps, but maybe not as much as it used to. And the reason for that is that perhaps ironically, more compute allows one to build smaller models with the same capability level.
And that's actually the trend we've been seeing over the last year or so. As you know, the models today have gotten somewhat smaller and cheaper than when GPT-4 initially came out, but with the same capability level. So I think that's probably going to continue. Are we going to see a GPT-5 that's as big a leap over GPT-4 as GPT-4 was over GPT-3? I'm frankly skeptical.
Can we just take them one by one there? There were a lot of great things there that I just want to unpack. You said there about kind of potentially the shortage of data being the bottleneck to performance.
A lot of people say, well, there's a lot of data that we haven't mined yet. The obvious example that many have suggested is YouTube, which has, I think, 150 billion hours of video. And then secondarily to that, synthetic data, the creation of artificial data that isn't in existence yet. To what extent are those effective pushbacks?
Right. So there are a lot of sources that haven't been mined yet. But when we start to look at the volume of that data, how many tokens is that? I think the picture is a little bit different. 150 billion hours of video sounds really impressive.
But when you put that video through a speech recognizer and actually extract the text tokens out of it and deduplicate it and so forth, it's actually not that much. It's an order of magnitude smaller than what some of the largest models today have already been trained with.
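To make that back-of-envelope concrete: the question is how many usable text tokens a given number of video hours actually yields once you account for how much of it contains speech and how much survives transcription, deduplication, and quality filtering. A minimal sketch, where every parameter is an illustrative assumption rather than a measured figure:

```python
# Back-of-envelope: hours of video -> usable text tokens after transcription.
# Every parameter is an illustrative assumption; the result is dominated by
# how many hours contain meaningful speech and by how much survives
# deduplication and quality filtering, which is the point being made above.
def usable_text_tokens(video_hours: float,
                       speech_fraction: float = 0.5,      # assumed share of hours with speech
                       words_per_minute: float = 150.0,   # typical speaking rate
                       tokens_per_word: float = 1.3,      # rough tokenizer ratio
                       survives_filtering: float = 0.05   # assumed unique, high-quality share
                       ) -> float:
    words = video_hours * speech_fraction * 60 * words_per_minute
    return words * tokens_per_word * survives_filtering
```

Plugging in different assumptions moves the answer by orders of magnitude, which is why a headline hour count says little on its own about how much new training data is really there.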
Now training on video itself, instead of text extracted from the video, that could lead to some new capabilities, but not in the same fundamental way that we've had before, where you have the emergence of new capabilities, right? Models being able to do things that people just weren't anticipating.
So, like the kind of shock that the AI community had back in the day when, I think it was GPT-2,
was trained primarily on English text, and they had actually tried to filter out text in other languages to keep it clean, but a tiny amount of text from other languages had gotten into it, and it turned out that that was enough for the model to pick up a reasonable level of competence for conversing in various other languages.
So these are the kinds of emergent capabilities that really spooked people, that has led to both a lot of hype and a lot of fears about what bigger and bigger models are going to be able to do. But I think that has pretty much run out because we're training on all of the capabilities that humans have expressed, like translating between languages, and have already put out there in the form of text.
So if you make the data set a little bit more diverse with YouTube video, I don't think that's fundamentally going to change. Multimodal capabilities, yes, there's a lot of room there. But new, emergent text capabilities, I'm not sure. What about synthetic data?
What about the creation of new data that doesn't exist yet?
Yeah, let's talk about synthetic data. So there's two ways to look at this, right? So one is the way in which synthetic data is being used today, which is not to increase the volume of training data, but it's actually to overcome limitations in the quality of the training data that we do have.
So for instance, if in a particular language there's too little data, you can try to augment that, or you can try to have a model, you know, solve a bunch of mathematical equations and throw that into the training data. And so for the next training run, that's going to be part of the pre-training. And so the model will get better at doing that.
And the other way to look at synthetic data is, okay, you take 1 trillion tokens, you train a model on it, and then you output 10 trillion tokens, so you get to the next bigger model, and then you use that to output 100 trillion tokens. I'll bet that that's just not going to happen. That's just a snake eating its own tail, and...
What we've learned in the last two years is that the quality of data matters a lot more than the quantity of data. So if you're using synthetic data to try to augment the quantity, I think it's just coming at the expense of quality. You're not learning new things from the data. You're only learning things that are already there.
While we're on the utility value of data, when we look at the effectiveness of agents, I've had Alex Wang at Scale AI on the show, and he said the hardest thing about building effective agents is that most of the work that one does in an organization, you don't actually codify down in data. You remember when you were at school and it said, show your thinking or show your work.
You don't do that in an organization. You draw on the whiteboard, you map it out, and then you put down what you think in the document. The whiteboard is often not captured in a data source. To what extent do we have the "show your work" data for models and agents to actually do work in a modern enterprise?
Yeah, I think that's really spot on. I think one way in which people's intuitions have been kind of misguided by the rapid improvements in LLMs is that all of this has been in the paradigm of learning from data on the web that's already there. And once that runs out, you have to switch to new kinds of learning. An analogy is riding a bike. That's just kind of tacit knowledge.
It's not something that's been written down. So a lot of what happens in organizations is the cognitive equivalent of I think what happens in the physical skill of riding a bike.
And I think for models to learn a lot of these diverse kinds of tasks that they're not going to pick up from the web, you have to have the cycle of actually using the AI system in your organization and for it to learn from that back and forth experience instead of just passively ingesting.
To what extent do you think enterprises today are willing to let passive AI products into their enterprises to observe, to learn, to test? And is there really that willingness, do you think?
It's got to be more than passive observation. You have to actually deploy AI to be able to get to certain types of learning. And I think that's going to be very slow. And I think a good analogy is self-driving cars, of which we had prototypes two or three decades ago.
But for these things to actually be deployed, you have to roll it out on slightly larger and larger scales while you collect data, while you make sure you get to the next nine of reliability, from four nines of reliability to five nines of reliability. So it's that very slow rollout process. It's a very slow feedback loop.
And I think that's going to happen with a lot of AI deployment and organizations as well.
You said about smaller models. Help me just understand again. I'm sorry. The show is very successful, Arvind, because I think I ask the questions that everyone asks but is too afraid to actually admit they don't know the answers to. Why are we seeing this trend towards smaller models?
And why do we think that is the most likely outcome in the model landscape to have a world of many smaller models?
Yeah, thank you for asking that. That's not obvious at all. My view is that in a lot of cases, the adoption of these models is not bottlenecked by capability. If these models were actually deployed today to do all the tasks that they're capable of, it would truly be a striking economic transformation. The bottlenecks are things other than capability. And one of the big ones is cost.
And cost, of course, is roughly proportional to the size of the model. And that's putting a lot of downward pressure on model size.
And once you get a model small enough that you can run it on device, that of course opens up a lot of new possibilities, both in terms of privacy, you know, people are much more comfortable with on-device models, especially if it's something that's going to be listening to their phone conversations or looking at their desktop screenshots, which are exactly the kinds of AI assistants that companies are building and pushing.
And just from the perspective of cost, you don't have to dedicate servers to run that model. So I think those are a lot of the reasons why companies are furiously working on making models smaller without a big hit in capability.
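A rough sketch of why serving cost tracks model size: a transformer forward pass costs on the order of 2·N FLOPs per generated token, so per-token cost scales roughly linearly with parameter count. The throughput, utilization, and hourly price below are illustrative assumptions, not any provider's actual numbers:

```python
# Rough serving cost per million tokens, using the ~2*N FLOPs-per-token rule of thumb.
# Accelerator throughput, utilization, and rental price are illustrative assumptions.
def usd_per_million_tokens(params: float,
                           peak_flops_per_s: float = 1e15,  # assumed accelerator throughput
                           utilization: float = 0.4,        # assumed effective utilization
                           usd_per_hour: float = 2.0        # assumed hourly rental price
                           ) -> float:
    flops_needed = 2 * params * 1_000_000                   # ~2*N FLOPs per token, for 1M tokens
    seconds = flops_needed / (peak_flops_per_s * utilization)
    return seconds / 3600 * usd_per_hour

# Because the estimate is linear in params, an 8B-parameter model comes out
# roughly ten times cheaper per token than an 80B one under these assumptions.
```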
Will Moore's law not mean cost goes down dramatically in actually a relatively short three to five year period?
You're right. Cost is going down dramatically. In certain applications, cost is going to become much less of a barrier, but not across the board.
Where does it become a barrier and where does it not?
So there's this interesting concept called Jevons Paradox. And this was first observed in the context of coal in England in the 19th century. When coal mining got cheaper, there was more demand for coal. And so the amount invested into coal mining actually increased. And I predict that we're going to see the same thing with models. When models get cheaper, they're put into a lot more things.
And so the total amount that companies are spending on inference is actually going to increase. In an application like a chatbot, let's say, you know, it's text in, text out, no big deal. I think costs are going to come down. Even if someone is chatting with a chatbot all day, it's probably not going to get too expensive.
On the other hand, if you want to scan all of someone's emails, for instance, right? If a model gets cheaper, you know, you're just going to have it running always on in the background. And then from emails, you're going to get to all their documents, right? And some of those attachments might be many megabytes long.
And so there, even with Moore's law, I think cost is going to be significant in the medium term. And then you get to applications like writing code, where what we're seeing is that it's actually very beneficial to let the model do the same task tens of times, thousands of times, sometimes literally millions of times and pick the best answer.
So in those cases, it doesn't matter how much cost goes down. You're going to just proportionally increase the number of retries so that you can get a better quality of output.
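The pattern being described is best-of-N sampling: draw many candidate solutions and keep the one that scores best, which for code usually means scoring against a test suite. A minimal sketch, where generate and score are hypothetical stand-ins for a model call and an evaluation harness:

```python
from typing import Callable

# Best-of-N sampling: draw N candidates and keep the highest-scoring one.
# `generate` and `score` are hypothetical stand-ins for a model call and,
# say, a unit-test pass rate; inference cost grows linearly with N.
def best_of_n(prompt: str, n: int,
              generate: Callable[[str], str],
              score: Callable[[str], float]) -> str:
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)
```

This is also why cheaper inference does not necessarily mean lower spend: if quality keeps improving with N, the budget tends to go toward more samples rather than savings.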
So we have smaller models, and they're effective, as we said, because of cost, and they're popular because of cost. What does that do to the requirements in terms of compute?
So there is training compute, which is when the developer is building the model. And then there is inference compute, when the model is being deployed and the user is using it to do something. And it might seem like really the training cost is the one we should worry about, since it's trained on all of the text on the internet or whatever.
But it turns out that over the lifetime of a model, when you have billions of people using it, the inference cost actually adds up. And for many of the popular models, that's the cost that dominates. So let's talk about each of those two costs.
With respect to training costs, if you want to build a smaller model at the same level of capability or without compromising capability too much, you have to actually train it for longer. So that increases training costs. But that's maybe okay because you have a smaller model. You can push it to the consumer device or even if it's running on the cloud, your server costs are lower.
So your training cost increases, your inference cost decreases. But because it's the inference cost that dominates, the total cost is probably going to come down. So total cost comes down. If you have the same workload and you have a smaller model doing it, then the total cost is going to come down.
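A worked version of that trade-off, using the standard approximations of roughly 6·N·D FLOPs to train an N-parameter model on D tokens and roughly 2·N FLOPs per inference token; every number below is an illustrative assumption, not any lab's actual budget:

```python
# Lifetime compute: training (~6*N*D FLOPs) plus inference (~2*N FLOPs per token).
# All figures are illustrative assumptions.
def lifetime_flops(params: float, train_tokens: float, inference_tokens: float) -> float:
    return 6 * params * train_tokens + 2 * params * inference_tokens

big   = lifetime_flops(params=7e10, train_tokens=2e12,   inference_tokens=1e14)
small = lifetime_flops(params=1e10, train_tokens=1.5e13, inference_tokens=1e14)

# The smaller model's training run costs more (9e23 vs 8.4e23 FLOPs), but
# because inference dominates at this usage level, its lifetime total
# (~2.9e24) comes in well under the bigger model's (~1.5e25).
```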
When we think about the alignment of compute and models, we had David Cahn from Sequoia on the show, and he said that you would never train a frontier model on the same data center twice. Meaning that essentially there is now a misalignment: the development speed of models is much faster than the development speed of new hardware and compute. How do you think about that?
We are releasing new models so fast that hardware is unable to keep up with them. And as a result, you won't want to train your new model on old H100 hardware that is 18 months old. You continuously need the newest hardware for every single new frontier model.
Sure. I think we are still in a period where, you know, these models have not yet quite become commoditized. There's obviously a lot of progress and there's a lot of demand on hardware as well. Hardware cycles are also improving rapidly. But, you know, there's the saying that every exponential is a sigmoid in disguise. So a sigmoid curve is one that looks like an exponential at the beginning.
So imagine the S letter shape. But then after a while, it has to taper off like every exponential has to taper off. So I think that's going to happen both with models as well as with these hardware cycles. We are, I think, going to get to a world where models do get commoditized.
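For reference, the curve being described is the logistic (sigmoid) function, which is indistinguishable from an exponential early on and then saturates at a ceiling:

$$
f(t) = \frac{L}{1 + e^{-k(t - t_0)}} \;\approx\; L\,e^{k(t - t_0)} \quad \text{for } t \ll t_0, \qquad f(t) \to L \ \text{as } t \to \infty.
$$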
Speaking of that commoditization, the thing that I'm interested by there is kind of the benchmarking or the determination that they are suddenly commoditized or kind of equal performance. You said before LLM evaluation is a minefield. Help me understand why is LLM evaluation a minefield?
A big part of it is this issue of vibes, right? So you evaluate LLMs on these benchmarks, and a model seems to perform really well on the benchmarks, but then the vibes are off. In other words, you start using it and somehow it doesn't feel adequate. It makes a lot of mistakes in ways that are not captured in the benchmark.
And the reason for that is simply that when there is so much pressure to do well on these benchmarks, developers are intentionally or unintentionally optimizing these models in ways that look good on the benchmarks, but don't look good in real world evaluation.
So when GPT-4 came out and OpenAI claimed that it passed the bar exam and the medical licensing exam, people were very excited slash scared about what this means for doctors and lawyers. And the answer turned out to be approximately nothing. Because it's not like a lawyer's job is to answer bar exam questions all day.
These benchmarks that models are being tested on don't really capture what we would use them for in the real world. So that's one reason why LLM evaluation is a minefield. And there's also just a very simple factor of contamination. Maybe the model has already trained on the answers that it's being evaluated on in the benchmark. And so if you ask it new questions, it's going to struggle.
We shouldn't put too much stock into benchmarks. We should look at people who are actually trying to use these in professional contexts, whether it's lawyers or, you know, really anybody else. And we should go based on their experience of using these AI assistants.
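One common, if crude, way developers probe the contamination problem mentioned above is to look for long n-gram overlaps between benchmark items and the training corpus. A minimal illustrative sketch; the corpus, the items, and the choice of n are all assumptions here, and real pipelines are considerably more careful:

```python
# Toy contamination check: flag a benchmark item if any of its n-grams also
# appears verbatim in a training document. Real checks are more elaborate,
# but the core idea of long-substring overlap is the same.
def ngrams(text: str, n: int = 8) -> set:
    toks = text.lower().split()
    return {tuple(toks[i:i + n]) for i in range(max(len(toks) - n + 1, 0))}

def is_contaminated(benchmark_item: str, training_docs: list, n: int = 8) -> bool:
    item_grams = ngrams(benchmark_item, n)
    return any(item_grams & ngrams(doc, n) for doc in training_docs)
```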
We mentioned that, you know, some of the early use cases in terms of passing the bar, some real kind of wild applications in terms of how models are applied. I do just want to kind of move a layer deeper to the companies building the products and the leaders leading those companies. You've got Zuck and Demis, who are saying that AGI is further out than we think.
And then you have Sam Altman and you have Dario and Elon in some cases saying it's sooner than we think. What are your reflections and analysis on company leader predictions on AGI?
So let's talk for a second about what AGI is. Different people mean different things by it and so often talk past each other. The definition that we consider most relevant is AI that is capable of automating most economically valuable tasks. By this definition, if we did have AGI, that would truly be a profound thing in our society.
So now for the CEO predictions, I think one thing that's helpful to keep in mind is that there have been these predictions of imminent AGI since the earliest days of AI, for more than half a century, going back to Alan Turing. When the first computers were built or about to be built, people thought, you know, the two main things we need for AI are hardware and software. We've done the hard part, the hardware.
Now there's just one thing left, the easy part, the software. But of course, now we know how hard that is. So I think historically what we've seen, it's kind of like climbing a mountain. Wherever you are, it looks like there's just kind of one step to go. But when you climb up a little bit further, the complexity reveals itself. And so we've seen that over and over and over again.
Now it's like, oh, you know, we just need to make these bigger and bigger models. So you have some silly projections based on that. But soon the limitations of that started becoming apparent. And now the next layer of complexity reveals itself. So that's my view. I wouldn't put too much stock into these overconfident predictions from CEOs.
Is it possible to have a dual strategy of chasing AGI and superintelligence, as OpenAI very clearly are, and creating valuable products at the same time that can be used in everyday use? Or is that balance actually mutually exclusive?
I certainly think the balance is possible. To some extent, every big company does this.
If I push you: if the priority at OpenAI is, say, achieving superintelligence and AGI, then their best researchers, their best developers, the core of their budgets will go to that. When you have dual priorities, one takes priority. And so there is that conflict.
That's fair. And I think, you know, it would take discipline from management to be able to pull it off in a way that one part of the company doesn't distract another too much. And we've seen this happen with OpenAI, which is that the folks focused on superintelligence didn't feel very welcome at the company.
And there has been an exodus of very prominent people, and Anthropic has picked up a lot of them. So it seems like we're seeing a split emerging where OpenAI is more focused on products and Anthropic is more focused on superintelligence. While I can see the practical reasons why that is happening, I don't think it's impossible to have disciplined management that focuses on both objectives.
What did you mean when you said to me that AI companies should pivot from creating gods to building products?
In the past, they didn't have this balance. They were so enamored by this prospect of creating AGI that they didn't think there was a need to build products at all. And the craziest example for me is when OpenAI put out ChatGPT, there was no mobile app for six months. And the Android app took even longer than that.
You know, there was this assumption that ChatGPT was just going to be this kind of demo to show off the capabilities of the models. And OpenAI was, you know, in the business of building these models, and third-party developers would take the API and put it into products. But really, the thinking was that AGI was coming so quickly that even the notion of productization seemed obsolete.
This was, you know, I'm not trying to put words in anyone's mouth, but this was kind of a coherent, but in my view, incorrect philosophy that I think a lot of AI developers had. And I think that has changed quite a bit now. And I think that's a good thing. So if they had to pick one, I think they should pick building products.
But it certainly doesn't make sense for a company to be just an AGI company and not try to build products, not try to build something that people want. And just assuming that AI is going to be so general, that it's just going to, you know, do everything that people want, and that the company doesn't actually need to make products.
Do you think it's even possible for companies to compete in any level of AGI pursuit? When you look at the players and the cash that they're willing to spend, you know, Zuck has committed $50 billion over the next three years. When you look at how much OpenAI has raised over the last three years, even if they carry on at that run rate, it's something crazy:
they'd still be $38 billion short of Zuck's spend over a three-year period. Can you create AGI-like products or God-like products unless you are Google, Amazon, Apple, or Facebook?
So I don't know is the short answer. But at the same time, you know, we've been in this kind of historically interesting period where a lot of progress has come from building bigger and bigger models that need not continue in the future. It might. Or what might happen is that the models themselves get commoditized and a lot of the interesting development happens in a layer above the models.
We're starting to see a lot of that happen now with AI agents. And if that's the case, great ideas could come from anywhere, right? It could come from a two-person startup. It could come from an academic lab. And my hope is that we will transition to that kind of mode of progress in AI development relatively soon.
With the commoditization of those models and the appreciation that value can be built on top of them, does that not go back to what I said, though, which is that really there are three to four core models which are financed by cash cow cloud businesses? You know, the obvious ones: there's Amazon, there's Google. And then for Facebook, there's obviously Instagram and News Feed.
And there are three large model providers which sit as the foundational model there. And then every bit of value is built on top of them.
I think that's a very serious possibility. And I think this is actually one area where regulators should be paying attention. You know, what does this mean for market concentration, antitrust, and so forth. And I've been gratified that these are topics that, at least in my experience, US regulators are considering.
And I believe in the UK, the CMA, the Competition and Markets Authority as well, and certainly in the EU. So yeah, in many jurisdictions, now that I think about it, this is something that regulators have been worried about.
As you said at the beginning about your work on policy, if you had US regulators and European regulators in front of you, what would you put forward as the most proactive and effective policy for US and European regulation around AI and models?
So in a sense, AI regulation is a misnomer. Let me give you an example from just this morning. The FTC, the Federal Trade Commission in the US, which is an antitrust and consumer protection authority, has been worried about people writing fake reviews for their products. And this has, of course, been a problem for many years. It's become a lot easier to do that with AI.
So now someone who thinks about this in terms of AI regulation might say, oh, you know, regulators have to ensure that AI companies don't allow their products to be used for generating fake reviews. And I think this is a losing proposition. Like how would an AI model know whether something is a fake review or a real review, right? It just depends on who's writing the review.
But instead, that's not the approach that the FTC took. They recognized correctly that it's a problem whether AI is generating the fake review or people are. So what they actually banned is fake reviews. And so what is often thought of as AI regulation is better understood as regulating certain harmful activities, whether or not AI is used as a tool for doing those harmful activities.
80% of what gets called AI regulation is better seen this way.
When I had Ethan Mollick on from Wharton, he was like, you know, the best thing to do actually is an allow-and-watch policy. He had a much more academic way of naming it than that, with, you know, a wonderful principle from some ancient learning professor.
But he said essentially we should let everything flourish and then regulate from there rather than proactively regulate ahead of time, not knowing outcomes. Does that ring true to you?
I broadly agree with that. I will add a couple of things to that. One is that there are many kinds of harms which we already know about and which are quite serious.
So the use of AI to make non-consensual deepfakes, for instance, deepfake nudes. This has affected, you know, thousands, perhaps hundreds of thousands of people, primarily women, around the world, and governments are taking action now, finally. So that's a good thing.
Just on the verification side there, and kind of what you said about the deepfakes, I think it was Sayash who said on Twitter recently, with a great highlight, that the biggest danger of AI to him was actually not that we would believe fake news. It was that we would start to distrust and not believe real news.
I agree. So we call this the liar's dividend. People have been worried, for instance, about bots creating misinformation with AI and influencing elections and that sort of thing. We're very, very skeptical that that's going to be a real danger.
How can you be skeptical that that's a real danger? We're a media company. We have amazing media people who use AI every day. We could create some terrible things with AI today that people would believe.
But you could have created those things without AI.
We could not have created Trump fakes with his voice declaring war on China. I could do a fake show with Trump today and release it and pretend that it's real and have him declare war on China.
Yeah, I think that's fair. But I think the reason that might fool a lot of people is because it came from a legitimate media company. So I think the ability to do this, you know, emphasizes some of the things that have always been important, but have now become more important, like source credibility.
Do we not see then in that world that actually a lot more value accrues to significant mainstream media outlets who are verified and have brand validity already?
That actually is our prediction. People we predict are going to be forced to rely much more on getting their news from trusted sources.
Does that worry you? Like, I understand, but sadly, I don't think people are always as smart as we give them credit for. And when you look at the spread of misinformation, and when you look at the willingness to accept misinformation from large swathes of the population, a tweet with an AI-generated picture of whatever it could be can create such societal damage.
This is really worrying.
So misinformation is a problem. In a way, I think misinformation is more of a symptom than a cause. Misinformation slots into and affirms people's existing beliefs as opposed to changing their beliefs. And I think the impact of AI here, again, has been tremendously exaggerated. Sure, you can create a Trump deepfake like you were talking about.
But when you look at the misinformation that's actually out there, it's things that are as crude as video game footage. Because again, it's telling people what they want to believe in a situation where they're not very skeptical.
You said there about kind of confirming existing beliefs. Does that distinction matter, though? Because actually, you know, we've had riots in the UK in the last few weeks.
And actually you could have AI-generated images with many more migrants or many more rioters than there actually are, with the incitement that you should join because this is happening. And the confirmation material, which is that AI-generated material, leads people to take action. It doesn't actually matter. The point is it incites action.
For sure, yeah. But I want to push on, you know, is this really an AI problem? These are, you know, deep problems in our society. So creating an image, you know, that looks like there were a lot more people there than there were. Yeah, it's become easier to do that with AI today. But you could have paid someone $100 to do that with Photoshop, you know, even before AI. It's a problem we've had.
It's a problem we have been dealing with, often not very successfully. My worry is that if we treat this as a technology problem and try to intervene on the technology, we're going to miss what the real issues are and the hard things that we need to be doing to tackle those issues, which are, you know, which relate to issues of trust in society.
And to the extent it's a technology problem, it's more of a social media problem, really, than an AI problem, because the hard part of misinformation is not generating it, it's distributing it to people and persuading them. And social media is often the medium for that. And so I think there should be more responsibility placed on social media companies.
And my worry is that treating this as an AI problem is distracting from all of those more important interventions.
So are social media companies, aka the distribution platforms, are they the arbiters of justice on what is a malicious AI image versus what isn't?
Yeah, I think the primary control is being exercised today by social media companies.
I feel like I worry more about this than you on the content misinformation side. And so I'm intrigued on the concerns that you have. What would you say is a more pressing concern for you?
So when we were talking about deepfakes, I'm much less worried about misinformation deepfakes and more worried about the deepfake nudes that I was talking about, right? So those are things that can destroy a person's life. It's been shocking to me how little attention this got from the press and from policymakers until it happened to Taylor Swift a few months ago. And then it got a lot of attention.
So there were deepfake nudes of Taylor Swift posted on Twitter slash X. And after that, you know, policymakers started paying attention. But it has been happening for many years now, even before the latest wave of generative AI tools. So that's the type of misuse that is very clear.
And then there are other kinds of misuses that are not necessarily dangerous in the same way, but impose a lot of costs on society.
So when students are using AI to do their homework, for instance, now high school teachers and college teachers everywhere have to revamp how they're teaching in order to account for the fact that students are doing this and there is no way really to catch AI generated text or homework answers. And so these are costs upon society. I'm not saying that the availability of AI makes education worse.
I don't think that's necessarily the case. But it forces a lot of costs upon the education system. And ideally, AI companies should be bearing some of that cost.
One thing I think we're just so far off on, Arvind, is medical doctors. Everyone's like, you're going to have a GP in your pocket with AI. Are you high? GPs feel your elbow. They look at x-rays, look inside your ear and see very specific things. They look up your nose. You're not going to shove your smartphone up your nostril. You know, it can't feel your arm.
Can you help me understand why I'm wrong and why AI will revolutionize medical with a GP in everyone's pocket?
Sure. So I don't think you're wrong. I think the reason there is a lot of talk about this goes back to something we've observed over and over, which is that when there are problems with an institution like the medical system, right, like the wait times are too long, or it's too costly, or in a lot of countries, you know, people don't even have access.
You know, in developing countries, there might be entire villages with no physician. Then this kind of technological band-aid becomes very appealing. So I think that's what's going on here. I think the responsible way to use AI in medicine is for it to be integrated into the medical system. And actually, the medical system has been a very enthusiastic adopter of technology, including AI.
So you can consider CAT scans, for instance, to be a form of AI to be able to reconstruct what's going on inside a person based on certain imaging. And now with generative AI as well, there's a lot of interest from the medical system in figuring out, can this be useful for diagnosis or for more mundane things like summarizing medical notes and so forth. So I think that work is really important.
I think that should continue. It still does leave us with the harder question of, you know, here in America, if it takes me three weeks to get a GP appointment, it's very tempting to ask ChatGPT a question about my symptoms. So what do we do about that? You know, can that actually be helpful with appropriate guardrails? Or should that be discouraged?
I don't know the answer to that.
I'm glad that I'm not alone in my skepticism there. Because applied to education, again, everyone says it's amazing, you have a tutor in your pocket. Yeah, but we also already have videos that we can watch at home. A tutor has a personal relationship. It's one-to-one, where I want to impress you, and I have that personal desire to fulfill my abilities and potential, which AI doesn't have.
How do you think AI impacts the future of education, one-on-one tuition, and that up-leveling of students?
I think there's different populations of students. There's a small subset of learners who are very self-motivated and will learn very well, even if there's no physical tutor. There are those kinds of learners at all different levels. And then there's the vast majority of learners for whom the social aspect of learning is really the most critical thing.
And if you take that away, they're just not going to be able to learn very well. And I think this is often forgotten, especially because in the AI developer community, there are a lot of these self-taught learners. I'm among them, right? I just paid zero attention throughout school and college and everything that I know literally is stuff that I taught myself. So I grew up in India.
The education system wasn't very great there. Our geography teacher thought that India was in the southern hemisphere. True story. Right. So again, I literally mean it when I say everything that I know I taught myself. And so, you know, you have a lot of AI developers who are thinking of themselves as the typical learner, and they're not.
And I think for someone like me, AI is, on a daily basis, an incredible tool for learning. I use generative AI tools for learning. It's a new way of learning compared to a book or really anything else. You know, I can't summarize my understanding of a topic to a book and ask it if I'm right. These are things I can do with AI.
But I'm very skeptical that these new kinds of learning are going to get to a point anytime soon where they're going to become the default way in which people learn.
Do you think people dramatically overestimate the fear of job replacement? We always see fears of job replacement with any new technology, and then it tends to create a lot more jobs than there previously were. Do you think that is the case here, or do you think job replacement fears are justified?
I think for now, they are very much overblown. My favorite example of the thing you said of technology creating jobs is bank tellers. When ATMs became a thing, it would have been reasonable to assume that bank tellers were just going to go away. But in fact, the number of tellers increased. And the reason for that is that it became much cheaper for banks to open regional branches.
And once they did open those regional branches, they did need humans for some of the things that you couldn't do with an ATM. And, you know, the more abstract way of saying that is, as economists would put it, jobs are bundles of tasks, and AI automates tasks, not jobs.
So if there are, you know, 20 different tasks that comprise a job, the odds that AI is going to be able to automate all 20 of them are pretty low. And so there are some occupations certainly that have already been affected a lot by AI like translation or stock photography. But for most jobs out there, I don't think we're anywhere close to that.
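As a purely illustrative calculation of the "bundle of tasks" point, assuming (unrealistically) that each of a job's 20 tasks is automatable independently with the same probability p:

$$
P(\text{all 20 tasks automated}) = p^{20}, \qquad p = 0.8 \;\Rightarrow\; 0.8^{20} \approx 0.012.
$$

So even a high per-task automation rate leaves most full jobs only partially automated under these assumptions.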
Another one that does worry me is actually defense. We had Alex Wang from Scale AI on, who I mentioned earlier. He said that AI has the potential to be a bigger weapon than nuclear weapons. How do you think about that? And if that is the case, should we really have open models?
So I think it's a good question to ask. I think it's a bit of a category error there. I mean, a nuclear weapon is an actual weapon. AI is not a weapon. AI is something that, you know, might enable adversaries to do certain things more effectively. For example, find vulnerabilities, cybersecurity vulnerabilities in critical infrastructure, right?
So that's one way in which AI could be used on the quote unquote battlefield. So that being the case, I think it would be a big mistake to view it analogously to a weapon and to argue that it should be closed up for a couple of reasons. First of all, it's not going to work at all. So I think we have close to state of the art AI models that can already run on people's personal devices.
And I think that trend is only going to accelerate. We talked earlier about Moore's law, and it still continues to apply to these models. And even if one country decides that models should be closed, the odds of getting every country to enact that kind of rule are just vanishingly small.
So if our approach to safety with AI is going to be premised on ensuring that, quote unquote, bad guys don't get access to it, we've already lost, because it's only a matter of time before it becomes impossible to do that.
And instead, I think we should radically embrace the opposite, which is to figure out how we're going to use AI for safety in a world where AI is very widely available, because it is going to be widely available. And when we look at how we've done that in the past, it's actually a very reassuring story.
When we go back to the cybersecurity example, for 10 or 20 years, the software development community has been using automated tools, some of which you could call AI, to improve cybersecurity because software developers can use them to find bugs and fix bugs in software before they put them out there, before hackers even have a chance to take a crack at them.
So my hope is that the same thing is going to happen with AI. We're going to be able to acknowledge the fact that it's going to be widely available and to shape its use for defense more than offense.
Arvind, what did you believe about the developments that we've seen in AI over the last two years that you now no longer believe?
Like a lot of people, I was fooled by how quickly after GPT-3.5, GPT-4 came out. It was just three months or so, but it had been in training for 18 months. That was only revealed later. So it gave a lot of people, including me, an inflated idea of how quickly AI was progressing.
And what we've seen in the nearly year and a half since GPT-4 came out is that we haven't really had models that have surpassed it in a meaningful way. And this is not based on benchmarks. Again, I think benchmarks are not that useful. It's more based on vibes. When you get people using these things, what do they say? I don't think models have really qualitatively improved on GPT-4.
And I don't think things are moving as quickly as I did 12 months ago.
And the reasons for that lack of progression, sorry.
Making models bigger and bigger doesn't seem to be working anymore. I think new developments have to come from different scientific ideas. Maybe it's agents, maybe it's something else.
What do you think society's biggest misconception of AI is today?
I think our intuitions are too powerfully shaped by sci-fi portrayals of AI. And I think that's really a big problem. This idea that AI can become self-aware. When we look at the way that AI is architected today, that kind of fear has no basis in reality. Maybe one day in the future, people are going to build AI systems where that becomes at least somewhat possible. And we should have...
visibility, transparency, monitoring, and regulation around these systems to make sure that developers don't. But that would be a choice. That's a choice that society can make, that governments and companies can make. It's not that despite our best efforts, AI is going to become conscious and have agency and do things that are harmful to humanity.
That whole line of fear, I think, is completely unfounded.
I'd love to do a quick fire round. I could talk to you all day, but I'd love to give you a quick fire round. So I say a short statement, you give me your immediate thoughts. Does that sound okay? Let's do it. Why are AI leaderboards no longer useful?
Because the gap between benchmarks and the real world is big and it's only growing bigger. As AI becomes more useful, it's harder to figure out how useful it is based on these artificial environments.
If you were CEO of OpenAI for a day, what would you do?
I would resign. I don't think I would be a good CEO. But if there were one thing I could change about OpenAI, I think the need for the public to know what is going on with AI development overrides the commercial interests of any company. So I think there needs to be a lot more transparency.
What is your vision for the future of agents?
So my hope is for the kind of thing we saw in the movie Her, not the sci-fi aspects of it, but the more kind of mundane aspects of it, where you give your device a command and it interprets it in a pretty nuanced way and does what you want it to do, right? Book flight tickets, for instance, or really build an app based on what you want it to look like.
So these are things that are potentially automatable, don't have like massively dubious societal consequences. Those are the things that I hope can happen.
Will companies increasingly move into the chip and compute layer and compete with NVIDIA? Or do you think it will be a continuous NVIDIA monopoly, all of them buying from NVIDIA?
I do find it interesting that NVIDIA itself has been trying to migrate really, really hard out of hardware into becoming a services company.
And everyone tries to migrate into that business. Why is tech policy frustrating 90% of the time?
A lot of technologists kind of have a disdain for policy. They see policymakers as, well, morons, to put it bluntly. But I don't think that's the case. I think there are a lot of legitimate reasons why policy is very slow and doesn't often go in the way that a tech expert might want it to. And that's the 90% frustration. And the reason I say it's only 90% is that the other 10% is really worth it.
We really need policy. And despite how frustrating it is, we need a lot of tech experts in policy.
Yann LeCun or Geoffrey Hinton, which side are you on?
I have to say, I really like Yann LeCun's perspectives on various things, including his view that LLMs are, quote unquote, an off-ramp to superintelligence, that, in other words, we need a lot more scientific breakthroughs, as well as his tamping down of the fears of super advanced AI.
What question are you never asked that you should be asked?
It's weird for me to be saying this, but I have to say, think of the children. I'm never asked this, and what I mean by that is that the role of AI in kids' lives, kids who are born today, for instance, is going to be so profound. And it's something that technologists should be thinking about. Every parent should be thinking about.
Policymakers should be thinking about because it can be profoundly good or profoundly bad or anything in between. And both as a technologist and as a parent, I think about that a lot.
Listen, Arvind, as I said, I've loved your writing. I can't wait for the book. Thank you so much for putting up with my deviating questions, but I've so enjoyed having you on the show.
This has been really, really fun. I apologize for rambling occasionally, but I hope that it's, yeah, I'm really looking forward to hearing it when it's out there.
I have to say, I so enjoyed that show with Arvind. I think it's so cool that a podcast can get you into the same room as professors from incredible universities like Princeton, like Carnegie Mellon. It's so cool. I love doing this episode. Thank you so much again to Arvind for being so great. And I really hope you enjoyed it.
As always, I so appreciate all your support and stay tuned for an incredible episode of 20 Growth coming on Friday with Scott Gawlik on the early days of Uber's scaling.