Dwarkesh Patel
Today, I'm speaking with Andrej Karpathy.
Andrej, why do you say that this will be the decade of agents and not the year of agents?
Interesting.
So, as a professional podcaster and a…
A viewer of AI from afar.
It's easy to identify for me, like, oh, here's what's lacking.
Continual learning is lacking or multimodality is lacking.
But I don't really have a good way of trying to put a timeline on it.
Like, if somebody's like, how long will continual learning take?
There's no like prior I have about like, this is a project that should take five years, 10 years, 50 years.
Why a decade?
Why not one year?
Why not 50 years?
This is actually quite interesting.
I want to hear not only the history, but what people in the room felt was about to happen at various different breakthrough moments, and what were the ways in which their feelings were either overly pessimistic or overly optimistic?
Should we just go through each of them one by one?
Interesting.
Yeah, I guess if I were to steel-man the Sutton perspective, it would be that humans actually can just take on everything at once, right?
Even animals can take on everything at once, right?
Animals are maybe a better example because they don't even have the scaffold of language.
They just get thrown out into the world and they just have to make sense of everything without any labels.
And the vision for AGI then should just be something which just looks at sensory data, looks at the computer screen, and it just figures out what's going on from scratch.
I mean, if a human were put in a similar situation, that would be training from scratch.
But I mean, this is like a human growing up or an animal growing up.
So why shouldn't that be the vision for AI rather than this thing where we're doing millions of years of training?
I think that's a really good question.
Can you repeat the last sentence?
A lot of that intelligence is not motor tasks, that's what, sorry?
I'm going to take a second to digest that because there's a lot of different ideas.
Maybe one clarifying question I can ask to understand the perspective.
So I think you suggest that, look, evolution is doing the kind of thing that pre-training does in the sense of building something which can then understand the world.
The difference, I guess, is that evolution...
has to be titrated in the case of humans through three gigabytes of DNA.
And so that's very unlike the weights of a model.
I mean, literally the weights of the model are a brain, which obviously is not encoded in the sperm and the egg, or does not exist in the sperm and the egg.
So it has to be grown.
And also the information for every single synapse in the brain simply cannot exist in the three gigabytes that exist in the DNA.
Evolution seems closer to finding the algorithm
which then does the lifetime learning.
Now, maybe the lifetime learning is not analogous to RL, to your point.
Is that compatible with the thing you were saying, or would you disagree with that?
Just to steel-man the other perspective, because after doing that interview and thinking about it a bit, he has an important point here.
Evolution does not give us the knowledge, really, right?
It gives us the algorithm to find the knowledge.
And that seems different from pre-training.
So perhaps the perspective is that pre-training helps build the kind of entity which can learn better, that it teaches meta-learning, and therefore it is similar to finding an algorithm.
But if it's like evolution gives us knowledge and pre-training gives us knowledge, that analogy seems to break down.
There's so much interesting stuff there.
Okay, so let's start with in-context learning.
This is an obvious point, but I think it's worth just like saying it explicitly and meditating on it.
The situation in which these models seem the most intelligent is when I talk to them and I'm like, wow, there's really something on the other end that's responding to me and thinking about things.
If it like makes a mistake, it's like, oh, wait, that's actually the wrong way to think about it.
I'm backing up.
All that is happening in context.
That's where I feel like the real intelligence you can like visibly see.
And that in-context learning process is developed by gradient descent on pre-training, right?
Like, it spontaneously meta-learns in-context learning.
But the in-context learning itself is not gradient descent, in the same way that our lifetime intelligence as humans, our ability to do things, is conditioned by evolution,
but our actual learning during our lifetime is happening through some other process.
I actually don't fully agree with that, but you should continue with that.
Actually, then I'm very curious to understand how that analogy breaks down.
So then it's worth thinking about, okay, if both of them are implementing gradient descent, sorry, if in-context learning and pre-training are both implementing something like gradient descent, why does it feel like with in-context learning we're actually getting to this continual-learning, real-intelligence-like thing, whereas you don't get the analogous feeling just from pre-training?
At least you could argue that.
And so if it's the same algorithm, what could be different?
Well, one way you can think about it is how much information does the model store per unit of information it receives from training?
And if you look at pre-training, if you look at Llama 3, for example, I think it's trained on 15 trillion tokens.
And if you look at a 70B model, that would be the equivalent of 0.07 bits per token that it sees in pre-training, in terms of the information in the weights of the model compared to the tokens it reads.
Whereas if you look at the KV cache and how it grows per additional token in in-context learning, it's like 320 kilobytes per token.
So that's a 35-million-fold difference in how much information per token is assimilated by the model.
I wonder if that's relevant at all.
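As a quick sanity check on those numbers, here is a minimal sketch of the arithmetic, taking the figures above at face value and assuming 16-bit weights (my assumption, not something stated here):

```python
# Back-of-envelope version of the comparison above. The only assumption beyond
# the figures mentioned in the conversation is 16-bit weights.
params = 70e9                      # Llama 3 70B
pretrain_tokens = 15e12            # 15 trillion training tokens
weight_bits = params * 16
bits_per_token_pretrain = weight_bits / pretrain_tokens      # ~0.07 bits per token seen

kv_bytes_per_token = 320 * 1024    # ~320 KB of KV cache per token of context
bits_per_token_in_context = kv_bytes_per_token * 8           # ~2.6 million bits per token

print(round(bits_per_token_pretrain, 3))                         # ~0.075, quoted as ~0.07
print(round(bits_per_token_in_context / bits_per_token_pretrain))  # ~35 million
```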
Stepping back, what is the part about human intelligence that we have most failed to replicate with these models?
This is maybe relevant to the question of thinking about how fast these issues will be solved.
So sometimes people will say about continual learning, look, actually, you could easily replicate this capability just as in-context learning emerged spontaneously as a result of pre-training.
Continual learning over longer horizons will emerge spontaneously if the model is incentivized to recollect information over longer horizons or horizons longer than one session.
So if there's some outer-loop RL which has many sessions within it, then this continual learning, where the model fine-tunes itself or writes to an external memory or something, will just sort of emerge spontaneously.
Do you think that's plausible?
I just don't really have a prior over how plausible that is.
How likely is that to happen?
Interesting.
In 10 years, do you think it'll still be something like a transformer, but with a much more modified attention and more sparse MLPs and so forth?
It's surprising that all of those things together only halved the error, which is like 30 years of progress.
Maybe half is a lot, because if you halve the error, that actually means that... Half is a lot, yeah.
Yeah, actually, I was about to ask a very similar question about NanoChat.
Because since you just coded it up recently, every single step in the process of building a chatbot is fresh in your RAM.
And I'm curious if you had similar thoughts about like, oh, there was no one thing that was relevant to going from...
GPT-2 to NanoChat.
What are sort of like surprising takeaways from the experience?
What is the best way for somebody to learn from it?
Is it just like delete all the code and try to re-implement from scratch, try to add modifications to it?
Yeah, I think that's a great question.
Interesting.
You tweeted out that coding models were actually of very little help to you in assembling this repository.
And I'm curious why that was.
And they just couldn't internalize that you had your own?
The reason I think this question is so interesting is because the main story people have about AI exploding and getting to superintelligence pretty rapidly is AI automating, AI engineering, and AI research.
And so they'll look at the fact that you can have Claude Code make entire CRUD applications from scratch and be like, if you had this same capability inside of OpenAI and DeepMind and everything, well, just imagine the level of, you know, a thousand of you or a million of you in parallel finding little architectural tweaks.
And so it's quite interesting to hear you say that this is the thing they're sort of asymmetrically worse at.
And it's like quite relevant to forecasting whether the AI 2027 type explosion is likely to happen anytime soon.
I think that's a good way of putting it.
Very naive question, but the architectural tweaks that you're adding to NanoChat, they're in a paper somewhere, right?
They might even be in a repo somewhere.
So is it surprising that they aren't able to integrate that, and that whenever you're like, add RoPE embeddings or something, they do it in the wrong way?
Yeah.
Actually, here's another reason why this is really interesting.
Through the history of programming, there's been many productivity improvements, compilers, linting, better programming languages, etc.,
which have increased programmer productivity, but have not led to an explosion.
So that sounds very much like autocomplete tab.
And this other category is just like automation of the programmer.
And so it's interesting that you see it more as being in the category of the historical analogies, of better compilers or something.
One of the big problems with RL is that it's incredibly information sparse.
Labelbox can help you with this by increasing the amount of information that your agent gets to learn from with every single episode.
For example, one of their customers wanted to train a coding agent.
So Labelbox augmented an IDE with a bunch of extra data collection tools and staffed a team of expert software engineers from their aligner network to generate trajectories that were optimized for training.
Now, obviously these engineers evaluated these interactions on a pass-fail basis, but they also rated every single response on a bunch of different dimensions like readability and performance.
And they wrote down their thought processes for every single rating that they gave.
So you're basically showing every single step an engineer takes and every single thought that they have while they're doing their job.
And this is just something you could never get from usage data alone.
And so Labelbox packaged up all these evaluations and included all the agent trajectories and the corrective human edits for the customer to train on.
This is just one example, so go check out how Labelbox can get you high-quality frontier data across domains, modalities, and training paradigms.
Reach out at labelbox.com slash dwarkesh.
Let's talk about RL a bit.
You tweeted some very interesting things about this.
Conceptually, how should we think about the way that humans are able to build a rich world model just from interacting with our environment, in ways that seem almost irrespective of the final reward at the end of the episode?
If somebody's starting a business, and at the end of 10 years she finds out whether the business succeeded or failed, we say that she's earned a bunch of wisdom and experience, but it's not because the log probs of every single thing that happened over the last 10 years are up-weighted or down-weighted.
It's something much more deliberate and rich is happening.
What is the ML analogy, and how does that compare to what we're doing with LLMs right now?
But you're so good at coming up with evocative phrases.
Sucking supervision through a straw is, like, so good.
Why hasn't—so you're saying, like, your problem with outcome-based reward is that you have this huge trajectory, and then at the end, you're trying to learn every single possible thing about what you should do and what you should learn about the world from that one final bit.
Why hasn't—given the fact that this is obvious—why hasn't process-based supervision—
as an alternative been a successful way to make models more capable?
What has been preventing us from using this alternative paradigm?
You're basically training the LLM to be a prompt injection model.
So to the extent you think this is the bottleneck to making RL more functional, then that will require making LLMs better judges if you want to do this in an automated way.
And then is it just going to be like some sort of GAN-like approach, where you have to train models to be more robust?
Interesting.
Do you have some shape of what the other idea could be?
Yeah.
So I guess I see a very, not easy, but like I can conceptualize how you would be able to train on synthetic examples or synthetic problems that you have made for yourself.
But there seems to be another thing humans do, maybe sleep is this, maybe daydreaming is this, which is not necessarily coming up with fake problems, but just reflecting.
And I'm not sure what the ML analogy is for, you know, daydreaming or sleeping, just reflecting.
I haven't come up with a new problem.
Yeah, yeah.
I mean, obviously, the very basic analogy would just be fine-tuning on reflection bits, but I feel like in practice that probably wouldn't work that well.
So I don't know if you have some take on what the analogy of this thing is.
Just to make sure I understood, the reason that the collapse is relevant to synthetic data generation is because you want to be able to come up with synthetic problems or reflections which are not already in your data distribution?
I guess what I'm saying is...
You can't just keep scaling, quote-unquote, reflection on the same amount of prompt information and then get returns from that.
Have you seen this super interesting paper that dreaming is a way of preventing this kind of overfitting and collapse?
That the reason dreaming is evolutionarily adaptive is to put you in weird situations that are very unlike your day-to-day reality, to prevent this kind of overfitting?
This is a very ill-formed thought, so I'll just put it out and let you react to it.
The best learners that we are aware of, which are children, are extremely bad at recollecting information.
In fact, at the very earliest stages of childhood, you will forget everything.
You're just an amnesiac about everything that happens before a certain age.
But you're extremely good at picking up new languages and learning from the world.
And maybe there's some element of being able to see the forest for the trees.
Whereas if you compare it to the opposite end of the spectrum, you have...
LLM pre-training, where these models are literally able to regurgitate word for word the next thing in a Wikipedia page.
But their ability to learn abstract concepts really quickly the way a child can is much more limited.
And then adults are somewhere in between where they don't have the flexibility of childhood learning, but they can, you know, adults can memorize facts and information in a way that is harder for kids.
And I don't know if there's something interesting about that, and whether it's also relevant to preventing model collapse.
Let me think.
What is a solution to model collapse?
I mean, there's very naive things you could attempt.
It's just like the distribution over logits should be wider or something.
Like, there's many naive things you could try.
What ends up being the problem with the naive approaches?
In fact, it's actively penalized, right?
If you're like super creative in RL, it's like not good.
And then I think you hinted that it's a very fundamental problem.
It won't be easy to solve.
What's your intuition for that?
How many bits should the optimal core of intelligence end up being if you just had to make a guess?
The thing we put on the von Neumann probes, how big does it have to be?
That's actually surprising that you think it will take a billion, because already we have a billion parameter models, or a couple billion parameter models that are like very intelligent.
Well, some of our models are like a trillion parameters, right?
But they remember so much stuff.
Yeah, but I'm surprised that in 10 years, given the pace, okay, we have GPT-OSS-20b, that's way better than GPT-4 original, which was a trillion plus parameters.
Yeah.
So given that trend, I'm actually surprised you think in 10 years, the cognitive core is still a billion parameters.
Yeah, I'm surprised you're not like, oh, it's going to be like tens of millions or millions.
But why is the distilled version still a billion?
Is I guess the thing I'm curious about.
Why would you train on... Right, no, no, but why is the distillation in 10 years not getting below 1 billion?
Oh, you think it should be smaller than a billion?
Yeah, I mean, just like, if you look at the trend over the last few years, just finding low-hanging fruit and going from trillion-plus models to models that are literally two orders of magnitude smaller in a matter of two years and have better performance.
It makes me think the sort of core of intelligence might be even way, way smaller.
Like plenty of room at the bottom, to paraphrase Feynman.
Yeah.
So we're discussing what, like, plausibly could be the cognitive core.
There's a separate question, which is, what will actually be the size of frontier models over time?
And I'm curious to have a prediction.
So we had increasing scale up to maybe 4.5, and now we're seeing decreasing slash plateauing scale.
There's many reasons that could be going on.
But do you have a prediction about going forward?
Will the biggest models be bigger?
Will they be smaller?
Will they be the same?
Do you think the things you're looking for will be similar in kind to the kinds of things that have been happening over the last two to five years?
Like, just in terms of, if I look at NanoChat versus NanoGPT and the architectural tweaks you made, is that basically the flavor of things you expect to continue happening?
Or is there – you're not expecting any giant paradigm shift?
Okay.
This is my general manager, Max.
Good to be here.
And you have been here since you were onboarded about six months ago.
Eight months ago.
Oh, right.
Time passes so fast.
But when I onboarded you, I was in France.
And so we basically didn't get the chance to –
talk at all almost.
And you basically just gave me one login.
I gave you access to my Mercury platform, which is the banking platform that I was using at the time to run the podcast.
I mean, Mercury made the experience of all these things I was doing before so seamless that it didn't even occur to me until you pointed it out that this is not the natural way to start a payroll or invoicing or any of these other things.
All right, you heard him.
Visit mercury.com to apply online in minutes.
Cool, thanks, Max.
Thanks for having me.
Dude, you're great at this.
Mercury is a financial technology company, not a bank.
Banking services provided through Choice Financial Group, Column N.A., and Evolve Bank & Trust, Members FDIC.
People have proposed different ways of charting how much progress you've made towards full AGI.
Because if you can come up with some line, then you can see where that line intersects with AGI and where that would happen on the x-axis.
And so people have proposed, oh, it's like the education level.
Like we had a high schooler and then they went to college with RL and they're going to get a PhD.
I don't like that one.
Or then they'll propose horizon length, so maybe they can do tasks that take a minute, they can do those autonomously, then they can autonomously do tasks that take a human an hour, a human a week, etc.
How do you think about what is the relevant y-axis here?
How should we think about how AI is making progress?
I wonder with radiologists,
I'm totally speculating.
I have no idea what the actual workflow of a radiologist involves.
But one analogy that might be applicable is when Waymos were first being rolled out, there'd be a person sitting in the front seat, and you just had to have them there to make sure that if something went really wrong, they were there to monitor.
And I think even today, people are still watching to make sure things are going well.
Robotaxi, which was just deployed, actually still has a person inside it.
And we could be in a similar situation where
If you automate 99% of a job, that last 1% the human has to do is incredibly valuable because it's bottlenecking everything else.
And if it was the case with radiologists where the person sitting in the front of the Uber or the front of the Waymo has to be specially trained for years in order to be able to provide the last 1%, their wages should go up tremendously because they're the one thing bottlenecking wide deployment.
So radiologists, I think their wages have gone up for similar reasons.
If you're the last bottleneck, you're like...
And you're not fungible, which, like, you know, a Waymo driver might be fungible with other things.
So you might see this thing where, like, your wages go, like, whoop, and then until you get to 90%, and then, like, just like that.
And then the last 1% is gone.
And I wonder if we're seeing similar things with radiology or salaries of call center workers or anything like that.
I think there's been evidence that that's already been happening generally in companies that have been adopting AI, which I think is quite surprising.
And I also find what was really surprising, okay, AGI, right?
Like a thing which would do everything and, okay, we'll take out physical work.
It's a thing which should be able to do all knowledge work.
And what you would have naively anticipated is that the way this progression would happen is you would take a little task that a consultant is doing, and you take that out of the bucket.
You take a little task that an accountant is doing, you take that out of the bucket.
And then you're just doing this across all knowledge work.
But instead, if we do believe we're on the path of AGI with the current paradigm, the progression is very much not like that.
At least...
It just does not seem like consultants and accountants and whatever are getting huge productivity improvements.
It's very much that programmers are getting more and more chunks of their work taken off their plate.
If you look at the revenues of these companies, discounting just normal chat revenue, which I think is, I don't know, similar to Google or something, and just looking at API revenues, it's dominated by coding, right?
So this thing which is general, quote unquote, which should be able to do any knowledge work, is just overwhelmingly doing only coding.
And it's a surprising way for the quote-unquote AGI to be deployed, not the way you would expect.
I actually, I'm not sure if that alone explains it because...
I personally have tried to get LLMs to be useful in domains which are just pure language in, language out.
Like rewriting transcripts, like coming up with clips based on transcripts, etc.
And you might say, well, it's very plausible that I didn't do every single possible thing I could do.
I put a bunch of good examples in context, but maybe I should have done some kind of fine-tuning, whatever.
So our mutual friend, Andy Matuschak, told me that
he actually tried 50 billion things to try to get models to be good at writing spaced repetition prompts.
Again, very much language in, language out tasks, the kind of thing that should be dead center in the repertoire of these LLMs.
And he tried in-context learning, obviously, with a few short examples.
He tried, I think, he told me a bunch of things, like supervised fine-tuning and retrieval, whatever.
And he just could not get them to make cards to his satisfaction.
So I find it striking that even in language-in, language-out domains, it's actually very hard to get a lot of economic value out of these models, separate from coding.
And I don't know what explains it.
How do you think about superintelligence?
Do you expect it to feel qualitatively different from normal humans or human companies, roughly speaking?
I guess automation includes the things humans can already do, and superintelligence surpasses what humans can do.
Yeah.
But I guess, maybe less abstractly and more qualitatively, do you expect that, because this thing can either think so fast, or has so many copies, or the copies can merge back into themselves, or is quote-unquote much smarter, any number of advantages an AI might have,
the civilization in which these AIs exist will just feel qualitatively different from human civilization?
Let me probe on that a bit.
It's not clear to me that loss of control and loss of understanding are the same things.
A board of directors at, like, whatever, TSMC, Intel, name a random company, they're just, like, prestigious 80-year-olds.
They have very little understanding.
And maybe they don't practically actually have control.
But...
Or, actually, maybe a better example is the president of the United States.
The president has a lot of fucking power.
I'm not trying to make a statement about the current occupant, but maybe I am.
But, like, the actual level of understanding is very different from the level of control.
How come?
I mean, the loss of understanding is obvious, but why a loss of control?
It is not the fact that they are smarter than us that is resulting in a loss of control.
It is the fact that they are competing with each other and whatever arises out of that competition that leads to the loss of control.
Yeah, yeah.
This is a question I should have asked earlier.
So we were talking about how currently it feels like when you're doing AI engineering or AI research, these models are more like in the category of compiler rather than in the category of a replacement.
At some point, if you have quote-unquote AGI, it should be able to do what you do.
And do you feel like having a million copies of you in parallel results in some huge speed-up of AI progress?
Basically, if that does happen, do you expect to see an intelligence explosion?
Or even once we have a true AGI, I'm not talking about LLMs today, but real AGI.
You think it's continuous with this hyper-exponential trend?
Are you saying that what will happen is, so if you look at the trend before the Industrial Revolution to currently, you have a hyper-exponential where you go from 0% growth to then 10,000 years ago, 0.02% growth, and then currently we're at 2% growth.
So that's a hyper-exponential, and you're saying if you're charting AI on there, then it's like AI takes you to 20% growth or 200% growth.
Mm-hmm.
Or you could be saying, if you look at the last 300 years, what you've been seeing is you have technology after technology, computers, electrification, steam, steam engines, railways, et cetera.
But the rate of growth is the exact same.
It's 2%.
So are you saying the rate of growth will... No, I basically, I expect the rate of growth has also stayed roughly constant, right?
For only the last 200, 300 years.
But over the course of human history, it's like exploded, right?
It's like gone from like 0% basically to like faster, faster, faster, industrial explosion, 2%.
But just to clarify, you're saying that the rate of growth will not change.
Like, you know, the intelligence explosion will show up as like... It just enabled us to continue staying on the 2% growth trajectory just as the internet helped us stay on the 2% growth trajectory.
Yeah.
I mean, just to throw the opposite argument against you...
My expectation is that it like blows up because I think true AGI, and I'm not talking about LLM coding bots, I'm talking about like actual, this is like a replacement of a human in a server, is qualitatively different from these other productivity improving technologies.
Because it's labor itself, right?
I think we live in a very labor-constrained world.
If you talk to any startup founder or any person, you can just be like, okay, what do you need more of?
You just need really talented people.
And if you just have billions of extra people who are inventing stuff, integrating themselves, making companies from start to finish, that feels qualitatively different from just...
A single technology.
It's sort of like just asking if you get 10 billion extra people on the planet.
Yeah.
I guess you have a machine which is spitting out more things like that at potentially a faster pace.
And so we historically have examples of the growth regime changing where, like, you went from, you know, 0.2% growth to 2% growth.
So it seems very plausible to me that, like, a machine which is then spitting out the next self-driving car and the next internet and whatever.
I think what often ends up being misleading in these conversations is people, I don't like to use the word intelligence in this context, because intelligence implies you think like, oh, a super intelligence will be sitting, there'll be a single super intelligence sitting in a server and it'll like divine how to come up with new technologies and inventions that causes this explosion.
And that's not what I'm imagining when I'm imagining 20% growth.
I'm imagining that there's billions of, you know, basically like very smart human-like minds potentially, or that's all that's required.
But the fact that there's hundreds of millions of them, billions of them, each individually growing,
making new products, figuring out how to integrate themselves into the economy.
Just the way if like a highly experienced smart immigrant came to the country, you wouldn't need to like figure out how we integrate them in the economy.
They figure it out.
They could start a company.
They could like make inventions, you know, or like just increase productivity in the world.
And we have examples even in the current regime of places that have had 10, 20% economic growth.
You know, if you just have a lot of people, and less capital in comparison to the people, you can have Hong Kong or Shenzhen or whatever, which just had decades of 10% plus growth.
And I think there's a lot of really smart people who are ready to make use of the resources and do this period of catch up because we've had this discontinuity.
And I think, yeah, it might be similar.
I mean, the Industrial Revolution is such a jump, right?
You went from 0.2% growth to 2% growth.
I'm just saying you'll see another jump like that.
I actually don't think... I mean, the crucial thing about the industrial revolution was that it was not magical, right?
Like, if you just zoomed in, what you would see in 1770 or 1870...
is not that there was some key invention.
Yeah, exactly.
But at the same time, you did move the economy to a regime where the progress was much faster and the exponential 10x'd.
And I expect a similar thing from AI, where it's not like there's going to be a single moment where...
We made the crucial invention.
Yeah.
And I mean, maybe one way to think about it is through history, a lot of growth, I mean, growth comes because people come up with ideas and then people are like out there doing stuff to execute those ideas and make valuable output.
And through most of this time, population is exploding.
That has been driving growth.
For the last 50 years, people have argued that growth has stagnated.
Population in frontier countries has also stagnated.
I think we go back to the hyper-exponential growth in population and output.
Sorry, exponential growth in population that causes hyper-exponential growth in output.
So we just got access to Google's Veo 3.1, and it's been really cool to play around with.
The first thing we did was run a bunch of prompts through both Veo 3 and Veo 3.1 to see what's changed in the new version.
So here's Veo 3.
Hi, I'm Max, and I got stuck in a local minimum again.
And here's Veo 3.1.
Hi, I'm Max, and I got stuck in a local minimum again.
3.1's output is just consistently more coherent, and the audio is noticeably higher quality.
We've been using Veo for a while now, actually.
We released an essay earlier this year about AI firms fully animated by Veo 2, and it's been amazing to see how fast these models are improving.
This update makes Veo even more useful in terms of animating our ideas and our explainers.
You can try Veo right now in the Gemini app.
with pro and ultra subscriptions.
You can also access it through the Gemini API or through Google Flow.
You recommended Nick Lane's book to me, and then on that basis, I also found it super interesting, and I interviewed him.
And so I actually have some questions about sort of thinking about intelligence in evolutionary history.
Now that, over the last 20 years of doing AI research, you maybe have a more tangible sense of what intelligence is and what it takes to develop it.
Are you more or less surprised as a result that evolution just sort of spontaneously stumbled upon it?
Okay, so there's actually a couple of interesting follow-ups.
If you buy the Sutton perspective that actually the crux of intelligence is animal intelligence.
The quote he said is, if you got to the squirrel, you'd be most of the way to AGI.
Then we got to squirrel intelligence, I guess, right after the Cambrian explosion 600 million years ago.
It seems like what instigated that was the oxygenation event 600 million years ago.
But immediately, the sort of like intelligence algorithm was there to like make the squirrel intelligence, right?
So...
It's suggestive that animal intelligence was like that.
As soon as you had the oxygen in the environment and you had the eukaryotes, you could just get the algorithm.
Maybe it was sort of an accident that evolution stumbled upon it so fast, but I don't know if that suggests it's actually, in the end, going to be quite simple.
A former guest, Gwern, and also Carl Shulman, have made a really interesting point about that, which is that the scalable algorithm which humans have and primates have arose in birds as well, and maybe other times as well.
But humans found an evolutionary niche which rewarded marginal increases in intelligence.
And also had a scalable brain algorithm that could achieve those increases in intelligence.
And so, for example, if a bird had a bigger brain, it would just, like, collapse out of the air.
So it's very smart for the size of its brain, but it's, like, it's not in a niche which rewards the brain getting bigger.
Yeah.
Maybe similar with some really smart— Like dolphins, et cetera.
Exactly, yeah.
Whereas humans, you know, like we have hands that like reward being able to learn how to do tool use, we can externalize digestion, more energy to the brain, and that kicks off the flywheel.
The way Byrne put it is that the reason it was so hard is that it's a very tight line between something being so important to learn that it's worth just distilling the exact right circuits directly into your DNA, versus it not being important enough to learn at all.
It has to be something where you're incentivized to build the algorithm that does the learning within your lifetime.
So Quintin Pope had this interesting blog post where he's saying the reason he doesn't expect a sharp takeoff is...
So humans had the sharp takeoff where 60,000 years ago, we seem to have had the cognitive architectures that we have today.
And 10,000 years ago, agricultural revolution, modernity, dot, dot, dot.
What was happening in that 50,000 years?
Well, you had to build this sort of like cultural scaffold where you can accumulate knowledge over generations.
This is an ability that exists for free in the way we do AI training today.
Where if you retrain a model, it can still, I mean, in many cases, they're literally distilled, but they can be trained on each other.
You know, they can be trained on the same pre-training corpus.
They don't literally have to start from scratch.
So there's a sense in which the thing that took humans a long time, getting this cultural loop going, just comes for free with the way we do LLM training.
When would you expect that kind of thing to start happening?
And more general question about like multi-agent systems and a sort of like independent AI civilization and culture.
And can you identify the key bottleneck that's preventing this kind of collaboration between LLMs?
Maybe like the way I would put it is...
Yeah.
So you've talked about how you were at Tesla leading self-driving from 2017 to 2022.
And then you firsthand saw this progress from, we went from cool demos to now thousands of cars out there actually autonomously doing drives.
Why did that take a decade?
Like what was happening through that time?
That's very interesting to hear you say that the sort of safety guarantees you need from software are actually not dissimilar to self-driving because what people will often say is that self-driving took so long because the cost of failure is so high.
Like a human makes a mistake on average every 400,000 miles or every seven years.
And if you had to release a coding agent that couldn't make a mistake for at least seven years, it would be much harder to deploy.
But I guess your point is that it would be like making a catastrophic coding mistake, like breaking some important system, every seven years.
And in fact, in terms of sort of wall clock time, it would be much less than seven years, because you're constantly outputting code, right?
So in terms of tokens it would be seven years, but in terms of wall clock time, it would be pretty soon.
There's another objection people make to that analogy, which is that with self-driving, what took a big fraction of that time was solving the problem of having basic perception that's robust and building representations and having a model that has some common sense so it can generalize to when it sees something that's slightly out of distribution.
If somebody's waving down the road this way, you don't need to train for it.
The thing will...
have some understanding of how to respond to something like that.
And these are things we're getting for free with LLMs or VLMs today.
So we don't have to solve these very basic representation problems.
And so now deploying AIs across different domains will sort of be like deploying a self-driving car with current models to a different city, which is hard, but not like a 10-year-long task.
You led self-driving for five years at Tesla.
Because one, the start is in the 1980s, not 10 years ago.
And then two, the end is not here yet.
I'm curious to bounce two other ways in which the analogy might be different.
And the reason I'm especially curious about this is because I think the question of how fast AI is deployed, how valuable it is when it's early on is like potentially the most important question in the world right now, right?
Like if you're trying to model what the year 2030 looks like, this is the question you want to have some understanding of.
So another thing you might think is, one, you have this latency requirement with self-driving where you have – I have no idea what the actual models are, but I assume like tens of millions of parameters or something, which is not the necessary constraint for knowledge work with LLMs.
Or maybe it might be with computer use and stuff.
But anyways, the other big one is – maybe more importantly, on this CapEx question –
Yes, there is additional cost to serving up an additional copy of a model, but the sort of OPEX of a session is quite low and you can amortize the cost of AI into the training run itself, depending on how inference scaling goes and stuff.
But it's certainly not as much as like building a whole new car to serve another instance of a model.
So it just, the economics of deploying more widely are much more favorable.
The latency requirements and the implications for model size.
Yeah.
Do you have any opinions on whether this implies, given the current AI build-out, which would 10x the amount of available compute in the world in a year or two, and maybe more than 100x it by the end of the decade, that if the use of AI ends up being lower than some people naively predict, we're overbuilding compute?
Or is that a separate question?
Yeah, that's right.
Let's talk about education and Eureka and stuff.
One thing you could do is start another AI lab and try to solve those problems.
Yeah, curious what you're up to now.
And then, yeah, why not AI research itself?
And so what are you working on there?
A category of questions I have for you is just explaining
how one teaches technical or scientific content well, because you are one of the world masters at it.
And then I'm curious both about how you think about it for content you've already put out there on YouTube, but also to the extent it's any different, how you think about it for Eureka.
But you are building it, right?
Mm-hmm.
To the extent you're willing to say it, what is the thing you hope will be released this year or next year?
Yeah, so you're imagining the short term that instead of a tutor being able to probe your understanding, if you have enough self-awareness to be able to probe yourself, you're never going to be stuck.
You can find the right answer between talking to the TA or talking to an LLM and looking at the reference implementation.
It sounds like...
Automation or AI is actually not as significant.
So far, the big alpha here is your ability to explain AI codified in the source material of the class, right?
That's fundamentally what the course is.
And so when you imagine what is available through Eureka in a couple of years, it seems like the big bottleneck is going to be finding Karpathys in field after field who can convert their understanding into these ramps, right?
But are you imagining that, like, people who have expertise in other fields are then contributing courses?
Or do you feel like it's actually quite essential to the vision that you, given your understanding of how you want to teach, are the one designing the content?
Like, I don't know, Sal Khan is, like, narrating all the videos of Khan Academy.
Are you imagining something like that?
Yeah.
Yeah, I think you're basically inventing college from first principles for the tools that are available today, and then just selecting for people who have the motivation and the interest to actually really engage with the material.
I mean, that sounds different from...
Using this, as opposed to AGI, basically as entertainment or as a form of self-betterment.
But it sounded like you had a vision also that this education is relevant to keeping humanity in control of AI.
I see.
And they sound different.
And I'm curious, is it like it's entertaining for some people, but then empowerment for some others?
How do you think about that?
Now that I'm understanding the vision,
That's very interesting.
Like, I think it actually has a perfect analog in gym culture.
I don't think 100 years ago anybody would be, like, ripped.
Like, nobody would have, you know, been able to just spontaneously bench two plates or three plates or something.
And it's actually very common now.
And you're... Because this idea of systematically training and lifting weights in the gym, or systematically training to be able to run a marathon, is a capability you would not spontaneously have, or most humans would not have.
And you're imagining similar things for...
learning across many different domains much more intensely, deeply, faster.
Yeah, exactly.
I guess it's still a world in which that is not enabling us to...
It's like the Culture world, right?
Like you're not fundamentally going to be able to like transform the trajectory of technology or influence decisions by your own labor or cognition alone.
Maybe you can influence decisions because the AI is asking for your approval, but it's not because I've invented something or come up with a new design that I'm really influencing the future.
Yeah.
I love this vision.
I also... It's like... I feel like the person you have most product market fit with is me, because my job involves having to learn different subjects every week.
And I am very excited if you can...
Yeah.
I think you also made a point that was subtle, so just to spell it out.
I think the question with online courses so far is, why haven't they already enabled every single human to know everything?
And I think they're just so motivation-laden, because there are no obvious on-ramps, and it's so easy to get stuck.
And if you instead had this thing, basically like a really good human tutor, it would just be such an unlock from a motivation perspective.
Yeah, I think so.
Can I ask some questions about teaching well?
If you had to give advice to another educator in another field that you're curious about to make the kinds of YouTube tutorials you've made, maybe it might be especially interesting to talk about domains where you can't test somebody's technical understanding by having them code something up or something.
What advice would you give them?
It also just makes the learning experience so much more motivating.
Your tutorial on the Transformer begins with...
It's literally like a lookup table from here's the word right now, or here's the previous word, here's the next word, and it's literally just a lookup table.
Yeah, it's the essence of it, yeah.
I mean, it's such a brilliant way.
Like, okay, start with a lookup table and then go to a transformer, and then each piece is motivated.
Why would you add that?
Why would you add the next thing?
You couldn't memorize this sort of attention formula, but just like having an understanding of why every single piece is relevant, what problem it solves.
Because if you try to come up with it yourself, I guess you get a better understanding of, like, what is the action space and then what is the sort of, like, objective?
Then, like, why does only this action fulfill that objective, right?
Why do you think, by default, people who are genuine experts in their field are often bad at explaining it to somebody ramping up?
Another trick like that that just works astoundingly well.
If somebody writes a paper or a blog post or an announcement, it is in 100% of cases true that just the narration or the transcription of how they would explain it to you over lunch
is way more not only understandable, but actually also more accurate and scientific in the sense that people have a bias to explain things in the most abstract, jargon-filled way possible and to clear their throat for four paragraphs before they explain the central idea.
Yeah.
But there's something about communicating one-on-one with a person which compels you to just say the thing.
Right.
Exactly.
This is coming from the perspective of how somebody who's trying to explain an idea should formulate it better.
What is your advice, as a student, to other students?
If you don't have a Karpathy doing the exposition of an idea, if you're reading a paper from somebody or reading a book, what strategies do you employ to learn material you're interested in, in fields you're not an expert in?
Oh, yeah.
I think that's an excellent note to close on.
Yeah.
Andrej, that was great.
Yeah, thank you.
Thanks.
Hey, everybody.
I hope you enjoyed that episode.
If you did, the most helpful thing you can do is just share it with other people who you think might enjoy it.
It's also helpful if you leave a rating or a comment on whatever platform you're listening on.
If you're interested in sponsoring the podcast, you can reach out at dwarkesh.com slash advertise.
Otherwise, I'll see you in the next one.
Well, it's not saying that you just want to throw as much compute as you possibly can at the problem.
The Bitter Lesson says that you want to come up with techniques which most effectively and scalably leverage compute.
Most of the compute that's spent on an LLM is used in running it during deployment.
And yet it's not learning anything during this entire period.
It's only learning during this special phase that we call training.
And so this is obviously not an effective use of compute.
And what's even worse is that this training period by itself is highly inefficient because these models are usually trained on the equivalent of tens of thousands of years of human experience.
And what's more, during this training phase,
all of their learning is coming straight from human data.
Now, this is an obvious point in the case of pre-training data, but it's even kind of true for the RLVR that we do with these LLMs.
These RL environments are human-furnished playgrounds to teach LLMs the specific skills that we have prescribed for them.
The agent is in no substantial way learning from organic and self-directed engagement with the world.
Having to learn only from human data, which is an inelastic and hard to scale resource, is not a scalable way to use compute.
Furthermore, what these LLMs learn from training is not a true world model, which would tell you how the environment changes in response to different actions that you take.
Rather, they're building a model of what a human would say next.
And this leads them to rely on human-derived concepts.
A way to think about this would be, suppose you trained an LLM on all the data up to the year 1900.
That LLM probably wouldn't be able to come up with relativity from scratch.
And maybe here's a more fundamental reason to think this whole paradigm will eventually be superseded.
LLMs aren't capable of learning on the job, so we'll need some new architecture to enable this kind of continual learning.
And once we do have this architecture, we won't need a special training phase.
The agents will just be able to learn on the fly, like all humans, and in fact, like all animals are able to do.
And this new paradigm will render our current approach with LLMs, and their special training phase that's super sample-inefficient, totally obsolete.
So that's my understanding of Rich's position.
My main difference with Rich is just that I don't think the concepts he's using to distinguish LLMs from true intelligence or animal intelligence are actually that mutually exclusive or dichotomous.
For example, I think imitation learning is continuous with and complementary to RL.
And relatedly, models of humans can give you a prior which facilitates learning quote-unquote true world models.
I also wouldn't be surprised if some future version of test-time fine-tuning could replicate continual learning, given that we've already managed to accomplish this somewhat with in-context learning.
So let's start with my claim that imitation learning is continuous with and complementary to RL.
So I tried to ask Richard a couple of times whether pre-trained LLMs can serve as a good prior on which we can accumulate the experiential learning, aka do the RL, which would lead to AGI.
So Ilya Sutskever gave a talk a couple months ago that I thought was super interesting, and he compared pre-training data to fossil fuels.
And I think this analogy actually has remarkable reach.
Just because fossil fuels are not a renewable resource does not mean that our civilization ended up on a dead-end track by using them.
In fact, they were absolutely crucial.
You simply couldn't have transitioned directly from the water wheels of 1800 to solar panels and fusion power plants.
We had to use this cheap, convenient, and plentiful intermediary to get to the next step.
AlphaGo, which was conditioned on human games, and AlphaZero, which was bootstrapped from scratch, were both superhuman Go players.
Now, of course, AlphaZero was better.
So you can ask the question, will we or will the first AGIs eventually come up with a general learning technique that requires no initialization of knowledge and that just bootstraps itself from the very start?
And will it outperform the very best AIs that have been trained up to that date?
I think the answer to both these questions is probably yes.
But does this mean that imitation learning must not play any role whatsoever in developing the first AGI or even the first ASI?
AlphaGo is still superhuman despite being initially shepherded by human player data.
The human data isn't necessarily actively detrimental.
It's just that at enough scale, it isn't significantly helpful.
AlphaZero also uses much more compute than AlphaGo.
The accumulation of knowledge over tens of thousands of years has clearly been essential to humanity's success.
In any field of knowledge, thousands and probably actually millions of previous people were involved in building up our understanding and passing it on to the next generation.
We obviously didn't invent the language we speak, nor the legal system we use.
Also, even most of the technologies in our phone were not directly invented by the people who are alive today.
This process is more analogous to imitation learning than it is to RL from scratch.
Now, of course, are we literally predicting the next token like an LLM would in order to do this cultural learning?
No, of course not.
So even the imitation learning that humans are doing is not like the supervised learning that we do for pre-training LLMs.
But neither are we running around trying to collect some well-defined scalar reward.
No ML learning regime perfectly describes human learning or animal learning.
We're doing things which are both analogous to RL and to supervised learning.
What planes are to birds, supervised learning might end up being to human cultural learning.
I also don't think these learning techniques are actually categorically different.
Imitation learning is just short horizon RL.
The episode is a token long.
The LLM is making a conjecture about the next token based on its understanding of the world and how the different pieces of information in the sequence relate to each other.
And it receives reward in proportion to how well it predicted the next token.
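To make that analogy concrete, here is a toy sketch, entirely my own illustration rather than anyone's actual training code, of next-token prediction viewed as a one-step RL episode next to the usual supervised loss:

```python
import torch
import torch.nn.functional as F

# Toy illustration of "imitation learning as short-horizon RL". The policy is a
# single linear layer over a tiny vocabulary; everything here is hypothetical.
vocab_size, hidden = 8, 16
policy = torch.nn.Linear(hidden, vocab_size)
state = torch.randn(1, hidden)     # stand-in for the context representation
target = torch.tensor([3])         # the corpus's actual next token

logits = policy(state)

# Supervised view: maximize the log-probability of the corpus token.
supervised_loss = F.cross_entropy(logits, target)

# One-step RL view (REINFORCE): sample a token as the "action", receive reward 1
# if it matches the corpus token, and up-weight the log-probability of that action.
dist = torch.distributions.Categorical(logits=logits)
action = dist.sample()
reward = (action == target).float()
rl_loss = -(reward * dist.log_prob(action)).mean()

# In expectation both objectives push probability mass toward the corpus token;
# the "episode" is one token long and the reward arrives every token rather than
# once at the end of a long trajectory.
```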
Now, of course, I already hear people saying, no, no, that's not the ground truth.
It's just learning what a human was likely to say.
But there's a different question, which I think is actually more relevant to understanding the scalability of these models.
And that question is, can we leverage this imitation learning to help models learn better from ground truth?
And I think the answer is obviously yes.
After RL'ing these pre-trained base models, we've gotten them to win gold in International Math Olympiad competitions and to code up entire working applications from scratch.
Now, these are ground truth examinations.
Can you solve this unseen Math Olympiad question?
Can you build this application to match the specific feature requests?
But you couldn't have RL'd a model to accomplish these tasks from scratch, or at least we don't know how to do that yet.
You needed a reasonable prior over human data in order to kickstart this RL process.
Whether you want to call this prior a proper world model or just a model of humans, I don't think is that important.
It honestly seems like a semantic debate.
Because what you really care about is whether this model of humans helps you start learning from ground truth, aka become a true world model.
It's a bit like saying to somebody pasteurizing milk, hey, you should stop boiling that milk because eventually you want to serve it cold.
Of course, but this is an intermediate step to facilitate the final output.
By the way, LLMs are clearly developing a deep representation of the world because their training process is incentivizing them to develop one.
I use LLMs to teach me about everything from biology to AI to history, and they are able to do so with remarkable flexibility and coherence.
Now, are LLMs specifically trained to model how their actions will affect the world?
No, they are not.
But if we're not allowed to call their representations a world model,
then we're defining the term world model by the process that we think is necessary to build one, rather than the obvious capabilities that this concept implies.
Okay, continual learning.
I'm sorry to bring up my hobby horse again.
I'm like a comedian who has only come up with one good bit, but I'm going to milk it for all it's worth.
An LLM that's being RL'd on outcome-based rewards learns on the order of one bit per episode, and an episode might be tens of thousands of tokens long.
Now, obviously, animals and humans are clearly extracting more information from interacting with our environment than just the reward signal at the end of an episode.
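To put rough numbers on that comparison: the episode length below is an assumption in the "tens of thousands of tokens" range, and the other figures are the ones quoted earlier in this conversation.

```python
# Rough information-per-token comparison across the three regimes discussed.
episode_tokens = 20_000                        # assumed length of an outcome-reward episode
bits_per_token_rlvr = 1 / episode_tokens       # ~0.00005 bits per token

bits_per_token_pretrain = 0.075                # the Llama 3 estimate from earlier (quoted as ~0.07)
bits_per_token_in_context = 320 * 1024 * 8     # KV-cache growth per token of context, in bits

print(bits_per_token_pretrain / bits_per_token_rlvr)        # pre-training: ~1,500x denser than outcome RL
print(bits_per_token_in_context / bits_per_token_pretrain)  # in-context: ~35 million x denser still
```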
Conceptually, how should we think about what is happening with animals?
I think we're learning to model the world through observations.
This outer loop RL is incentivizing some other learning system to pick up maximum signal from the environment.
In Richard's OaK architecture, he calls this the transition model.
And if we were trying to pigeonhole this feature into modern LLMs, what you'd do is fine-tune on all your observed tokens.
From what I hear from my researcher friends, in practice, the most naive way of doing this actually doesn't work very well.
Now, being able to learn from the environment in a high throughput way is obviously necessary for true AGI.
And it clearly doesn't exist with LLMs trained on RLVR.
But there might be some other relatively straightforward ways to shoehorn continual learning atop LLMs.
For example, one could imagine making supervised fine tuning a tool call for the model.
So the outer loop RL is incentivizing the model to teach itself effectively using supervised learning in order to solve problems that don't fit in the context window.
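Here's one hypothetical shape that could take, sketched with stub functions; none of these names correspond to a real API, no actual training happens, and this is only to make the SFT-as-a-tool-call idea concrete:

```python
import random

# Hypothetical sketch only: the "model" can call a fine-tune tool during an
# episode, and the outer-loop RL rewards it solely for final task success.

def model_act(weights, observation):
    # Stub policy: either store what it just saw, or move on.
    return random.choice(["CALL fine_tune_on(observation)", "CONTINUE"])

def fine_tune_on(weights, observation):
    # Stand-in for supervised fine-tuning on self-selected data; a real system
    # would update model parameters on `observation` instead of appending it.
    return weights + [observation]

def task_success(weights):
    # Stand-in for a ground-truth check: one bit of reward at episode's end.
    return 1.0 if len(weights) >= 3 else 0.0

def run_episode(task_observations):
    weights = []                                   # stand-in for model parameters
    for obs in task_observations:
        if model_act(weights, obs).startswith("CALL"):
            weights = fine_tune_on(weights, obs)   # the model "teaches itself"
    return task_success(weights)                   # the outer-loop RL signal

print("episode reward:", run_episode(["doc_1", "doc_2", "doc_3", "doc_4"]))
```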
Now, I'm genuinely agnostic about how well techniques like this will work.
I'm not an AI researcher.
but I wouldn't be surprised if they basically replicate continual learning.
And the reason is that models are already demonstrating something resembling human continual learning within their context windows.
The fact that in-context learning emerged spontaneously from the training incentive to process long sequences makes me think that if information could just flow across windows longer than the context limit, then models could meta-learn the same flexibility that they already show in context.
Okay, some concluding thoughts.
Evolution does meta-RL to make an RL agent, and that agent can selectively do imitation learning.
With LLMs, we're going the opposite way.
We have first made this base model that does pure imitation learning, and then we're hoping that we do enough RL on it to make a coherent agent with goals and self-awareness.
Maybe this won't work.
But I don't think these super first-principles arguments about, for example, how these LLMs don't have a true world model are actually proving much.
And I also don't think they're strictly accurate for the models we have today, which are actually undergoing a lot of RL on ground truth.
Even if Sutton's platonic ideal doesn't end up being the path to the first AGI,
his first-principles critique is identifying some genuine basic gaps that these models have.
And we don't even notice them because they're so pervasive in the current paradigm, but because he has this decades-long perspective, they're obvious to him.
It's the lack of continual learning.
It's the abysmal sample efficiency of these models.
It's their dependence on exhaustible human data.
If the LLMs do get to AGI first, which is what I expect to happen, the successor systems that they build will almost certainly be based on Richard's vision.
Today, I'm chatting with Richard Sutton, who is one of the founding fathers of reinforcement learning and inventor of many of the main techniques used there, like TD learning and policy gradient methods.
And for that, he received this year's Turing Award, which, if you don't know, is basically the Nobel Prize for Computer Science.
Richard, congratulations.
Thank you, Dwarkesh.
And thanks for coming on the podcast.
It's my pleasure.
Okay, so first question is,
My audience and I are familiar with the LLM way of thinking about AI.
Conceptually, what are we missing in terms of thinking about AI from the RL perspective?
Huh.
I guess you would think that to emulate the trillions of tokens in the corpus of internet text, you would have to build a world model.
In fact, these models do seem to have very robust world models, and they're the best world models we've made to date in AI, right?
So what do you think that's missing?
Great.
Yeah.
Right.
I guess maybe the crux, and I'm curious if you disagree with this, is some people will say, okay, so...
This imitation learning has given us a good prior, given these models a good prior of reasonable ways to approach problems.
And as we move towards the era of experience, as you call it, this prior is going to be the basis on which we teach these models from experience because this gives them the opportunity to get answers right some of the time.
And then on this, you can build, you can train them on experience.
Do you agree with that perspective?
I mean, I think they do.
You can literally ask them, what would you anticipate a user might say in response?
And they have a prediction.
Yeah.
Yeah.
So I think a capability like this does exist in context.
So it's interesting to watch a model do chain of thought, and then suppose it's trying to solve a math problem.
It'll say, okay, I'm going to approach this problem using this approach at first, and it'll write this out and be like, oh, wait, I just realized this is the wrong conceptual way to approach the problem.
I'm going to restart by this another approach.
And that flexibility does exist in context, right?
Do you have something else in mind, or do you just think that you need to extend this capability across longer horizons?
Isn't that literally what next token prediction is?
Prediction of what was next and then updating on the surprise?
Next token is what they should say, what the action should be.
Oh, yeah.
It's not a goal about the external world.
I guess maybe the bigger question I want to understand is why you don't think doing RL on top of LLMs is a productive direction.
Because we seem to be able to give these models the goal of solving difficult math problems.
And they're in many ways at the very peaks of human level in the capacity to solve Math Olympiad-type problems, right?
They got gold at IMO.
So it seems like the model which got gold at the International Math Olympiad does have the goal of solving math problems, right?
So why can't we extend this to different domains?
Right.
So, I mean, it's interesting because you wrote this essay in 2019 titled The Bitter Lesson, and this is the most influential essay perhaps in the history of AI, but people have used that as a justification for,
for scaling up LLMs, because in their view, this is the one scalable way we have found to pour ungodly amounts of compute into learning about the world.
And so it's interesting that your perspective is that the LLMs are actually not bitter-lesson-pilled.
I guess that doesn't seem like the crux to me because I think those people would also agree that the overwhelming amount of compute in the future will come from learning from experience.
They just think that the scaffold or the basis of that,
the thing you'll start with in order to pour in the compute to do this future experiential learning or on-the-job learning, will be LLMs.
And so I guess I still don't understand why this is the wrong starting point altogether.
Why we need a whole new architecture to begin doing experiential continual learning and why we can't start with LLMs to do that.
Maybe it's interesting to compare this to humans.
So in both the case of learning from imitation versus experience and on the question of goals, I think there's some interesting analogies.
So, you know, kids will initially learn from imitation.
You don't think so?
No, of course not.
Really?
Yeah.
I think kids just like watch people.
They like kind of try to like say the same words.
I think the level- What about the first six months?
I think they're kind of imitating things.
They're trying to like make their mouth sound the way they see their mother's mouth sound.
And then they'll say the same words without understanding what they mean.
And as they get older, the complexity of the imitation they do increases.
So you're imitating maybe the skills that your people in your band are using to hunt down the deer or something.
And then you go into the learning-from-experience RL regime.
But I think there's a lot of imitation learning happening with humans.
What the infant actually does, there are no targets for that, there are no examples for that. I agree that doesn't explain everything infants do, but I think it guides the learning process. I mean, even an LLM, when it's trying to predict the next token early in training, will make a guess, and it'll be different from what it actually sees. In some sense that's like very short-horizon RL, where it's making this guess, like, I think this token will be this, and it's actually the other thing, similar to how a kid will try to say a word and it comes out wrong.
I think this is maybe more of a semantic distinction.
Like, what do you call school?
Is that not training data?
You're not going to school because it's like... School is much later.
You shouldn't base your theories on that.
But the idea of having phases of learning where... I think it's just sort of programmed into your biology that early on you're not that useful.
And then kind of why you exist is to...
understand the world and learn how to interact with it. And it seems kind of like a training phase. I agree that it's more gradual, that there's not a sharp cutoff from training to deployment, but there seems to be this initial training phase, right? There's nothing where you have training of what you should do. You see things that happen; you're not told what to do.
I mean, you're like literally taught what to do.
This is like where the word training comes from, is from humans, right?
So I interviewed this psychologist and anthropologist, Joseph Henrich, who has done work about cultural evolution and basically what distinguishes humans and how do humans pick up knowledge.
I mean, we're trying to replicate intelligence, right?
So if you want to understand what it is that enables humans to go to the moon or to build semiconductors, I think the thing we want to understand is what makes that possible, since no animal can go to the moon or make semiconductors.
So we want to understand what makes humans special.
Yeah, I think that argument is useful.
But I do want to complete this thought.
So Joseph Henrich has this interesting theory about
a lot of the skills that humans have had to master in order to be successful.
And we're not talking about, you know, the last thousand years or the last ten thousand years, but hundreds of thousands of years.
You know, the world is really complicated and it's not possible to reason through how to, let's say, hunt a seal if you're living in the Arctic.
And so there's this many, many step long process of how to make the bait and how to find the seal and then how to process the food in a way that makes sure you won't get poisoned.
And it's not possible to reason through all of that.
And so over time, yes, there's this like larger process of
whatever analogy you want to use, maybe RL or something else, where culture as a whole has figured out how to find and kill and eat seals.
But then what is happening when this knowledge is transmitted through generations is, in his view, that you just have to imitate your elders in order to learn that skill, because you can't think your way through how to hunt and kill and process a seal.
You have to just watch other people maybe make tweaks and adjustments.
And that's how cultural knowledge accumulates.
But the initial step of the cultural gain has to be imitation.
But maybe you think about it a different way.
I do think you make a very interesting point that continual learning is a capability that most mammals have.
I guess all mammals have.
So it's quite interesting that we have something that all mammals have, but our AI systems don't have, right?
Whereas maybe like the ability to understand math and solve difficult math problems depends on how you define math.
But like this is a capability our AIs have, but that almost no animal has.
And so it's quite interesting what ends up being difficult and what ends up being easy.
Perilics.
That's right.
For the era of experience to commence, we're going to need to train AIs in complex real-world environments.
But building effective RL environments is hard.
You can't just hire a software engineer and have them write a bunch of cookie-cutter validation tests.
Real-world domains are messy.
You need deep subject matter experts to get the data, the workflows, and all the subtle rules right.
When one of Labelbox's customers wanted to train an agent to shop online, Labelbox assembled a team with a ton of experience engineering internet storefronts.
For example, the team built a product catalog that could be updated during the episode because most shopping sites have constantly changing state.
They also added a Redis cache to simulate stale data since that's how real e-commerce sites actually work.
These are the kinds of things that you might not have naively thought to do, but that Labelbox can anticipate.
These details really matter.
Small tweaks are often the difference between cool demos and agents that can actually operate in the real world.
So whether it's correcting traces that you already produced or building an entirely new suite of environments, Labelbox can help you turn your RL projects into working systems.
Reach out at labelbox.com slash dwarkesh.
All right, back to Richard.
This alternative paradigm that you're imagining.
The experiential paradigm.
Yes.
Yeah.
I guess maybe what I meant to say is human level general continual learning agent.
Yeah.
What is the reward function?
Is it just predicting the world?
Is it then having a specific effect on it?
What would the general reward function be?
I see.
I guess this AI would be deployed...
to lots of people, who would want it to be doing lots of different kinds of things.
So it's performing the task people want, but at the same time, it's learning about the world from doing that task.
And do you imagine, okay, so we get rid of this paradigm where there's training periods and then there's deployment periods.
But then do we also get rid of this paradigm when there's the model and then instances of the model or copies of the model that are doing certain things?
How do you think about the fact that we'd want this thing to be doing different things?
We'd want to aggregate the knowledge that it's gaining from doing those different things.
I agree that the kind of thing you're talking about is necessary regardless of whether you start from LLMs or not, right?
If you want human or animal level intelligence, you're going to need this capability.
Suppose a human is trying to make a startup, right?
And this is a thing which has a reward on the order of 10 years.
Once in 10 years, you might have an exit where you get paid out a billion dollars.
But humans have this ability to make intermediate auxiliary rewards, or some way of, even when they have extremely sparse rewards, still making intermediate steps, having an understanding of how the next thing they're doing leads to this grander goal we have.
And so how do you imagine such a process might play out with AIs?
Right. And then you also want some ability to retain the information that you're learning. I mean, one of the things that makes humans quite different from these LLMs is that if you're onboarding on a job, you're picking up so much context and information, and that's what makes you useful at the job, right? Everything from your clients' preferences to how the company works.
And is the bandwidth of information that you get from a procedure like TD learning high enough to have this huge pipe of context and tacit knowledge that you'd need to be picking up in the way humans do when they're just deployed?
Yeah.
So it seems to me you need two things.
One is some way of converting this long-run goal reward into smaller auxiliary rewards, you know, these predictive rewards of the future reward, or at least the final reward.
Then you need some other way.
Initially, it seems to me you need some way of saying, OK,
I need to hold on to all this context that I'm gaining as I'm working in the world, right?
I'm like learning about my clients, my company, all this information.
Yeah.
Yeah.
So...
The question I'm trying to ask is, how many bits per second are you picking up?
Like, how many is a human picking up when they're, you know, out in the world, right?
If you're just, like, interacting over Slack with your clients and everything.
Yeah.
So what is the learning process which helps you capture that information?
Yeah.
One of my friends, Toby Ward, pointed out that if you look at the MuZero models that Google DeepMind deployed to learn Atari games, that these models were not a general intelligence in themselves, but a general framework for training specialized intelligences.
to play specific games.
That is to say that you couldn't, using that framework, train a policy to play both chess and Go and some other game.
You had to train each one in a specialized way.
And he was wondering whether that implies that with reinforcement learning generally, because of this information constraint, you can only learn one thing at a time, since the density of information isn't that high, or whether it was just specific to the way that MuZero was done.
And if it's specific to AlphaZero, what needed to be changed about that approach so that it could be a general learning agent?
So maybe it would be useful to explain what was missing in that architecture or that approach, which this continual learning AGI would have.
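To illustrate the distinction, here's a toy contrast (not DeepMind's actual code; the training functions are stubs) between a framework that trains one specialized policy per game and a single agent whose experience in every game flows into the same weights:

```python
def init_policy():
    return {"weights": 0}                      # stub for a freshly initialized network

def train(policy, game):
    updated = dict(policy)
    updated["weights"] += hash(game) % 7       # stand-in for learning on that game
    return updated

games = ["chess", "go", "atari_breakout"]

# A "general framework for training specialized intelligences":
# one fresh policy per game, no transfer between them.
specialized = {game: train(init_policy(), game) for game in games}

# A general continual-learning agent would instead be a single policy whose
# experience in every game accumulates in the same weights; whether that
# transfer actually helps is exactly the open question being discussed.
general = init_policy()
for game in games:
    general = train(general, game)

print(specialized, general)
```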
Yeah, I guess I'm curious about...
Historically, have we seen the level of transfer using RL techniques that would be needed to build this kind of... Okay, good, good.
Let me paraphrase to make sure that I understood that correctly.
It sounds like you're saying that when we do have generalization in these models, that is a result of some sculpted... Humans did it.
Yeah.
I'm not trying to kickstart this initial crux again, but I'm just genuinely curious because I think I might be using the term differently.
I mean, one way to think about it is these LLMs are increasing the scope of generalization from earlier systems, which could not really even do a basic math problem, to now they can do anything in this class of Math Olympiad-type problems, right?
So you initially start with like they can generalize among addition problems, at least.
Then you generalize to like they can generalize among like problems that require use of different kinds of mathematical techniques and theorems and conceptual categories, which is like what the math Olympiad requires.
And so it sounds like you don't necessarily think of being able to solve any problem within that category
as an example of generalization?
Or let me know if I'm misunderstanding that.
My understanding is that this is working better and better with coding agents.
So engineers, obviously, if you're trying to program a library,
there are many different ways you could achieve the end spec.
And an initial frustration with these models has been that they'll do it in a way that's sloppy.
And then over time, they're getting better and better at coming up with the design architecture and the abstractions that developers find more satisfying.
And it seems an example of what you're talking about.
So to prep for this interview, I wanted to understand the full history of RL, starting with reinforce up to current techniques like GRPO.
And I didn't just want a list of equations and algorithms.
I wanted to really understand each change in this progression and the underlying motivation.
You know, what was the main problem that each successive method was actually trying to solve?
So I had Gemini Deep Research walk me through this entire timeline step by step.
It explained the last 20 years of gradual innovation and explained how each step made the RL learning process more stable or more sample efficient or more scalable.
I asked Deep Research to put all of this together like an Andrej Karpathy style tutorial.
And it did that.
What was cool is that it combined this whole lesson together into one coherent, cohesive document in the style that I wanted.
It was also great that it assembled all of the best links in the same place so that if I wanted to understand any specific algorithm better, I could just access the right explainer right there.
Go to gemini.google.com to try it out yourself.
All right, back to Richard.
I want to zoom out and ask about being in the field of AI for longer than almost anybody who is commentating on it or working in it now.
I'm just curious about what the biggest surprises have been, how much new stuff you feel like is coming out, or does it feel like people are just playing with old ideas?
Zooming out, you got into this even before deep learning was popular, so...
How do you see this trajectory of this field over time and how new ideas have come about and everything?
And what's been surprising?
Have there felt like whenever the public conception has been changed because some new technique was... Sorry, some new application was developed.
For example, when AlphaZero became this viral sensation, to you as somebody who has...
literally came up with many of the techniques that were used.
Did it feel to you like new breakthroughs were made or does it feel like, oh, we've had these techniques since the 90s and people are simply combining them and applying them now?
Okay.
Some sort of left-field questions for you, if you'll tolerate them.
So the way I read the bitter lesson is that it's not saying necessarily that human artisanal researcher tuning doesn't work, but that...
it obviously scales much worse than compute, which is growing exponentially.
And so you want techniques which leverage the latter.
And once we have AGI, we'll have researchers which scale linearly with compute, right?
So we'll have this avalanche of millions of AI researchers and their stock will be growing as fast as compute.
And so maybe this will mean that it is rational or it will make sense to have
them doing good old fashioned AI and doing these artisanal solutions.
I wonder if that, as a vision of what happens after AGI in terms of how AI research will evolve, is still compatible with the Bitter Lesson.
Well, how did we get to this AGI?
So suppose it started with general methods, but now we've got the AGI.
And now we want to go... But we're done.
Hmm?
We're done.
Interesting.
You don't think that there's anything above AGI?
Well, I'm using it to get superhuman levels of intelligence or competence at different tasks.
I guess there are different gradations of superhuman.
So maybe one way to motivate this is AlphaGo is superhuman.
It beat any Go player.
AlphaZero would beat AlphaGo every single time.
So there's ways to get more superhuman than even superhuman.
And it was a different architecture.
And so it seems plausible to me that, well, for the agent that's able to generally learn across all domains, there would be ways to give it a better architecture for learning, just the same way that AlphaZero was an improvement upon AlphaGo and MuZero was an improvement upon AlphaZero.
I agree that in that particular case, that it was moving to more general methods.
But I meant to use that example to illustrate that it's possible to go superhuman to superhuman plus plus to superhuman plus plus plus plus.
And I'm curious if you think those gradations will continue to happen by just making the method simpler.
Or because we'll have the capability of these millions of minds who can then add complexity as needed, that will continue to be a false path even when you have billions of AI researchers or trillions of AI researchers.
Yeah, for sure.
It's interesting that both quant firms and AI labs have a culture of secrecy because both of them are operating in incredibly competitive markets and their success rests on protecting their IP.
If you're an AI researcher or engineer and you're deciding where to work, most of the quant firms or AI labs that you'll be considering will be strongly siloing their teams to minimize the risk of leaks.
Hudson River Trading takes the opposite approach.
Their teams openly share their trading strategies, and their strategy code lives in a shared monorepo.
At HRT, if you're a researcher and you have a good idea, your contribution will be broadly deployed across all relevant strategies.
This gives your work a ton of leverage.
You'll also learn incredibly fast.
You can learn about other people's research and ask questions, and you can see how everything fits together end-to-end, from the low-level execution of trades to the high-level predictive models.
HRT is hiring.
If you want to learn more, go to HudsonRiverTrading.com slash Dwarkesh.
All right, back to Richard.
I guess this brings us to the topic of AI succession.
You have a perspective that's quite different from a lot of people that I've interviewed and maybe a lot of people generally.
So I also think it's a very interesting perspective.
I want to hear about it.
Yeah, I agree with all four of those arguments and the implication.
And I also agree that succession contains a wide variety of possible futures.
So curious to get more thoughts on that.
It's interesting to consider if we were just designing another generation of humans.
Yes.
Design is maybe the wrong word, but we knew a future generation of humans was going to come up.
And forget about AI.
We just know in the long run, humanity will be more capable and maybe more numerous, maybe more intelligent.
How do we feel about that?
I do think there's potential worlds with future humans that we would be quite concerned about.
That's what you're saying?
Something like that.
I'm basically taking the example you're giving of like, okay, even if you consider them part of humanity, I don't think that necessarily means that we should feel super comfortable.
Yeah.
Like Nazis were humans, right?
If we thought, like, oh, the future generation will be Nazis, I think we'd be quite concerned about just handing off humanity's power to them.
So I agree that this is not super dissimilar to worrying about more capable future humans.
But I don't think that that addresses a lot of the concerns people might have about this level of power being attained this fast with entities we don't fully understand.
Yeah, I guess there's different varieties of change.
The Industrial Revolution was change.
The Bolshevik Revolution was also change.
And if you were around in Russia in the 1900s and you're like, look, things aren't growing well.
The czar is kind of messing things up.
We need change.
I'd want to know what kind of change you wanted before signing on the dotted line, right?
And then similarly with AI, where I'd want to understand, and to the extent it's possible, to change the trajectory of AI such that the change is positive for humans.
Yeah, maybe a good analogy here would be, okay, so suppose you're raising your own children.
It might not be appropriate to have extremely tight goals for their own life or also have some sense of like, I want my children to go out there in the world and have this specific impact.
My son's going to become president and my daughter's going to become CEO of Intel and together they're going to have this effect on the world.
But people do have the sense that I think this is appropriate of saying, I'm going to give them good, robust values such that if and when they do end up in positions of power, they do reasonable pro-social things.
And I think maybe a similar attitude towards AI makes sense, not in the sense of we can predict everything that they will do, where we have this plan about what the world should look like in 100 years.
But it's quite important to give them...
robust and steerable and pro-social values. Pro-social values, maybe that's the wrong word. Are there universal values that we can all agree on?
I don't think so, but that doesn't prevent us from giving our kids a good education, right?
Like we have some sense of we want our children to be a certain way.
Yeah.
And maybe pro-social is the wrong word.
Actually, high integrity is maybe a better word.
Where if there's a request or if there's a goal that seems harmful, they will refuse to engage in it.
Or they'll be honest.
Things like that.
And we have some sense that we can...
teach our children things like this, even if we don't have some sense of what true morality is, or even if everybody doesn't agree on that.
And maybe that's a reasonable target for AI as well.
"The more things change, the more they stay the same" also seems like a good capstone to the AI discussion, because the discussion we were having was about how techniques which were invented even before their application to deep learning and backpropagation was evident are central to the progression of AI today.
So maybe that's a good place to wrap up the conversation.
Thank you for coming on.
My pleasure.
Today, I'm chatting with Sergey Levine, who is a co-founder of Physical Intelligence, which is a robotics foundation model company, and also a professor at UC Berkeley, and just generally one of the world's leading researchers in robotics, RL, and AI.
Sergey, thank you for coming on the podcast.
Thank you, and thank you for the kind introduction.
Let's talk about robotics.
So before I pepper you with questions, I'm wondering if you can give the audience a summary of where physical intelligence is at right now.
You guys started a year ago.
And what does the progress look like?
What are you guys working on?
And what's the year-by-year vision?
So one year in, now I got a chance to watch some of the robots, and they can do pretty dexterous tasks, like folding a box using grippers.
And it's like, I don't know, it's pretty hard to fold a box, even with my hands.
If you had to go year-by-year until we get to the full robotics explosion, what is happening every single year?
What is the thing that needs to be unlocked, et cetera?
So this grand vision, what year, if you had to give a median estimate?
Yeah.
Or 25 percentile, 50, 75?
And something being out there means what?
Like what is out there?
We already have LLMs, which are like broadly deployed.
And that hasn't resulted in some sort of like flywheel.
At least not some obvious flywheel for the model companies where now Claude is learning how to do every single job in the economy or GPT is learning how to do every single job in the economy.
So why doesn't that flywheel work for LLMs?
Do you think it'll be easier for robotics? Or is it just that the state of these kinds of techniques, to label data that you collect out in the world and use it as a reward, will improve, so the whole wave will rise and robotics will rise as well?
Or is there some reason robotics will benefit more from this?
Yeah.
So, okay, in one year we have robots which are, like, doing some useful things.
Maybe if you have some, like,
relatively simple loopy process, they can do it for you.
You've got to keep folding thousands of boxes or something.
But then there's some flywheel, dot, dot, dot.
There's some machine which will just run my house for me, as well as a human housekeeper would.
What is the gap between this thing, which will be deployed in a year, that starts the flywheel, and this thing, which is a fully autonomous housekeeper?
I get that there's a spectrum and I get that there won't be a specific moment that feels like we've achieved it.
But you've got to give a year in which like that, your median estimate of when that happens.
I'm just going to do binary search until I get a year.
Okay.
So it's less than 10 years.
So more than five years?
Your median estimate.
I know it's like a different range.
I think five is a good median.
Okay.
Five years.
So if you can fully autonomously run a house, then I think you've like – you can fully autonomously do most blue-collar work.
So your estimate is in five years it should be able to do most like blue-collar work in the economy.
I mean, separate from the question of whether people will get fired or not, a different question is like what will the economic impact be in five years?
Yeah.
The reason I'm curious about this is with LLMs,
the relationship between the revenues for these models and their seeming capability has been sort of mysterious, in the sense that, like, you have something which feels like AGI.
You can have a conversation with it that really, like, you know, passes a Turing test.
It really feels like it can do all this knowledge work.
It's obviously doing a bunch of coding, et cetera.
But then the revenues for these AI companies are, like, cumulatively on the order of, like, $20, $30 billion per year.
And that's so much less than...
all knowledge work, which is $30, $40 trillion.
So in five years, are we in a similar situation that LLMs are now, or is it more like we have robots deployed everywhere and they're actually doing a whole bunch of real work, et cetera?
There's so many things that increase productivity.
Just like wearing gloves increases productivity or like, I don't know.
But then it's like you want to understand something which like increases productivity a hundredfold versus like, you know, wearing glasses or something which has like a small increase.
So robots already increase productivity for workers, right?
Where LLMs are right now in terms of the share of knowledge work they can do,
it's, I guess, probably like one one-thousandth of the knowledge work that happens in the economy, at least in terms of revenue.
Are you saying like that fraction will be possible for robots but for physical work in five years?
Because the human can label what's happening?
Interesting.
So I got to go to LabelBox and see the robotics setup and try operating some of the robots myself.
Okay, so operating ended up being a bit harder than I anticipated.
But I did get to see the LabelBox team rip through a bunch of tasks.
I also got to see the output data that labs actually have to use to train their robots and asked Manu, LabelBox's CEO, about how all of this is packaged together.
Labelbox can get you millions of episodes of robotics data for every single robotics platform and subtasks that you want to train on.
And if you reach out through labelbox.com slash dwarkesh, Manu will be very happy with me.
In terms of robotics progress, why won't it be like self-driving cars where we – it's been more than 10 years since Google launched its – wasn't it 2009 that they launched the self-driving car initiative?
And then I remember when I was a teenager, like, watching demos where the car would go to a Taco Bell
and drive back.
And only now do we have them actually deployed.
And even then, you know, they may make mistakes, et cetera.
And so maybe it'll be many more years before most of the cars are self-driving.
So why wouldn't robotics, you know, you're saying five years to this, like, quite robust thing, but actually it'll just feel like 20 years of just, like...
Once we get the cool demo in five years, then it'll be another 10 years before we have the Waymo and the Tesla FSD working.
So for years, I mean, not since 2009, but we've had lots of video data, language data, and transformers for five, seven, eight years.
And lots of companies have tried to build transformer-based robots with lots of training data, including Google, Meta, et cetera.
And what is the reason that they've been hitting roadblocks?
What has changed now?
What is preventing you now from scaling that data even more?
If data is a big bottleneck, why can't you just increase the size of your office 100x, have 100x more operators
who are operating these robots and collecting more data?
Why not ramp it up immediately 100x more?
Just to give an order of magnitude, how does the amount of data you have collected compare to internet-scale pre-training data?
And I know it's hard to do, like, a token-by-token count because, yeah, how does video information compare to internet information, et cetera?
But, like...
Using your reasonable estimates, what fraction of... That's right.
When you say self-sustaining, is it just like learning on the job or do you have something else in mind?
Right.
Yeah.
Okay.
And how does the pi model work?
And like what is actually happening is that it's like predicting I should do X thing, then it's like there's an image token, then some action tokens, like what it actually ends up doing, and then more image, more text description, more action tokens.
Basically, I'm like looking at what stream is going on.
Right.
I find it super interesting that, so I think you're using the open source Gemma model, which is like Google's LLM that they released open source, and then adding the action expert on top.
I find it super interesting that the progress in different areas of AI is just based on not only the same techniques, but literally the same model, that you can just use an open source LLM and then add an action expert on top.
It is notable that like you naively might think that, oh, there's a separate area of research which is robotics and there's a separate area of research called LLMs and natural language processing.
And no, it's like it's literally the same.
It's like the considerations are the same.
The architectures are the same.
Even the weights are the same.
I know you do more training on top of these open source models, but that I find super interesting.
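Here's a toy illustration of the kind of interleaved stream being described; this is not Physical Intelligence's actual implementation, just a sketch of image, text, and action tokens sharing one sequence with a hypothetical rule routing action positions to an action expert:

```python
from dataclasses import dataclass

@dataclass
class Token:
    modality: str   # "image" | "text" | "action"
    value: str

# A hypothetical interleaved stream: observation, instruction, action chunk, repeat.
stream = [
    Token("image", "<camera frame t>"),
    Token("text", "fold the shorts"),
    Token("action", "<joint deltas t .. t+k>"),
    Token("image", "<camera frame t+1>"),
    Token("text", "grip the left edge"),
    Token("action", "<joint deltas t+1 .. t+1+k>"),
]

def route(token: Token) -> str:
    # The VLM backbone (e.g. an open-source LLM) handles image/text positions;
    # a separate action expert handles the action positions.
    return "action_expert" if token.modality == "action" else "vlm_backbone"

for tok in stream:
    print(f"{tok.modality:>6} -> {route(tok)}")
```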
Today, I'm here with Mark, who is a senior researcher at Hudson River Trading.
He has prepared for us a big data set of market prices and historical market data.
And we're going to try to figure out what's going on and whether we can predict future prices from historical market data.
Mark, let's dig in.
Happy to do it.
So it sounds like the first fun thing to do is probably to start looking at what an order book actually looks like.
If this sounds interesting to you, you should consider working at Hudson River Trading.
I was talking to this researcher, Sander, at GDM, and he works on video and audio models.
And he made the interesting point that the reason, in his view, we aren't seeing that much transfer learning between different modalities, that is to say, training a language model on video and images doesn't seem to necessarily make it that much better at textual
questions and tasks, is that images are represented at a different semantic level than text.
And so his argument is that text has this high-level semantic representation within the model, whereas images and videos are just like compressed pixels.
There's not really a semantic... When they're embedded, they don't represent some high-level semantic information.
They're just like compressed pixels.
And therefore, there's no transfer learning at the level at which they're going through the model.
And obviously, this is super relevant to the work you're doing because your hope is that by training the model both on the visual data that the robot sees, visual data generally, maybe even from YouTube or whatever eventually, plus language information, plus action information from the robot itself, all of this together will make it generally robust.
And then you had a really interesting blog post about why video models aren't as robust as language models.
Sorry, this is not a super well-formed question.
I just wanted you to react to that.
By the way, the fact that video models aren't as robust, is that bearish for robotics?
Because so much of the data you will have to use... I guess you're saying a lot of it will be labeled, but ideally you just want to be able to throw everything on YouTube, every video ever recorded, at it and have it learn how the physical world works and how to move about, et cetera, just see humans performing tasks and learn from that.
But if, yeah, I guess you're saying like it's hard to learn just from that and it actually needs to practice the task itself.
Famously, LLMs have all these emergent capabilities that were never engineered in, because somewhere in internet text is the data to train on and to give it the knowledge to do a certain kind of thing.
With robots, it seems like you are collecting all the data manually.
So there won't be this mysterious new capability that like is somewhere in the data set that you haven't purposefully collected, which seems like it should make it even harder to then have
robust out-of-distribution kind of capabilities.
And so I wonder if the trek over the next 5-10 years will just be like:
each subtask, you have to give it thousands of episodes, and then it's very hard to actually automate much work just by doing subtasks.
So if you think about what a barista does, what a waiter does, what a chef does, very little of it involves just sitting at one station and doing stuff, right?
You've got to move around, you've got to restock, you've got to fix the machine, et cetera, go between the counter and the cashier and the machine, et cetera.
So...
Will there just be this long tail of skills that you have to keep adding episodes for manually and labeling and seeing how well they did, etc.?
Or is there some reason to think that it will progress more generally than that?
Right.
I had an example like this when I got a tour of the robots, by the way, at your office.
So it was folding shorts.
And I don't know if there was an episode like this in the training set, but just for fun, I took one of the shorts and turned it inside out.
And then it was able to understand that it first needed to get... So first of all, the grippers are just like this, like two limbs, just an opposable finger-and-thumb-like thing.
And it's actually shocking how much you can do with just that.
Yeah, it understood that it first needed to turn it right side out before folding it correctly.
I mean, what's especially surprising about that is...
It seems like this model only has one second of context.
So as compared to these language models, which can often see the entire code base, and they're observing hundreds of thousands of tokens and thinking about them before outputting, and they're observing their own chain of thought for thousands of tokens before making a plan about how to code something up, your model is seeing one image of what happened in the last second.
And it vaguely knows it's supposed to fold these shorts.
And it's seeing the image of what's happened in the last second.
And I guess it works.
It's crazy that it will just see the last thing that happened and then keep executing on the plan.
So turn it right side out, then fold it correctly.
But it's shocking that a second of context is enough to execute on a minute-long task.
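A minimal sketch of the kind of control loop being described, with stub functions and assumed timing rather than the actual model:

```python
def get_latest_camera_frame(step):
    return f"frame@{step}"                 # stub for the robot's camera

def policy(image, instruction):
    # Stub: a real model maps (latest image, instruction) -> a short action chunk.
    return [f"action computed from {image}"]

instruction = "fold the shorts"
for step in range(3):                      # a real system loops for the whole task
    image = get_latest_camera_frame(step)  # roughly one second of context, no more
    for action in policy(image, instruction):
        pass                               # send each action to the actuators
    # then wait for the next control tick (~100 ms in the setup described)
```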
Yeah, I'm curious why you made that choice in the first place and why it's possible to actually do tasks.
If a human only had a second of memory to think ahead with and had to do physical work, I feel like that would just be impossible.
And how physically will – so you have this like trilemma.
You have three different things which all take more compute during inference that you want to increase at the same time.
You have the inference speed.
And so humans are processing 24 frames a second or whatever it is.
We're just like – we can react to things extremely fast.
Then you have the context length.
And for, I think, the kind of robot which is just like cleaning up your house, I think it has to kind of... It has to be aware of things that happened minutes ago or hours ago and how that influences its plan about the next task it's doing.
And then you have the model size.
And I guess at least with LLMs, we've seen that there's gains from increasing the amount of parameters.
And I think currently you have...
100 millisecond inference speeds.
You have a second long context.
And then the model is what?
A couple billion parameters?
How many?
Okay.
And so each of these, at least two of them, are many orders of magnitude smaller than what seems to be the human equivalent, right?
Like, if a human brain has, like, trillions of parameters, and this has, like, two billion parameters, and then if humans are processing...
at least as fast as this model, like actually a decent bit faster.
And we have hours of context.
It depends on how you define human context, but hours of context, minutes of context.
Sometimes decades of context.
Yeah, exactly.
So you have to have many orders of magnitude improvements across all of these three things, which seem to oppose each other, or, like, increasing one reduces the amount of compute you can dedicate towards the other ones in inference.
How are we going to solve this?
Ooh, do you mean in terms of...
How we represent?
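A rough back-of-the-envelope version of that trade-off, with assumed numbers (a 2B-parameter policy, about a second of context, a 10 Hz control rate) and the standard dense-transformer approximation of 2 x parameters x tokens FLOPs per forward pass:

```python
# All numbers are assumptions for illustration.
params = 2e9              # ~2B-parameter policy
context_tokens = 50       # roughly a second of context
control_rate_hz = 10      # one action chunk every 100 ms

flops_per_second = 2 * params * context_tokens * control_rate_hz
print(f"{flops_per_second:.1e} FLOP/s")   # ~2.0e12 FLOP/s

# Scaling any one axis by 100x (a 200B-parameter model, or minutes of context,
# or millisecond-level reactions) scales the budget by the same factor,
# before even counting attention's extra cost at long context.
```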
Another question I have, as we're discussing these tough trade-offs in terms of inference, is comparing it to the human brain and figuring out how the human brain is able to have hours, decades of context while being able to act on the order of 10 milliseconds while having 100 trillion parameters, or however you want to count it.
And I wonder if the best way to understand what's happening here
is that human brain hardware is just way more advanced than the hardware we have in GPUs, or that the algorithms for encoding video information are, like, way more efficient.
And maybe it's like some crazy mixture of experts where the active parameters is also on the order of billions, low billions, or some mixture of the two.
Basically, if you had to think about, like, why do we have these models that are,
across many dimensions, orders of magnitude less efficient,
is it hardware or algorithms, compared to the brain?
I'm sure you've been seeing a bunch of fun images that people have been generating with Google's new image generation model, Nanobanana.
My X feed is full of wild images.
But you might not realize that this model can also help you do less flashy tasks like restoring historical pictures or even just cleaning up images.
For example, I was reading this old paperback as I was prepping to interview Sarah Payne, and it had this really great graph of World War II Allied shipping that I wanted to overlay in the lecture.
Now, in the past, this would have taken one of my editors 20 or 30 minutes to digitize and clean up manually.
But now, we just took a photo of the page and then dropped it into Nanobanana and got back a clean version.
This was a one-shot.
But if Nanobanana doesn't nail it on the first attempt, you can try to just go back and forth with it until you get a result that you're super happy with.
We keep finding new use cases for this model.
And honestly, this is one of those tools that just doesn't feel real.
Check out Gemini 2.5 Flash Image Model, aka Nanobanana, on both Google AI Studio and the Gemini app.
All right, back to Sergey.
If in five years we have a system which is like as robust as a human in terms of interacting with the world,
then what has happened that makes it physically possible to be able to run those kinds of models?
To have video information that is streaming in real time, or hours of prior video information somehow being encoded and considered, while decoding on, like, a millisecond scale and with many more parameters.
Is it just that like NVIDIA has shipped much better GPUs or that you guys have come up with much better like encoders and stuff or like what's happened in the five years?
Yeah.
Yeah, maybe you guys just need to hire, like, the people who run the YouTube data centers because, like, they know how to encode video information.
Okay, this is actually an interesting question, which is that with LLMs, of course,
They're being... Theoretically, you could run your own model on this laptop or whatever, but realistically, what happens is that the largest, most effective models are being run in batches of thousands, millions of users at the same time, not locally.
Will the same thing happen in robotics, because of the inherent efficiencies of batching, plus the fact that
we have to do this incredibly compute-intensive inference task.
And so you don't want to be carrying around, like, $50,000 GPUs per robot or something.
You just want that to happen somewhere else.
So, yeah, this robotics world, should we just be anticipating something where you need connectivity everywhere, you need robots that have, like, super fast... And you're streaming video information back and forth, right?
Or at least video information one way, so...
Does that have interesting implications about how this deployment of robots will actually be instantiated?
And then so you have a couple lectures from a few years back where you say, like, even for robotics, RL is in many cases better than imitation learning.
But so far, the models are exclusively doing imitation learning.
So I'm curious how your thinking on this has changed, or maybe it's not changed, but then you need to do this for the RL.
Like, why can't you do RL yet?
In 10 years, will the best model for knowledge work also be a robotics model or have like an action expert attached to it?
And the reason I ask is like, so far we've seen advantages from using more general models for things.
And will robotics fall into this bucket of we will just have the model which does everything, including physical work and knowledge work?
Or do you think they'll continue to stay separate?
I guess there might be other considerations that are relevant to physical robots in terms of, like, inference speed and model size, et cetera, which might be different than the considerations for knowledge work.
But then maybe you can... Maybe that doesn't change.
Maybe it's still the same model, but then you can serve it in different ways.
And the advantages of co-training are high enough that...
Yeah.
What I'm wondering is, in five years, if I'm using a model to code for me, does it also know how to do robotics stuff?
And yeah, maybe the advantages of co-training on robotics are high enough that it's worth it.
I'm a bit confused about why simulation doesn't work better for robots.
If I look at humans, smart humans do a good job of, if they're intentionally trying to learn, noticing what about the simulation is similar to real life and paying attention to that and learning from that.
So if you have pilots who are learning in simulation or F1 drivers who are learning in simulation, should it be expected to be a case that as robots get smarter,
they will also be able to learn more things through simulation?
Or is this cursed and we need real-world data forever?
Can you do some kind of meta-RL on this? Which is, like, almost identical, actually, to this really interesting paper you wrote in 2017, where maybe the loss function is not how well it does at a particular video game or particular simulation, but how well being trained in different video games makes it better at some other downstream task.
I did a terrible job explaining.
I understand what you mean.
Okay, maybe – can you do a better job explaining what I was trying to say?
And then specifically make that the loss function, right?
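A hedged sketch of that kind of meta-objective, with stub functions standing in for training and evaluation; the only point is that the quantity being optimized is downstream performance after training on the source games:

```python
import random

# Stub functions only; no real learning happens here.

def train_on(params, task, steps=10):
    return params + 0.01 * steps * random.random()   # stand-in for adaptation

def evaluate_on(params, task):
    return params                                     # stand-in for a task score

def meta_loss(params, training_tasks, downstream_task):
    adapted = params
    for task in training_tasks:
        adapted = train_on(adapted, task)
    # Meta-loss: (negative) downstream performance after that training,
    # rather than performance on the training games themselves.
    return -evaluate_on(adapted, downstream_task)

print(meta_loss(0.0, ["game_a", "game_b"], "held_out_task"))
```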
So once we have this like in 2035, 2030, basically the sci-fi world,
Are you optimistic about the ability of, like, true AGIs to build simulations in which they are rehearsing skills that no human or AI has ever had a chance to practice before?
Some, you know, they need to, like, practice to be astronauts because we're building the Dyson sphere, and they can just do that in simulation.
Or, like, will the issue with simulation continue to be one regardless of how smart the models get?
Do you have some sense of what the equivalent is in humans?
Like, whatever we're doing when we're daydreaming or sleeping or... I don't know if you have some sense of, like, what this auxiliary thing we're doing is, but...
if you had to make an ML analogy for it, what is it?
Yeah, interesting.
So stepping big picture again, the reason I'm interested in getting concrete understanding of when this robot economy will be deployed is because it's actually pretty relevant to understanding how fast AGI will proceed.
In the sense that, well, it's obviously the data flywheel.
But also, if you just extrapolate out the capex for AI, suppose by 2030, people have different estimates.
But many people have estimates in the hundreds of gigawatts, 100, 200, 300 gigawatts.
And then you can just crunch numbers on if you have 200 gigawatts deployed or 100 gigawatts deployed by 2030.
The marginal capex per year is like trillions of dollars.
It's like $2, $3, $4 trillion a year.
And that corresponds to actual data centers you have to build, actual chip foundries you have to build, actual solar panel factories you have to build.
And I'm very curious about whether by this time, by 2030, if the big bottleneck we have is just like people to like lay out the solar panels next to the data center or assemble the data center, whether the robot economy will be mature enough to –
helps significantly in that process.
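For what it's worth, the arithmetic behind that capex figure looks roughly like this, with assumed numbers (the all-in cost per gigawatt is a rough guess, and published estimates vary widely):

```python
capex_per_gw_usd = 35e9         # assumed all-in cost per gigawatt (chips, data center, power)
incremental_gw_per_year = 100   # assumed buildout pace approaching 2030

annual_capex = capex_per_gw_usd * incremental_gw_per_year
print(f"~${annual_capex / 1e12:.1f} trillion per year")   # ~$3.5 trillion
```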
Right.
But will they be able to by the time that like – there's some – like there's the non-robotic stuff which will also like mandate a lot of CapEx.
And then there's robot stuff where you actually have to build robot factories, et cetera.
But anyway, there will be this industrial explosion across the whole stack.
And how much will robotics be able to speed that up or make it possible?
And then do you have a sense of how... So there's like, where will the software be?
And then there's a question of how many physical robots will we have?
So...
Like, how many of the kinds of robots you're training in physical intelligence, like these tabletop arms, are there physically in the world?
How many will there be by 2030?
How many will be needed?
I mean, these are tough questions.
Like, how many will be needed for that?
Interesting.
Okay, so do you think the learning rate will continue?
Do you think it will cost hundreds of dollars by the end of the decade to buy mobile arms?
Okay.
And how many arms are there probably in the world?
Is it more than a million, less than a million?
So the kind you want to train on?
So like less than 100,000?
I don't know, but probably, yeah.
Okay.
And we want billions of robots, like at least millions of robots.
If you're just thinking about, like, the industrial explosion that you need to have this AI-driven explosive growth, not only do you need the arms, but then you need, like, something that can move around.
Basically, I'm just trying to think about like will that be possible by the time that you need a lot more labor to power this AI boom?
As for manufacturers, is there some NVIDIA of robotics?
What is the biggest bottleneck in the hardware today as somebody who's designing the algorithms that run on it?
Okay, so this is a question I've had for a lot of guests, and is that if you go through any layer of this AI explosion, you find that...
a bunch of the actual supply chain is being manufactured in China.
So other than chips, obviously.
But then, you know, if you talk about data centers and you're like, oh, all the wafers for solar panels and a bunch of the cells and modules, et cetera, are manufactured in China, then you just go through the supply chain.
And then, obviously, robot arms are being manufactured in China.
And so if you live in this world where the
hardware is just incredibly valuable to ramp up manufacturing of, because each robot can produce some fraction of the value that a human worker can produce.
And not only is that true, but the value of human workers or any kind of worker has just tremendously skyrocketed because we just need tons of bodies to lay out the tens of thousands of acres of solar farms and data centers and foundries and everything.
In this boom world, the big bottleneck there is just, like, how many robots can you physically deploy?
How many can you manufacture?
Because you guys are going to come up with the algorithms, now we just need the hardware.
And so...
This is a question I've asked many guests, which is that if you look at the part of the chain that you are observing, what is the reason that China just doesn't win by default?
If they're producing all the robots and you come up with the algorithms that make those robots super valuable, why don't they just win by default?
I mean, yeah, I guess there's a different question, which is that if the value is sort of bottlenecked by hardware, and so you just need to produce more hardware, what is the path by which hundreds of millions of robots or billions of robots are being manufactured in the US or with allies?
I don't know how to approach that question, but it seems like a different question than like, okay, well, what is the impact on like human wages or something?
Right.
I guess feedback loops go both ways.
They can help you or they can help others.
And it's a positive in some worlds.
It's not necessarily bad that they help others.
But –
To the extent that a lot of the things which would go into this feedback loop, the subcomponent manufacturing and supply chain, already exist in China, it seems like the stronger feedback loop would exist in China.
And then there's a separate discussion, like maybe that's fine, maybe that's good, and maybe they'll continue exporting this to us.
But it's just like notable that – I just find it notable that whenever I talk to guests about different things, it's just like, oh, yeah, that – you know, within a few years, the key bottleneck to every single part of the supply chain here will be something that China is like the 80 percent world supplier of something.
I do think from the perspective of society as a whole, how should they be thinking about the advances in robotics and knowledge work?
And I think it's basically like society should be planning for full automation.
Like, there will be a period in which people's work is way more valuable, because there's this huge boom in the economy where we're building all these data centers and all these factories.
But then eventually... humans can do things with our bodies and we can do things with our minds.
There's not like some secret third thing.
So what should society be planning for?
It should be full automation of humans. Society will also be much wealthier, so presumably there are ways to do this such that everybody is much better off than they are today. But the end state, the light at the end of the tunnel, is full automation plus a super wealthy society, with some redistribution or whatever way we figure that out, right? I don't know if you disagree with that characterization.
Is that true?
I mean, the Moravec paradox point here is that the things where education is most beneficial for humans might be the easiest to automate, because it's really easy to educate AIs.
You can throw the textbooks that would take you eight years of grad school to do at them in an afternoon.
Okay, Sergey, thank you so much for coming on the podcast.
Thank you.
Super fascinating.
Tough questions.
I hope you enjoyed this episode.
If you did, the most helpful thing you can do is just share it with other people who you think might enjoy it.
Send it to your friends, your group chats, Twitter, wherever else.
Just let the word go forth.
Other than that, super helpful if you can subscribe on YouTube and leave a five-star review on Apple Podcasts and Spotify.
Check out the sponsors in the description below.
If you want to sponsor a future episode, go to dwarkesh.com slash advertise.
Thank you for tuning in.
I'll see you on the next one.
If you're developing frontier models, you know that high-quality data sets the ceiling on performance.
LabelBox is the rare data partner that works hand-in-hand with researchers throughout their entire process.
Not only will they get you the data that you need, they'll also help you figure out what data is needed to make your research ideas work in the first place.
To give you an example, I've talked before about why I think that fully automated computer use will be difficult.
You have to build thousands of environments where language models can practice long horizon problems using common pieces of software.
Labelbox is already building these environments for leading frontier labs.
They know how to resist reward hacking and structure problems at just the right level of difficulty to push models beyond their current limits.
They also help you track model performance so you can always know how your model is learning and also where it still needs work.
LabelBox delivers high-quality frontier data regardless of domain, modality, or training paradigm.
It doesn't matter if you need RL environments for agents, audio for multi-turn voice, action data for robotics, or problem sets for physics.
LabelBox can get you the best data and fast.
Learn more at labelbox.com.
All right, back to Sarah.
I've been wanting a custom research tool that I can use to prep for upcoming interviews, something that will summarize papers that I need to read and generate flashcards and suggest questions that I can ask the guest.
So I decided to try just building it myself using Warp.
And honestly, I just really love the experience of working with it.
The way I would describe Warp is that it's like Claude Code, but with a much nicer UI.
I got most of the application working after just a one-shot, and then Warp made it really easy to iterate with the agent to fix minor bugs and to improve the UI.
I can see exactly how the edits are being made in the text files themselves just by clicking the expand icon, and if I needed to change a variable or something, I could just do it through the inline editor.
The navigation is way nicer than if it was a native terminal-based tool.
It's much easier to scroll through everything and understand all of the model's actions.
It's exactly the right interface for coding with agents.
You can try Warp today by going to warp.dev slash Dwarkesh.
And with the code Dwarkesh, you can get the Warp Pro plan for only $5.
Okay, back to Sarah.
Okay.
First question.
The lecture is framed as explaining Britain's strategic wins in how they prosecuted World War II.
But it seems like we really have to explain...
What seems to keep happening is that Germany and Hitler specifically keep making mistakes.
So 1940, 1941, Halifax is saying, look, we need to come to a peace with Germany.
We've lost the Battle of France.
We've had to evacuate Dunkirk.
And then Hitler could have done a blockade of Britain and could have just kept prosecuting the war.
But instead, he decides to open up a second front against the other biggest army in the entire world, against the Soviet Union.
And then when Japan does Pearl Harbor, which again was an unforced error, he makes another error on top of that of declaring war on America, which he didn't have to do, and which made it much easier for America to lend support on the war against Germany.
So this seems much more about Hitler continually bungling than about Britain getting the strategic picture right.
But I want to come back to the question of how much, when we're trying to explain who wins a big war like this, how much should we be looking at the specific strategies used in different battles versus just the total tonnage of industrial output that America especially contributed?
And so by 1942, even if they had a much better strategy, what is the way in which Germany could have won?
I found your thesis that Britain fought this very effective maritime strategy really interesting.
But let me just try out a different theory for you.
Here we go.
Yes.
So after the fall of France in June 1940, Germany has all these U-boats and they're actually themselves running a very effective maritime strategy.
They're now in the Atlantic.
They're not in the narrow seas, right?
They can actually go to the ocean and cut off British trade through the Atlantic and
In October, I think, October of 1940, I think they sink like close to 400,000 or 500,000 tons.
It's huge.
Of British shipping.
So, you know, the Germans are running this like a very effective blockade of Britain.
And the reason it fails is just that the Americans eventually can just produce...
many more ships than the Germans can even sink.
And so ultimately, it wasn't a matter of who ran the better maritime strategy.
It was just about who could keep up the industrial output to sustain the strategy.
So it comes back to industry and not strategy.
I guess the question then is, what does changing things mean?
Does it mean that...
the war could have gone on longer, shorter, et cetera.
Yes, of course, anything would have changed that.
There's another question of, would it have changed who won the war?
And I'd claim that cryptography, radar, even oil, even if Germany had a huge reserve of oil,
Maybe it would have lengthened the time for which Germany could have sustained itself.
But if you change the fact that the Axis had one-fourth the GDP of the Allies combined, I think switching that genuinely changes who wins the war, whereas these other things actually wouldn't change who won the war.
When Germany and the Soviet Union split up Poland...
There's this just 2000 kilometer border from the Baltics to the Black Sea that they've signed themselves up for, for sharing with the other largest army that has ever been assembled.
Did the Soviet Union think that in the long run this was a sustainable proposition, or did they just think they could wait Hitler out, or that in the long run they'd be in a better position to wage war against him?
They did.
Did people think that this was a sustainable arrangement?
You described Hitler's strategy as pursuing a continental strategy, and that was sort of his undoing.
But it seems like there's many other wars where a more reasonable person can pursue a continental strategy and do just fine.
So Bismarck does multiple continental wars, Franco-Prussian War, Austro-Prussian War, and achieves his objectives.
And Hitler is just freaking crazy.
Like he continuously does things.
He knows that going to war with Poland in 1939, France and Britain will join.
He's like, I will fight both of them for Poland for some reason.
Yeah.
Somehow that works.
Then he gets, somehow France falls in a few weeks.
Instead of stopping there, he decides, now I'm going to go to war with the whole of the Soviet Union.
Then he decides to declare war on America. In what way is it a continental strategy to declare war on America, which is an ocean away?
It's not even an adjacent continental power.
So it seems just like him being crazy explains more than him pursuing a continental strategy.
I interviewed last year Daniel Yergin, who wrote this big history of oil.
And he has like 300 pages in his book about World War II.
And specifically, his claim is that Barbarossa was motivated by Hitler's desire to get the oil fields back.
Is there any world in which Hitler would have surrendered by 1943 when it was clear that the war was going to go the other way?
I mean, in World War I, you do have a lot of powers which have at least some amount of franchise.
And in that war, I think people should have just—it would have been much better to lose the war in 1914 or 15 than to continue waging it until 1917.
But even Britain doesn't back down, or France doesn't back down.
The reason I brought up earlier the primacy of industrial output and population size in determining who wins the war is at the end, you sort of made the comparison that Russia and China today are in a similar position as the Axis was preceding World War II, or at least the disadvantages that they had.
Right.
Yeah.
Then it depends.
Well, do you think that was more...
central than the fact that during World War II, the Allies just had three to five times the combined output of the Axis?
And so today, if you looked at the manufacturing output of China versus the United States, the shipbuilding capacity of China versus the United States, even if you looked at the international trade of China, like China has more international trade than the United States does, then you ask who is in a more similar position
to the United States during World War II?
Is it China or America?
I mean, the most likely thing would be a war over Taiwan.
Okay, so your idea— The value of the object is clearly higher to China there.
But it's precisely because it has such a dominant position in trade that it would be difficult to coordinate a coalition against them.
If the entire world benefits more from trading with them than the United States, which there's more volume of trade with China than the United States globally, then it would be difficult to coordinate a coalition to say, look, you've got to stop trading with the person you're trading most with.
I guess the larger thing I'm trying to get at is
In the immediate term, it seems like, if the thing that mattered in World War II was the fact that the United States was outproducing all the Axis combined in planes and munitions and tanks, then who is in a comparable position today?
Well, you'd say it's China.
And then, over the long run, perhaps we can collaborate to build a coalition which, over many generations, reduces China's growth or ability to compete.
But then I'm not even confident that the coalition would, you know, center around the United States rather than on China.
On the general framework of how continental powers should act and how maritime powers should act, I wonder if it's even worth designating the optimal strategy based on your geography, versus, to the extent that you think positive-sum trade is better than invading and doing bloody wars.
That's true regardless of whether you have a maritime or
I mean, if you are the biggest trading partner with, say, 100 countries, do you have a lot of alliances?
I mean, if not, then that seems more significant than just calling yourself an ally.
You just have an incredibly strong vested interest
in the other country.
I mean, the reason America has a lot of alliances is because we have a lot of trade.
Is that true, though?
I mean, like, what do they actually entail?
Yeah.
I mean, one thing I often hear in these discussions is that it's the United States which actually has stronger preferences in, for example, how your political system works, which may be reasonable to have, but China is usually more willing to just trade with you if you're willing to trade.
I think that's an excellent place to close.
Sarah, thanks so much.
Thank you.
Thank you for tuning in.
I'll see you on the next one.
Today, I have the pleasure of chatting with Jacob Kimmel, who is president and co-founder of New Limit, where they epigenetically reprogram cells to their younger states.
Jacob, thanks so much for coming on the podcast.
Thanks so much for having me.
Looking forward to the conversation.
All right.
First question, what's the first principles argument for why evolution just like discards that so easily?
Look, I know evolution cares about our kids, but if we have longer, healthier lifespans, we can have more kids, right?
Or we can care for them longer.
We can care for our grandkids.
So is there some pleiotropic effect that anti-aging medicine would have, which actually selects against you staying young for longer?
Mm-hmm.
Right, by the way, just on that: often, people who are trying to forecast AI will discuss how hard evolution tried to optimize for intelligence, and what things optimizing for intelligence would have prevented evolution from selecting for at the same time, which would make it so that even if intelligence were a relatively easy thing to build in this universe,
it would have taken evolution a long time to get to human-level intelligence.
And potentially, if intelligence is actually really easy, then it might imply that we're going to get to superintelligence and Jupiter-level intelligence, et cetera, et cetera.
The sky's the limit.
So one argument is birth canal size, et cetera, or the fact that we had to spend most of our resources on the immune system.
But what you just hinted at is actually an independent argument that if you have this high hazard rate, that would imply that you can't be a kid for too long because you got to, you know, the kids die all the time and you got to become an adult so that you can have your kids.
Like if you're just hanging out learning stuff for 50 years, you're just going to die before you get to have kids yourself.
So obviously humans have bigger brains than other primates.
We also have longer adolescences, which help us make use potentially of the extra capacity our brain gives us.
But if you made the adolescence too big, then you would just die before you get to have kids.
And if that's going to happen anyways, what's the point of making the brain bigger?
AKA, you know, maybe intelligence is easier than we think.
And there's a bunch of contingent reasons that evolution didn't push as hard on this variable as it could have.
So in one way, this is actually a very interesting RL problem, right?
It's a long horizon RL problem, 20-year horizon length, and then there's a scalar value of how many kids you have.
I guess that's a vibe, et cetera.
And if you know how hard, or I don't know, but if you've heard from your friends how hard RL is on these models for even intermediate goals that last an hour or a couple of hours, it's actually surprising that
any signal propagates across a 20-year horizon.
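To make that credit-assignment problem concrete, here is a minimal sketch. The daily decision steps and the discount factor are illustrative assumptions, not anything from the conversation; the only point is how little terminal signal survives back across a 20-year horizon.

```python
# Minimal sketch of "lifetime fitness as long-horizon RL".
# Assumptions (illustrative only): daily decision steps over a 20-year
# adolescence, and a single scalar reward (number of offspring) delivered
# only at the end of the episode.

HORIZON_DAYS = 20 * 365   # ~7,300 decision steps before any reward arrives
GAMMA = 0.999             # per-step discount factor, chosen for illustration

# How much a reward at the end of the horizon is worth at step 0 under discounting:
discounted_weight = GAMMA ** HORIZON_DAYS
print(f"steps before reward: {HORIZON_DAYS}")
print(f"credit reaching step 0 with gamma={GAMMA}: {discounted_weight:.2e}")
# With gamma = 0.999, only ~0.07% of the terminal signal survives back to the
# first decision, which is the sense in which it's surprising that any signal
# propagates across a 20-year horizon.
```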
By the way, on the point about fluid intelligence peaking, so not only is it the case that in many fields achievement peaks before 30, in many cases, if you look at the greatest scientists ever, they had many of their greatest achievements in a single year.
So...
Yeah, yeah, exactly.
Yeah, exactly.
Yeah, Newton, what is it?
Optics, gravity, calculus, 21.
Yeah.
Interesting.
So that's one out of three components of the evolutionary story.
I love this idea of aging as a length regularizer.
So I think people might be familiar with the idea that when companies are training models, they'll have a regularizer for you can do chain of thought, but don't make the chain of thought too long.
And then you're saying that how many calories you consume over the course of your life is one such regularizer.
Yeah.
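For readers who haven't seen this, a length regularizer in this context is roughly a penalty subtracted from the reward. A minimal sketch; the coefficient and function name are illustrative assumptions, not any particular lab's implementation:

```python
# Minimal sketch of a chain-of-thought length regularizer.
# LAMBDA and the function name are illustrative assumptions.

LAMBDA = 0.001  # penalty per token of chain of thought

def regularized_reward(task_reward: float, chain_of_thought_tokens: int) -> float:
    """Reward the task outcome, but charge a small cost per token of reasoning."""
    return task_reward - LAMBDA * chain_of_thought_tokens

# The analogy in the conversation: lifetime calorie consumption acts like the
# LAMBDA * length term, so evolution favors solutions that don't "think"
# (grow, learn, stay juvenile) for longer than the payoff justifies.
print(regularized_reward(1.0, 200))   # 0.8
print(regularized_reward(1.0, 900))   # 0.1
```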
That's interesting.
Okay.
And then the third point was?
The third piece is basically optimization constraints.
Antibiotics are an even more clear case of that because here's something that evolution actually cares a lot about, right?
So it feels like antibiotics should have been— Why didn't humans evolve their own antibiotics?
Actually, that should imply that there are, through evolutionary history, millions of quote-unquote naive antibiotics, which could have acted as antibiotics, but now basically all the bacteria have evolved around them.
Do we see evidence of these like historical antibiotics that some fungi came up with and bacteria evolved around and there's evidence or remnant in their DNA?
Isn't the mutation rate per base pair per generation like one in a billion or something?
It's quite low.
So you're saying that in our genomes, we can find some extended sequence which encodes how to bind specifically to the kind of virus that SIV is. But think about the amount of evolutionary signal you would need in order to have a multiple-base-pair sequence: each nucleotide consecutively would have to mutate in order to finally get the sequence that binds to SIV.
That seems almost implausible that you could – I mean, I guess evolution works.
So, like, we can come up with new genes, right?
But, like, how would that even work?
So you
You're saying that even though the per-base-pair mutation rate might be one in a billion, if you've got 100 copies of a gene, then the effective mutation rate on a gene, or on a low-Hamming-distance sequence to the one you're aiming for, might actually be quite high.
And so you can actually get the target sequence.
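A rough back-of-envelope of that point. The per-base rate is the one-in-a-billion figure quoted in the conversation; the copy number and Hamming distance are illustrative assumptions:

```python
# Back-of-envelope for why gene duplication makes a "nearby" sequence reachable.
# Illustrative assumptions: per-base mutation rate ~1e-9 per generation (the
# figure quoted above), 100 paralogous copies of the gene, and a target only
# a few substitutions away from the nearest existing copy.

per_base_rate = 1e-9      # per-base, per-generation mutation rate quoted above
copies = 100              # duplicated copies in the gene family
hamming_distance = 3      # substitutions separating a copy from the target

# Chance that any one of the copies picks up one of the needed substitutions
# in a single generation (ignoring which of the three alternative bases it
# mutates to, since this is an order-of-magnitude estimate):
p_one_step_any_copy = 1 - (1 - per_base_rate) ** copies
print(f"chance any copy takes one step toward the target per generation: "
      f"{p_one_step_any_copy:.1e}")
# Roughly 1e-7 per generation. Across a large population and millions of
# generations, taking a handful of such steps stops looking implausible,
# which is the intuition that duplication plus low Hamming distance rescues
# the search.
```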
Conceptually, is there some phylogenetic tree of gene families where you've got the transposons and you've got, like, the gene itself, but then you've got the descendant genes, which are, like, low Hamming distance?
I don't know.
Is there like some conceptual way in which they're categorized?
Man, this is fascinating.
Okay, back to aging.
You'll have to cancel your evening plans.
I've got so many questions for you.
Keep going.
So the second reason you gave, which was that...
There's selective pressure against people who get old, but...
Still keep living, but they're like slightly less fit.
They're suboptimal from a calorie input perspective.
And that's how people love thinking about their grandpas, you know.
Yeah.
Suboptimal from a calorie.
Suboptimal calorie provider right there.
Anyways, so a concern you might have about the effects of longevity treatments on your own body is that you will fix some part of the aging process, but not the whole thing.
Right.
It seems like you're saying that you actually think this is the default way in which an anti-aging procedure would work, because that's the reason evolution didn't optimize for it.
It's just like, we're only fixing half of the aging process and not the whole thing.
Whereas sometimes I hear longevity proponents be like, no, we'll get the whole thing.
There's going to be one cause that explains all of aging and we'll get it.
Whereas your evolutionary argument for why evolution didn't optimize against aging relies on the fact that aging actually is not monocausal and evolution didn't bother to just fix one cause of aging.
What would the AI foundation model for trading and finance look like?
It would have to be what LLMs are to NLP or what the virtual cell is for biology.
And it would have to integrate every single kind of information from around the world, from order books to geopolitics.
Now, think about how insane this training objective is.
Here's this constantly changing RL environment with input data that's incredibly easy to overfit to, where you're pitted against extremely sophisticated agents who are learning from your behavior and plotting against it.
Obviously, there's very few things in the world that are as complex as global capital allocation.
It's a system that reflects billions of live decisions in real time.
Now, as you might imagine, training an AI to do all of this is a compute-intensive task.
That's why Hudson River Trading continually upgrades its massive in-house cluster with fresh racks of brand new B200s being installed as we speak and more on the way.
HRT executes about 15% of all U.S. equities trading volume, and researchers there get compensated for the massive upside that they create.
If the newest researcher on the team improves an HRT model, their contributions are recognized and rewarded right away, regardless of their tenure.
If you want to work on high stakes, unsolved problems, unconstrained by your GPU budget, check out HRT at hudsonrivertrading.com.
All right, back to Jacob.
All right, so evolution didn't select for aging.
What are you doing?
What's your approach to new limit that you think is likely to find the true cause of aging?
Mm-hmm.
If you're just making these broad changes to a cell state through these transcription factors, which have many effects, are there other aspects of a cell state that are likely to get modified at the same time in a way that would be deleterious?
Or would it be a sort of straightforward effect on cell state?
Okay, this is a dumb question, but it will help me understand why an AI model is necessary to do any of this work.
So you mentioned the Yamanaka factors.
From my understanding, the way he identified these four transcription factors was that he found the 24...
transcription factors that have high expression in embryonic cells, and then he just turned them all on in a somatic cell.
Basically, he systematically removed from this set until he found the minimal set that still induces a cell to become a stem cell.
And that just doesn't require any fancy AI models, etc.,
Why can't we do the same things for the transcription factors that are associated with younger cells or express more in younger cells as opposed to older cells and then keep eliminating from them until we find the ones that are necessary to just make a cell young?
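The elimination procedure described in this question is, schematically, a greedy leave-one-out search. A sketch under stated assumptions: `induces_pluripotency` is a hypothetical stand-in for the actual wet-lab assay, and the code only illustrates the search logic, not the biology.

```python
# Sketch of the leave-one-out elimination described above.
# `induces_pluripotency` is a hypothetical stand-in for the wet-lab assay
# (transduce the factors, check for stem-cell colonies).

def minimal_factor_set(candidates, induces_pluripotency):
    """Greedily drop factors that aren't needed for the phenotype."""
    working_set = list(candidates)          # start with all candidates (24 in Yamanaka's case)
    for factor in list(working_set):
        trial = [f for f in working_set if f != factor]
        if induces_pluripotency(trial):     # still works without this factor?
            working_set = trial             # then it wasn't necessary
    return working_set

# Usage sketch: with the right assay this converges on a small core set
# (historically Oct4, Sox2, Klf4, c-Myc) without any model of why it works.
```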
I wish it were so easy.
They'll grow, you know?
Yeah, yeah.
So we can think of these transcription factors as these basis directions, and you can get like a little bit of this thing, a little bit of that thing, and some combination.
And evolution has designed these transcription factors to... Is that your claim?
To have relatively modular, self-contained effects that work in predictable ways with other transcription factors.
And so we can use that same handle to...
To our own ends?
Yeah, yeah.
Interesting.
You're just like designing a little aura that goes on top of... Yeah, yeah, in a way.
Yeah, you're sort of hinting that if we analogize it to some code base, we're going to find a couple of lines that are, like, commented out, that say de-aging, you know, and then you just uncomment them.
You're flattering our listeners.
Only cringe listeners will appreciate it, but your audience will love this.
Interesting.
Yeah, a previous guest and a mutual friend, Trenton Bricken, had a paper in grad school about how the brain implements attention.
Really?
It's so funny the way we're going to learn how the brain works is just like trying to first principles engineer intelligence and AI.
And then it just happens to be the case that each one of these things has a neural correlate.
Gemini CLI just one-shotted an automated producer for me in one hour.
Basically, I wanted this interface where I could just paste in a raw episode transcript and then get suggestions for Twitter clips and titles and descriptions and some other copy, all of which cumulatively takes me about half a day to write.
Honestly, it was just extremely good.
I described the app I wanted and then asked Gemini to talk through how it would go about implementing it.
It walked through its plans, it asked me for input where I hadn't been sufficiently clear, and after we ironed out all the details, Gemini just literally one-shotted the full working application.
with fully functional backend logic.
Making this app literally took 10 minutes, including installing the CLI.
Then I spent 15 minutes fine-tuning the UI, messing around.
By the way, this process did not involve me actually editing or even looking at any of the code.
I would just tell Gemini how I wanted things moved around, and the whole UI would change as Gemini rewrote the files.
Despite building and then fine-tuning an entire working application, the session context didn't even get 10 percent exhausted.
This is just a super easy and fast way to turn your ideas into useful applications.
You can check out Gemini CLI on GitHub to get started.
All right, back to Jacob.
If you're right that transcription factors are the modality evolution has used to have complex phenotypic effects and optimize for different things, two-part question.
One, why haven't pathogens, which have a strong interest in having complex phenotypic effects on your body,
also utilized transcription factors as the way to fuck you over and steal your resources?
And two, we've been trying to design drugs for centuries.
Why aren't all the big drugs, the top-selling drugs, ones that just modulate transcription factors?
So the drugs we have can't target them.
But your claim is that a lot of drugs actually do work by binding to the things we actually can target and those having some effect on transcription factors.
So this brings us to questions about delivery, which is the next thing I want to ask you.
You mentioned lipid nanoparticles.
This is what the COVID vaccines were made of.
The ultimate question, if we're going to work on de-aging, is: even if you identify what is the right transcription factor to de-age a cell, and even if they're shared across cell types or you figure out the right one for every single cell type, how do you get it to every single cell in the body?
Yeah.
How do you do this?
How do you deliver stuff?
You're trying to solve aging?
You think you have only one?
This is actually – I mean, in a way, we treat cancer this way with CAR T therapy, right?
We take the T cells out, and then we tell them, go find –
a cancer with this receptor and kill it.
But is the reason that works is that the cancer cells we're trying to target are also free-floating in the blood, and is that what it targets?
Basically, could this deliver to literally every single cell in the body?
Oh, interesting.
Yeah.
Interesting.
You know, it's funny that whenever we're trying to cure infectious diseases, we just have to deal with, fuck, viruses have been evolving for billions of years with our, you know, oldest common ancestor, and they know exactly what they're doing, and it's so hard.
And then whenever we're trying to do something else, we're like, fuck, the immune system has been evolving for billions of years, and it knows what it's doing, and how do we get past it?
Yeah.
Right.
Given the fact that it's somewhere between impossible and very far away, and it's necessary for full curing of aging...
Does that mean that in the short run, in the next few decades, we'll have some parts of our body which will have these amazing therapies and then other parts which will just be stuck the way they are?
So you mentioned hepatocytes are some of the cells that you're able to actually study and/or deliver to, and these are our liver cells. So you're saying, look, I can get drunk as much as I want and it's not going to have an impact on my long-run liver health, because then you'll just inject me with this therapy, but the rest of my body is going to age as normal. What is the implication of the fact that the delivery seems to be lagging far behind your understanding of aging?
Is this related to why Ozempic has so many downstream positive effects that seem not totally related to its effect on making you leaner?
How big will the payload have to be?
And is it, would it have to be a chronic treatment or could it just be a one-time dose?
Right.
So we've got 1,600 transcription factors in the human genome.
Is it worth looking at non-human TFs and seeing what effects they might have, or are they unlikely to be the right source base?
What about the effects of aging, which are, okay, so I don't know, your skin starts to sag because of the effects of gravity over the course of decades.
Is that a cellular process?
How would some cellular therapy deal with that?
So in a weird way, Eroom's law is actually very similar to the scaling laws you have in ML, where you have this very consistent logarithmic relationship: you throw in more inputs and you get consistently diminishing outputs.
The difference, of course, is that this trend in ML has been used to raise exponentially more investment and to drive more hype towards AI.
Whereas in biotech, you know, modulo New Limit's new round, it has driven down valuations, driven down excitement and energy exponentially.
With AI, at least you can sort of internalize the extra cost and the extra benefits because there's a general purpose model you're training.
So this year you spent $100 million training a model, next year a billion dollars, the year after that 10 billion.
But it's one general purpose model, unlike we made money on this drug and now we're going to use that money to invest in 10 different drugs in 10 different bespoke ways.
Okay, anyways, I was gearing up to ask you, what would a general purpose platform where even if you had diminishing returns, at least you can have this sort of like less bespoke way of designing drugs look like for biotech?
I'm not sure how to understand this claim that we know how to engage with the right hook.
We just don't know what that hook is supposed to do in the body.
I don't know if that's the way you'd describe it. How does that square with another claim that I've seen, that with small molecules we have this Goldilocks problem, where they have to be small enough to
percolate through the body and through cell walls, et cetera, but big enough to interfere with protein-protein interactions that transcription factors might have, or something.
So there it seems like getting the hook is the big problem.
Okay, so then what is the answer to what is the general purpose?
The general purpose model.
Every marginal discovery increases the odds you make the next discovery or something like that.
Right.
And you basically described one of the models you guys are working on at New Limit.
You're training this model based on this data where you're taking the entire transcriptome and just labeling it based on how old that cell actually is.
If you've got all this data you're collecting on how different perturbations are having different phenotypic effects on a cell,
Why only record whether that effect correlates with more or less aging?
Why can't you also label it with all the other effects that we might eventually care about and eventually get the full virtual cell?
Because that's a more general purpose model, right?
Not just the one that predicts whether a cell looks old or not.
This is so similar to in LLMs, you have first imitation learning with pre-training that builds a general purpose representation of the world.
And then you do RL about a particular objective in math or coding or whatever that you care about.
And you are describing an extremely similar procedure where first you just learn to predict perturbations in genes to broad effects on the cell.
And that's the sort of pre-training, just learn how cells work.
Mm-hmm.
And then there's another layer afterward of these value judgments: okay, well, how would we have to perturb it to have effect X? Which actually seems very similar to, how do we get the base model to answer this math problem or this coding problem?
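To spell out the analogy being drawn here, a schematic of the two-stage setup. All names, shapes, and architecture choices are illustrative assumptions, not New Limit's actual pipeline:

```python
# Schematic of the two-stage analogy: a general "perturbation -> cell state"
# model first, then a task-specific head on top. Illustrative only.

import torch
import torch.nn as nn

class PerturbationModel(nn.Module):
    """Stage 1: map a perturbation (e.g., a TF combination) to a predicted
    transcriptome, trained on broad perturbation screens ("pre-training")."""
    def __init__(self, n_perturbations: int, n_genes: int, hidden: int = 512):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Linear(n_perturbations, hidden), nn.ReLU(),
            nn.Linear(hidden, n_genes),
        )
    def forward(self, perturbation):
        return self.backbone(perturbation)

class AgeHead(nn.Module):
    """Stage 2: a small head that scores a predicted transcriptome on the
    specific objective you care about (here, 'how old does this cell look')."""
    def __init__(self, n_genes: int):
        super().__init__()
        self.score = nn.Linear(n_genes, 1)
    def forward(self, transcriptome):
        return self.score(transcriptome)

# Usage sketch: pretrain PerturbationModel on generic perturbation data, then
# train AgeHead on age-labeled cells, and finally search over perturbations
# that minimize the predicted age, analogous to RL on top of a base model.
```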
I don't know...
I don't know if people usually put it this way, but it actually just seems like an extremely, extremely, I mean, that makes me more optimistic on this because LLMs work, right?
And RL works.
You'll have a bunch of labelers in Nigeria clicking different pictures of cells, like, oh, this one looks young, this one looks old.
This one looks really great.
It's more like developmental biologists locked in a room, as my friend Cole Trapnell would say.
It seems like what you're describing seems quite similar to PerturbSeq.
And we've had PerturbSeq for, I don't know when it was done.
What year was it?
There were three papers almost simultaneously in 2016.
Okay.
So almost a decade.
I don't know.
We're still waiting, I guess, for the big breakthrough it's supposed to cause.
And this is the same procedure.
So why is this going to have an effect?
Why is this taking so long?
Yeah, good questions.
If it actually is the case that the... This is actually very similar to the way LLM dynamics work.
Then once this technology is mature and you get the GPT-3 equivalent of the virtual cell, what you would expect to happen is there's many different companies that have...
you know, are doing these cheap Perturb-seq-like experiments and building their own virtual cells, or at least a couple of them are.
And then they're like leasing this out to other people who then have their own ideas about, well, we want to see if we can come up with the labels for this particular thing we care about and test for that.
What seems like happening right now is, at least at New Limit, you're like, we know the end use case we're going after.
It would be like if Cursor or whatever in 2018 is like, we're going to build our own LLM from scratch so that we can enable our application, rather than some foundation model company being like, we don't care what you use it for.
We're going to build this.
Does that make sense?
It seems like you're combining two different layers of the stack, and it's just because nobody else is doing the other layer, and so you're just doing both of them.
I don't know to what extent this analogy maps on, but- Yeah, yeah.
Interesting.
And then this is more a question about the broader pharma industry rather than just New Limit, which is that in the future, how are people going to make money if you have...
You know, with the GLP-1s, we've got peptides from China that are just a gray market that people can easily consume.
And presumably with these future AI models, even if you have a patent on a molecule, maybe finding an isomorphic molecule or an isomorphic treatment is relatively easy.
If you do come up with these crazy treatments and a pharma in general is able to come up with these crazy treatments, will they be able to make money?
The reason I'm interested in this is that health care is already 20% of GDP.
I think it's grown notable percentages in the last few years.
This is a fraction that is quickly growing.
And most of this, I should have looked the numbers up, but the overwhelming majority of this, is going to administering treatments that have already been invented, which is good, but nowhere near as good as spending this enormous sum of resources on coming up with new treatments that in the future will improve the lives of people who will have these ailments.
I mean, one question is just, how do we make it so that more... Like, if we're going to spend 20% of GDP on healthcare, it should at least go towards, like, coming up with new treatments rather than just, like, paying nurses and doctors to keep administering stuff that kind of works now.
And two, if the cost of drugs, at least from the perspective of the payer, ends up being that you need a doctor to give you some scan before he can write you a prescription, and then they need to administer it and make sure that you're doing okay, et cetera, et cetera, then even if manufacturing this therapy might cost tens of dollars per patient,
for the healthcare system overall it might be tens of thousands of dollars per patient.
Actually, I'm curious if you agree with those orders of magnitude.
Right.
So basically, even if we invent de-aging technology, or especially if we invent de-aging technology...
How should we think about the way it will net out in the fraction of GDP that we have to spend on health care?
Will that increase because now people just had to go, everybody's lining up at the doctor's office to get a prescription and you got to go into the clinic every week?
Or will that decrease because the other downstream ailments from aging aren't coming about?
I think the latter is much more likely to be the case.
Okay, final question.
So pharma is spending billions of dollars per new drug it comes up with.
And surely they have noticed that the lack of some general platform or some general model has made it more and more expensive and difficult to come up with new drugs.
And you say Perturb-seq has existed since 2016.
And as far as you can tell, you have the most amount of that kind of data, which we could feed into a general purpose model.
So what is the traditional pharma industry on the other coast up to?
If I went to the head of R&D at Eli Lilly or Pfizer or something, do they think that this is like they have some different idea of the platform that needs to be built or they're like, no, we're all in on the bespoke game, bespoke for each drug?
Full disclosure, I am a small angel investor in New Limit now, but that did not influence the decision to have Jacob on.
This is super fascinating.
Thanks so much for coming on the podcast.
Awesome.
Thanks, Dwarkesh.
I hope you enjoyed this episode.
If you did, the most helpful thing you can do is just share it with other people who you think might enjoy it.
Send it to your friends, your group chats, Twitter, wherever else.
Just let the word go forth.
Other than that, super helpful if you can subscribe on YouTube and leave a five-star review on Apple Podcasts and Spotify.
Check out the sponsors in the description below.
If you want to sponsor a future episode, go to dwarkesh.com slash advertise.
Thank you for tuning in.
I'll see you on the next one.
Today I'm interviewing Casey Handmer.
Casey has worked on a bunch of cool things.
Caltech PhD on some gravitational wave, black hole gimmick stuff, then Hyperloop, then the Jet Propulsion Laboratory at NASA, and now he is founder and CEO of Terraform Industries.
Casey, welcome.
Big picture question I'm interested in, to the extent that AI just ends up being this big industrial race, who can build the most solar panels, who can build the most batteries, who can build the most GPUs and transmission lines and transformers and et cetera, et cetera.
This is not what the U.S.
is known for, at least in recent decades.
This is exactly what China is known for, right, where they have like 20x the amount of yearly solar manufacturing the U.S.
has.
Obviously, we have export controls right now, but over time, SMIC will catch up to TSMC's leading edge.
So what is the story exactly of how the United States wins this?
Like, why does China just not win by default?
I think you're going to make this first-principles argument about these other industries where they're killing it, but it doesn't seem to have hampered BYD or CATL.
They're devoting a lot to solar overcapacity, which, in your opinion, is the key to future industrial growth.
Accidentally correct.
They call the most important thing correct, right?
But you're working on this, right?
If you get synthetic fuels working at Terraform.
Personally, yes.
Doesn't that asymmetrically help China?
Which might be fine.
It does.
It absolutely asymmetrically helps China.
Just to spell out for the audience, if China has all this electricity production, and the bottleneck is that only a third of final energy use in a modern economy comes from electricity.
The rest, you need gas and whatever to transport things.
Or coal.
They use a lot of coal in China.
Right.
And what Casey is inventing is a technology to turn that electricity, which right now can only supply a third of end uses, into synthetic fuels, which can supply 100% of the energy your civilization needs.
So then China's energy advantage then becomes overwhelming.
I don't know.
I agree that they've obviously made bad decisions, but even if China contains some of the poorest people anywhere in the world, parts of it can still be quite rich.
You know, like Singapore's rich or whatever.
Also, there's parts of China which actually contain quite rich Chinese people.
So you had to compare not all of China against the U.S., but Shanghai and Guangdong against the United States.
So you can have a part of China that is as big as America and as wealthy as America and as innovative as America.
But also, China as a whole is nowhere near as wealthy, whereas there are parts of China which are humongous, which are actually as wealthy as the United States, and in many cases as innovative, et cetera.
So right now, we are export controlling chips for the purpose of we want to keep our AI lead or stay in the lead in AI.
And we recognize this is a key input in our ability to compete in AI.
So we are going to export control China's ability to have these chips.
Energy is also a key input in this AI race.
And if China wanted to do the converse of what we're doing to them with these cheap imports, what they would do to us is to export control solar and batteries.
If we decided we want to produce 100 gigawatts of solar capacity every single year.
Okay.
Is it going to be as cheap as it is to do in China?
How much additional...
solar power capacity do you think we could be putting on that's manufactured in the U.S.
by 2028?
I guess a lot of your predictions seem to be not predictions, but more like: if we had World War II levels of motivation, if we had Manhattan Project-level intensity around doing a specific thing, how fast could we do it?
Which is like if Elon was running the government, how fast could it happen?
So maybe then we should put it like, if Elon ran the government like he ran SpaceX, as opposed to the question of like, okay, what is actually practically likely to happen given that we are not treating it with World War II level intensity?
So you're a big solar bull.
Right now, the hyperscalers are making decisions about how the data centers that they're building, which are going to be 1 gigawatt, 2 gigawatts, in Meta's case 5 gigawatts, are going to be actually powered.
And the people with actual money on the line are choosing natural gas.
And it's not like they can't see the learning rate.
I mean, they're building things which will be online in 28 or 30 or something.
The reason that these current plans are being done based on natural gas is that this is a sort of like... Well, PJM has all kinds of different sources of power, right?
It seems inefficient to have redundant power plants at every single industrial... Let me paint a grand vision for you.
It's the cheapest way for them to get power.
Okay, but big picture question.
Across different kinds of ISOs, like from Texas to Pennsylvania to whatever, people are building data centers which will not be online for many years, and they're choosing natural gas.
What's going on?
And do you have some estimate of when we'll run out of... Because we can also make more turbines, right?
It's just inherently expensive to build.
Okay.
What is the cost of, so like GE makes these 100 megawatt gas turbines, right?
The marginal variable cost of like serving it, like in electricity is like less than 10% of the actual cost of- Well, their cost of serving it is like maybe a buck per million tokens or something like that.
Okay, so then why are we going to get the solar future?
Like in 2032, we're going to have hundreds of gigawatts of extra demand for data centers.
And at that point, most of it's coming from solar.
And why is that?
There aren't enough turbines being manufactured.
But also, I think in the early 2000s, we can probably overlay the graph of how many turbines are being manufactured.
Right now, we're at a historical... They've ramped up basically to the early 2000s rate again.
But I don't know.
You've got to make more solar panels as well, right?
There will be supply elasticities for both solar and natural gas.
Is there some reason to think that
It's worse for the supply chain involved in having a natural gas powered data center than a solar?
And what is the basis of...
Like, why are we finding 43% worth of things that can be made cheaper or more efficient every single year?
That could be true of any process, right?
But no other process sees the kinds of learning rates.
That's absolutely true.
That's all they were saying.
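For reference, the "learning rate" being discussed here is the experience-curve sense of the term: cost falls by a roughly fixed fraction for every doubling of cumulative production. A minimal sketch; the ~20%-per-doubling figure often quoted for solar modules is used purely as an illustrative assumption:

```python
# Experience-curve ("learning rate") sketch: cost falls by a fixed fraction
# per doubling of cumulative production. The 20%-per-doubling figure is the
# commonly quoted ballpark for solar modules, used here as an assumption.

import math

def cost_after(cumulative, baseline_cumulative, baseline_cost, learning_rate=0.20):
    doublings = math.log2(cumulative / baseline_cumulative)
    return baseline_cost * (1 - learning_rate) ** doublings

# If cumulative production grows 8x, cost falls to (0.8)^3, about 51% of baseline.
print(cost_after(8, 1, 1.0))  # ~0.512
```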
Wait, sorry.
The rate that it's accelerating is accelerating?
Yes.
As measured in the total fraction of energy that's coming from solar?
Backing up, the story is that the reason solar is getting cheaper is because there's a lot of demand for more solar, and that demand can sustain economies of scale or whatever is going on.
Yes.
I'm going to go out on a limb here and agree with Elon Musk on this.
Then shouldn't that also be true of...
gas turbines and transformers and power stations and whatever else that's required for the non-solar future?
Because even there, we're expecting AI to drive up demand for power regardless of the source.
So to the extent the story for solar becoming cheaper over time is just that, well, demand will go off and that will drive efficiencies.
I think there was actually a similar discussion a couple of years back, or a year back, when AI people were like, no, AI is real.
This is going to happen.
And then SK Hynix, Samsung, et cetera, were like, we're not ramping up HBM production, because HBM is used largely for AI workloads, and if this demand doesn't continue, then our additional manufacturing capacity for HBM will not have been worth it.
And there was another bottleneck with CoWoS.
What happened after that?
Maybe it's worth going into the numbers, right?
So right now, 43% of U.S.
data center power consumption is from natural gas.
And basically, you think asymptotically that will be like 100% solar?
If you go to like 2040.
And also the...
the amount of use is going to increase a bunch, right?
The amount of data center use of energy will just be exponentially higher.
So the new stock matters a lot as compared to the existing stocks.
Anyway, so I want to know, 2027, what fraction is natural gas?
2030, what fraction is natural gas versus solar?
For new load or for the whole?
Let's say new load.
For new load?
2035, et cetera.
Basically, like, okay, if eventually you're right that we'll pave the earth in solar panels to sustain our quadrillions of AI souls, what is the pace of that?
I'll just use some numbers that AI 2027 used for their compute forecast, which, even if you don't buy their singularity thing, I think did a reasonably good job of crunching the numbers.
And I think they said there's on the order of 10 million H100 equivalents in the world today.
And I think they said by 2028, there'd be 100 million.
Mm-hmm.
So basically 10x more H100 equivalents in the world.
About a kilowatt each, something like that?
Yeah.
Yeah.
Right.
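Spelling out the arithmetic behind these numbers. The H100-equivalent counts are AI 2027's forecast as cited above; the roughly 1 kW per accelerator is the round number used in the conversation:

```python
# Arithmetic behind the numbers above. Counts are AI 2027's forecast as cited
# in the conversation; ~1 kW per H100 equivalent is the round number used here.

h100e_today = 10e6        # ~10 million H100 equivalents today
h100e_2028 = 100e6        # ~100 million by 2028
kw_per_h100e = 1.0        # ~1 kW each, all-in

power_today_gw = h100e_today * kw_per_h100e / 1e6   # kW -> GW
power_2028_gw = h100e_2028 * kw_per_h100e / 1e6

print(f"today: ~{power_today_gw:.0f} GW of AI compute load")   # ~10 GW
print(f"2028:  ~{power_2028_gw:.0f} GW of AI compute load")    # ~100 GW
# Which is where the "hundreds of gigawatts of extra demand" framing later in
# the conversation comes from.
```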
Going in as in like new... As in groundbreaking at that point.
But if you're groundbreaking in 27, you're probably like planning it now, right?
Oh, that's why they're calling me.
Right, okay.
So let's get into what this... If you've got a five gigawatt plant you want to build.
Yes.
Break down the numbers for me in how much land in terms of solar you need to farm this out.
And especially, I was talking to...
somebody in this space and they said, look, the big problem is not obviously cost for the cost of energy for these data centers is a small fraction of the total cost.
Most of the cost is going towards chips.
So then the issue is just can you can you make the energy available?
And they were saying even those solar panels themselves you can acquire.
The issue is getting that much contiguous land, and getting the permitting to
interconnect, or whatever the word is, is apparently a big hassle.
Yep.
Yeah.
And so they're like, well, at that point, is it actually easier than just getting on the grid?
But yeah, if you need tens of thousands of acres of solar, where can you do that and get like... Basically in Texas.
Really?
I mean, it is because energy is a small fraction of the cost.
You care more about making sure the chips are running all the time, right?
Brian Potter had a good analogy in his blog post about this where he's like, I don't know, my MacBook has a terabyte of storage and I use 100 gigabytes.
And I just got the terabyte version because it's cheap enough and I might need it at some point that it's worth it.
And so you're saying solar gets so cheap that we'll treat it the way we treat hard drive space.
Like, just get a bunch of excess.
I guess you didn't answer the question of, yes, theoretically we could do this, but is it going to be possible to get the permitting to have tens of thousands of acres of contiguous land?
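A rough back-of-envelope for that "tens of thousands of acres" figure, with all inputs stated as illustrative assumptions rather than anything from the conversation:

```python
# Back-of-envelope for the "tens of thousands of acres" figure. All inputs are
# illustrative assumptions: ~5 acres per MW of nameplate utility-scale solar,
# ~25% capacity factor, and storage round-trip losses ignored.

data_center_load_gw = 5          # continuous load to serve
capacity_factor = 0.25           # average output / nameplate, sunny-site ballpark
acres_per_mw_nameplate = 5       # rough utility-scale land use

nameplate_gw = data_center_load_gw / capacity_factor     # ~20 GW of panels
acres = nameplate_gw * 1000 * acres_per_mw_nameplate     # GW -> MW -> acres

print(f"nameplate needed: ~{nameplate_gw:.0f} GW")
print(f"land needed: ~{acres:,.0f} acres")               # ~100,000 acres
# i.e., on the order of 100,000 acres, which is why the conversation keeps
# coming back to contiguous land and permitting in places like Texas.
```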
I've played Factorio.
I remember this optimal layout of batteries and solar.
In terms of the ratios, one trend that was impressed upon me is that the power density of racks is increasing a lot.
Yes.
As the flops for GPU are increasing.
I think it was even more than that.
But yeah.
And that's- For how big of a data center?
It does sound like a lot.
Was it Hanford?
Okay.
I don't know how big that was.
Okay.
I mean, is it like, oh, this is so small.
And then you're like, oh, but it's 100,000 acres.
Austin Vernon had an interesting blog post where he said that if you have diesel generators or something, which can take over for like 10% of the generation during, you know, a winter or something,
Yeah, yeah.
then you can have a 60% reduction in the amount of solar panels you need to install because you don't need to plan for that contingency.
The issue here is, look, if Meta or Microsoft or whoever just wants to get something off the ground, this might be...
low OPEX to have this huge solar farm, but high CAPEX, where you need to hire like 30,000 people to go out in the middle of a desert and install 50,000 acres worth of solar panels.
And they're like, why would I not just buy like 50 gas turbines instead?
Right.
I don't know, my numbers might be wrong, but it's like 250 billion or something.
That's so much money.
I did the math in my head.
If you're running a frontier technology company, you know how essential it is to recruit the world's best talent.
But this requires navigating the Byzantine US immigration system.
Not only do you not have the time to deal with this yourself, you just don't have the tacit knowledge to maximize the probability of success.
But given how critical exceptional talent is, you can't take any risks with visa approval.
You need to work with the best.
Lighthouse handles everything better and faster than you possibly could, and they do it all with minimal input from you.
Lighthouse knows the nooks and crannies of the immigration machine.
They know how to frame publications and awards, or how to structure comp comparisons versus benchmarks, and how to put everything together into the most compelling story possible.
And they've even optimized the tiny details, like the tone they use when they draft support letters to help U.S.
immigration officials understand the importance of tech and startups.
Companies like Cursor, Together AI, and Physical Intelligence are all already working with Lighthouse.
You can join them by visiting lighthousehq.com slash employers.
All right, back to Casey.
Between the fact that maybe solar prices will go down and the fact that demand is going to go up, do you think electricity prices are likely to rise?
Yes.
We should protect it.
I mean, I think people will be familiar with NEPA and whatever, but like, how is it especially impacting solar?
People will point out that transmission line growth has been stuck in a rut for decades and we have all these bottlenecks in terms of substations and transformers, et cetera, et cetera.
Why will this not hamper this abundant solar future?
I think there's another blog post you wrote, though, which was also to our conversation we had, which was how to feed the AIs.
The difference is that...
I mean, it's especially helpful for solar, but like solar is the one that's most intermittent.
You can predict the amount of solar power you're going to get in three days pretty accurately because of weather prediction.
But you can't like buy more batteries.
You can't like change the amount of batteries you have.
Okay, so let's assume you're right.
And then, I mean, I think at some point you will be right.
Like maybe we disagree about, sorry, I'm not qualified to disagree.
Maybe you and some other person might disagree about what year it happens.
But I think it's hard to deny that in the asymptote, our civilization is headed towards lots of energy use for AI and a lot of that coming from solar.
Mm-hmm.
In that asymptote, I want to get the crazy nerd sci-fi, like, what does our civilization look like?
What is happening in this, you know?
Kardashev level one.
Yeah.
Let's wait till we get to turning the entire Earth into an AI factory, but more like, I don't know, the 2030s, where you've got multiple people who are building...
On the order of 5 gigawatt or 10 gigawatt sites, the value of the hardware is dependent on its complement, which is the software, right?
Like right now, AI models are fine.
And so the hardware they're running on, the economic value they can generate is sort of bottlenecked by how good the software is.
But if you actually had AGI, if you had like a human level intelligence or maybe even better.
Ideally better, yeah.
Yeah.
Running on an H100.
That H100 is worth a lot, right?
Like we're paying a lot for humans to do work.
Right now, I don't think AI is that valuable.
Like the models themselves aren't super, super valuable in terms of just pure economic value, right?
OpenAI is generating on the order of 10 billion ARR or 20 billion ARR.
That sucks.
It's terrible.
How can they sleep at night?
I know.
But for context, McDonald's and Kohl's generate more.
yearly revenue than that.
But I think the promise of AGI is to automate human labor.
Human labor generates on the order of $60 trillion of economic value, or that's how much is paid out in wages to labor around the world, right?
So that's what AGI can do.
And even if you curtail it to just white collar work, there's still tens of trillions of dollars of value.
So once we have models which are actually human-level, they will be worth at least that, provided that you can build them.
Yeah, that's interesting.
It's also really interesting. Something that came up, and I want to credit James Bradbury and Gwern with making this point when I was talking with them a couple of days ago, is this:
If you measure it by GDP, AI's outputs might be underwhelming, right?
One of the complaints that economists have about the internet is that it's hard to measure the consumer surplus that's created by the internet because a lot of the goods and services that are made available
You pay zero for them, and so they don't show up in GDP.
Well, it's the same with oil.
Yeah, in the sense that energy is only like 1% of GDP.
And also, its fraction of GDP also doesn't correspond to how important it is.
For example, oil is like 1% of GDP or something.
But if you don't have oil, then you have these oil shocks, which cause double-digit decreases in GDP.
So the elasticity of demand often matters more than its raw fraction contribution to GDP.
But anyways, on the original point about AI, so...
you're going to have this huge deflation of... So Gwern put it this way.
He's like, if you imagine Dario's data center of geniuses, how is that showing up in GDP?
Well, it would be the inputs, which are the chips, the energy, et cetera, and the outputs, which are just the tokens.
And neither of those is going to be that astronomical in comparison to the value that data center of geniuses is producing.
So in terms of GDP numbers, like...
if that data center of geniuses automates a bunch of, or at least complements a bunch of human work, et cetera, it might actually cause like a nominal decrease in GDP while at the same time contributing massively to what we might think of as the valuable stuff humanity or human civilization can produce.
And so in the long run, it might make more sense to think of the size of our economy or the size of our civilization as the raw energy use
that we do rather than GDP.
Because again, GDP will see this huge deflation because the variable cost of running AI will just be pretty cheap as compared to like paying humans wages, et cetera.
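Just to make the deflation point concrete, here is a purely illustrative toy calculation; all the dollar figures are made up for the sake of the example and aren't from the conversation.

```python
# Toy illustration of the "GDP deflation" point, using made-up numbers.
# Suppose a task used to be done by a human for $100 in wages, and an AI
# can now do the same task for $1 of tokens (chips plus energy).

tasks_per_year = 1_000_000

human_wage_per_task = 100.0   # made-up figure
ai_cost_per_task = 1.0        # made-up figure

gdp_before = tasks_per_year * human_wage_per_task   # $100M of measured output
gdp_after = tasks_per_year * ai_cost_per_task       # $1M of measured output

# The same real work gets done, but the measured (nominal) contribution
# shrinks by ~99%, because GDP counts transactions at market prices.
print(f"Measured contribution before: ${gdp_before:,.0f}")
print(f"Measured contribution after:  ${gdp_after:,.0f}")
print(f"Drop in measured contribution: {1 - gdp_after / gdp_before:.0%} (real output unchanged)")
```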
an AI doing my job and also a human doing my job... I love how this is the new way we use the phrase mixed economy.
Yeah, exactly.
If we think that the value of cognition is going to be unbounded, then to the extent you think solar will eventually win, you can just...
You can derive it from how much land it takes to power an H100 using solar panels.
That is a very interesting derivation: okay, well, at a minimum, we're going to just fill up all the land.
I mean, at some point you might have like declining marginal value of cognition or something, but... We kind of discussed this earlier, but if you have like a megawatt of...
At current hardware efficiencies...
I don't know if it's worth spelling out.
Basically, an H100 has roughly the same number of flops as a human brain, but also uses way more energy than a human brain.
Yes.
It uses like 50x more energy.
So 20 watts versus 1,000 watts.
Human brain.
We know hardware can be at least as efficient as the human brain.
And the human brain can generate this many flops on 20 watts.
So if you do that calculation, then that's 50x 1,000.
So 50,000 AI souls off of
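For what it's worth, here is a minimal back-of-the-envelope sketch of that arithmetic, plus the land-per-H100 derivation mentioned a moment ago. The 20-watt and 1,000-watt figures come from the conversation; the solar numbers (peak insolation, panel efficiency, capacity factor) are rough assumptions of mine, not figures from the guests.

```python
# Back-of-the-envelope sketch; the solar constants below are assumptions.

WATTS_PER_HUMAN_BRAIN = 20         # figure from the conversation
WATTS_PER_H100 = 1_000             # figure from the conversation (roughly)
MEGAWATT = 1_000_000               # watts

# Brain-equivalents ("AI souls") per megawatt, if hardware ever matched
# the brain's energy efficiency at the same flops:
souls_per_mw = MEGAWATT / WATTS_PER_HUMAN_BRAIN            # 50,000
efficiency_gap = WATTS_PER_H100 / WATTS_PER_HUMAN_BRAIN    # 50x

# Rough land needed to power one H100 with solar (assumed constants):
PEAK_INSOLATION_W_PER_M2 = 1_000   # assumption: ~1 kW/m^2 at peak
PANEL_EFFICIENCY = 0.20            # assumption
CAPACITY_FACTOR = 0.20             # assumption: averages over night and weather
avg_solar_w_per_m2 = PEAK_INSOLATION_W_PER_M2 * PANEL_EFFICIENCY * CAPACITY_FACTOR
land_m2_per_h100 = WATTS_PER_H100 / avg_solar_w_per_m2     # ~25 m^2

print(f"Brain-equivalents per MW at brain efficiency: {souls_per_mw:,.0f}")
print(f"H100 vs. brain energy gap: {efficiency_gap:.0f}x")
print(f"Approx. land to power one H100 with solar: {land_m2_per_h100:.0f} m^2")
```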
Let's actually go back to the original point of like, I was explaining why I think it's plausible that there could be more than hundreds of gigawatts of extra demand from AI in the 2030s.
I want to understand what that looks like in the real world.
Like at that point it has basically become this industrial problem of: can you manufacture enough solar panels and solar modules and batteries, not to mention the chips themselves?
Let's start with the industrial point.
I want to know what the year 2035 looks like if we've got AGI and we're just bottlenecked by the ability to deploy it.
This is a great prompt for a sci-fi exercise, because especially in space, you don't need batteries, so you're just like...
The prompt is the future TSMC just manufactures integrated solar dies.
And they can fly around, right?
Is this what the Dyson swarm will be, Casey?
Is this going to be compute at the center of solar cells?
We will eventually be- A solar sail with a silicon die in the middle for compute?
Yeah, that's assuming a little bit of software improvement, but I don't think that's- All that's assuming is software improvement.
The Dyson sphere, that's all it needs, a little bit of software, a little bit of tweaking of the algorithms.
Any thoughts?
Like from dirt as you mean like actual dirt?
I mean, the reason I think this is interesting is because whenever people who are talking about the AI singularity, often their expertise is not in energy or physics or whatever.
So they focus only on the cognitive elements of the singularity, which is like how much faster can we make AI smarter, etc.
I think this is really interesting because...
If we have unbounded cognition, which sets up both the ability to supply and to demand more energy, I'm very curious, like, what does the energy singularity look like?
Where, you know, we're just trying to saturate as much energy that the Earth receives and turn it into cognition.
That is an interesting concept that for 4 billion years, we've been increasing the sort of like variance and complexity of...
And also, the CEO is not having an affair.
I think that was a more crucial issue, Casey.
Casey, thank you so much for coming on the podcast.
Oh, thank you.
It was fun.
Yeah.
Thank you for tuning in.
I'll see you on the next one.
Today, I'm chatting with Lewis Bollard, who is the Farm Animal Welfare Program Director at Open Philanthropy.
And Open Philanthropy is the biggest charity in this animal welfare space.
So, Louis, thanks so much for coming on the podcast.
Thanks for having me on.
Okay, first question.
At some point, we'll have AGI.
How do you just think about the problem you're trying to solve?
Are you trying to make conditions more tolerable for the next 10 years until AI solves this problem for us?
Or is there some reason to think that...
the interventions we're making in terms of improvements like in-ovo sexing or cage-free eggs, et cetera, will have an impact beyond this transformative moment.
Maybe an intuition pump here is we've been spending on the order of hundreds of billions of dollars a year in order to replicate human intelligence.
And human intelligence has been developed... I don't know.
It depends on when you start counting from, when intelligence started evolving.
But for something on the order of tens to hundreds of millions of years, evolution has been trying to optimize for this intelligence thing.
And...
We've had to spend all this effort in order to replicate it.
Converting calories into meat has been something that evolution has been optimizing for billions of years, right?
So everything from the immune system to growth factors to delivering nutrition, et cetera, texture, whatever.
This is like, this is what evolution is working on the entire time.
So it makes sense why this is actually such a tough problem.
Are you ready to throw some cold water on your friends?
Yeah.
How far away is cultivated meat actually?
But I mean, eventually we'll have like nanotech or whatever, right?
At that point, raising chickens can't be the thing to do.
So the reason I think this is a very interesting example is because whenever people think about the use of technology to improve animal welfare, they're thinking about cultivated meat, lab meats.
They're thinking about these extremely far off solutions.
And then it makes sense why even people who are especially concerned about the space, the first thought is not to just find ways to make the existing regime more tolerable, but to come up with some moonshot that changes the whole paradigm.
If you look at how much VC investment is going towards cultivated meat, I don't know if you know, but...
Probably on the... yeah, do you have some sense of how much goes into that per year, versus how much VC investment goes into: okay, we've already got the farms.
What do we need to do to come up with more incremental improvements?
Like let's put the eggs through MRIs.
Let's do these other small improvements in welfare.
Right.
Which has probably been motivated...
at least partly, by the sense of we're going to make things more ethical.
And people might not realize that in the near term, to actually make things more ethical, it might be just better to increase that 10 million pool.
I think it's good to do both.
So whenever a discussion like this comes up, it's often phrased in the context of personal behavior.
I think people will be assuming that what we're going to get up to is this push to make you vegetarian.
And I happen to have been vegetarian.
I grew up a Hindu, and so I've never eaten meat.
And then I just stayed a vegetarian after I was no longer a Hindu.
But then I started prepping to interview you, and I'm like, fuck, this might—
I don't know how valuable this is, especially if you look at some of these online charity evaluators and you're just like, a dollar of your donation will offset this much meat eating.
And you're like, what are we doing here?
But anyways, vegetarianism overrated.
So how did we end up in this position where so much, I think when people think about animal welfare, they think PETA, they think of like protests which are encouraging individuals to give up meat consumption.
At the same time, these charities which are so effective at corporate or policy change are just like so neglected.
How did this end up being the landscape of animal welfare activism and funding?
Okay, so this is why I really wanted to do this episode, which is I think people will be aware that there's a general problem here, but the actual politics and the actual economics, the actual state of the technology landscape here, there might be interventions which are stupendously effective, which would overlook just because people are not aware of what's actually happening in this space.
So on that point, to use an analogy from global health and poverty,
The Against Malaria Foundation estimates that it saved on the order of 180,000 lives or something, which is a lot.
But then you compare it to China liberalizing, which brought a billion people out of poverty.
That's just many orders of magnitude bigger in impact.
In animal welfare, do you have some like big take about what the China liberalizing equivalent in this space is?
Okay, I want to go into in-ovo sexing.
Yeah, just the fact that you can have a new technology and you can have basically Pareto improvements, where things aren't getting more expensive.
Maybe in the future, they'll actually get cheaper because of their technology.
At the same time, you're having improvements in animal welfare.
The problem, of course, with this industry has been that in the past, increases in efficiency have been coupled with increases in cruelty.
So I want to understand whenever the trend goes in the opposite direction, what causes that to be the case?
So what is the history of this technology?
How does it work?
Why did it take so long for it to come into common practice?
And how much was this driven by policy versus the tech being mature enough for it to be economical?
I'd be curious to understand exactly why it took this long for this to be economical, because MRIs have existed for a while and PCR has existed for a while.
What the nature of that cost curve was?
And I'm especially interested to understand this because
It seems to imply that, look, I mean, we didn't have to come up with some brand new tech in order to enable this.
So are there other things where somebody who is somewhat familiar with the technological landscape, people are always looking for startup ideas, right?
Should they just spend a couple of days at a big poultry farm or pig farm or something and see if things can't be improved?
Yeah.
I think one important dynamic to this industry that you pointed out is that whenever we have to really optimize for efficiency in one domain, it causes all kinds of other problems that we have to then make up for with even more cruelty.
Could we make...
chickens or pigs with no brains, right?
Because it's the suffering we care about.
So to the extent that their bodies are just these incredibly well-evolved bioreactors for converting grain into meat, whereas optimization has led to more and more cruelty in the past, in this case, this is the ultimate optimization, right?
They're not moving around at all.
They are literally just a machine for producing more meat.
Yeah, and then the suffering is in some sense inefficient, right?
Like it causes them to, if they're pecking at other animals, if they're getting caught in wires, et cetera, this is something that, like, it would be better even economically to eliminate.
Why not think that they will just be eaten up in terms of their welfare impact?
To the extent that the economics in the industry for a century have been cram more things in, you know, figure out how to optimize along axes which just make the animal incredibly unhealthy and immiserated for longer and in more extreme ways.
Like, okay, we'll come up with in-ovo sexing, but then there will be another thing which is the equivalent of gestation crates.
Why think that...
Even technologically, the thing that is favored is the suffering-free optimizations.
So potentially we could find ways to make animals even bigger with the future forms of biological progress that some of my guests talk about.
It's already the case that it's better to eat beef than chicken because cows just have so much more meat per brain.
What if we just knocked out the myostatin genes or whatever, and then now there's even more meat per cow?
Is that better because you have more meat per cow, or is it worse because it's potentially going to lead to the same dynamic of these overgrown, more suffering animals?
Which way does that tilt?
But the consumption might have gone up regardless.
So actually then it's not clear.
They would have to be, to the extent that we hold consumption constant, maybe we shouldn't, they would have to be suffering 4x as much as a chicken in the 1950s for it to not be a net improvement.
I don't know if you disagree with that.
It's just striking me now that the way to think about what we're doing to these animals is not even, and this would already be just incredibly immoral, finding creatures in the wild and then caging them up and putting them through awful tortures.
Rather, we are manufacturing creatures
basically optimized for suffering, right?
It's not even that like, we found this chicken and now we're going to put this in this like little cage.
It's like, we have designed this chicken to basically suffer as much as possible.
We have like literally genetically changed it as much as we can plausibly change it given the technology available to us today in like this, in this Frankensteinian way to suffer as much as possible.
I don't know.
That framing just makes it like,
Yeah, especially gruesome.
In just reading about the accounts of, for example, pigs in gestation crates and the medical symptoms, you know, like swollen ankles, broken bones, the bruises from chewing the iron bars, ulcers, tumors, cancers, pus, et cetera,
these not being rare medical emergencies, but the regular, expected outcomes across populations of pigs, which individual farms will house like...
thousands of them, and of course, around the world, a billion.
I'm sure you've visited many of these places yourself or have friends who've done so, right?
Right.
And this is an issue where scope sensitivity is just... the magnitude is so insane, right?
If this one –
battery cage farm was the only thing that existed in the world, right?
There was like this one farm in India that had 100,000 chickens, which were each just experiencing weeks upon weeks of pain through their life.
That would already be a moral emergency, but it's so easy to forget that if there are 10 billion chickens alive at any point in the world, the whole problem is five orders of magnitude bigger than this one farm itself, right?
So 100,000 times bigger than this one farm.
It's just like stupendous to comprehend the scale of the problem.
It's crazy.
Right.
Yeah.
OK.
So the positive spin on that can be that because of how big the problem is and how neglected it is, the ability of any one person to have a big impact
might genuinely shock them.
So let's get into that.
You are the biggest funder in this space, but cumulatively between you and the others, what is the amount of smart money that is being allocated to this problem?
What would happen if the amount of funding in the space doubled from the 200, 300 million you mentioned that is being spent smartly?
I know you will say there's a bunch of things we could optimize around, right?
There's so many neglected issues.
But is there an immediate thing where you're like, this is the thing that is directly at the margin?
The next 100 million or the next 10 million would enable this?
I loved this specific example of like there's a super tractable thing that is like immediately available with the next millions of dollars in funding.
Is there a particular charity which works on these campaigns in particular?
I think people just might not be aware of the ratio of dollars to suffering averted in this space.
Yeah, if you can give some sense of what we're talking about, dollar to suffering here.
So you're saying $1 can do more than 10 years of a better, more humane life.
That is stupendous, right?
A couple hours of pain is just awful and terrible.
You're saying 10 years for a dollar.
The reason why that's so shocking is that
On its face, it's shocking.
But in other areas where you're trying to do global health or something, first, the problem is improving on its own.
Second, the Gates Foundation, et cetera, there's tens of billions of dollars already being poured into the problem.
Same with climate change, et cetera.
So the idea that you would find an intervention that – like a single dollar can go this far is just –
It is just genuinely crazy.
So I won't bury the lead any longer.
I've always been interested in this issue.
I lost track of it for a little while, to be honest, but I encountered you on Twitter and I started learning more about the issue.
We chatted a few times in person and that motivated me to have you on the podcast.
And also to donate myself.
So as you mentioned, Farmkind Giving is this re-granter.
They don't keep any of the money themselves.
They just re-grant it to the most effective charities in this area.
They're basically like an index fund across the most effective charities in animal welfare.
And it motivated me to donate to them.
So I'm giving $250,000 and I'm doing this as a donation match.
So this is to say that you, the listener, if you contribute to this donation match, we can double each other's impacts.
And between the two of us, we can allocate $500,000, if we saturate this, and I really wanted to saturate this, $500,000 to the most effective charities in this area.
And remember how neglected this area is.
Lewis, as you were just mentioning,
One dollar that is donated in this area corresponds to 10 years of animal suffering that is averted, which is just stupendous to think about.
There's no other cause area in the world which has such a crazy ratio.
And that has to do with how neglected this area is.
And of course, the positive connotation of that neglectedness is just how big an impact
any person listening to this podcast can have.
So that's a donation match.
And the way you can contribute to it is to go to farmkind.giving slash dwarkesh.
Now, I also recognize that there's people in the audience who can do much more than this amount.
And given how neglected this issue is, right?
Like, remember, there's only the order of 100 million or 200 million that are being spent wisely on this topic.
One such person listening could double the amount of money that is being spent effectively in this area.
That's crazy to think about, right?
And if you are one such person, just think about that.
And even if you can't double the amount of money that's being spent in the area, you could cause a double digit increase in the amount of funding that these effective causes are receiving.
So for those people in a position to contribute much more, or at least want to get their foot in the door and explore contributions of 50K or higher, Lewis, what's the best way they can reach you?
Okay, but if you're like the rest of us and you need to start off on a more humble basis, I think your donation would already just have a huge impact given how neglected the space is.
So again, the link is farmkind.giving slash dwarkesh.
Okay.
So let's talk about other countries because you are not only the biggest funder in this cause area in the United States, but globally.
And then obviously an animal suffering in Sri Lanka or China is just as bad as an animal suffering here.
So what is especially promising, especially given that more people in these countries will start eating meat and
This problem is getting worse over time.
It's getting worse because people are getting wealthier and eating more meat.
What seems like the most useful intervention or the useful thing to understand about
What to do about that?
But factory farming spread because it was cheaper, not because there was some law passed that everybody else felt the need to copy.
That's right.
That's right.
On net, is there a Kuznets curve here where initially they get wealthier, wealthy enough to afford the most economical forms of meat, which are battery cages, etc.
And then they get even wealthier so that they can afford the potentially slightly more expensive versions of meat, which are more humane.
Or on net, it's just like...
you keep eating more meat through this whole process.
So even if it gets slightly more ethical, the amount of meat consumption will have 2x'd or 3x'd.
So wealth always correlates with more suffering, basically.
So a difficulty that these animal welfare policies have had is...
Even if you outlaw a practice domestically, to the extent that it's cheaper to produce meat that way, people will just import meat produced that way that is made elsewhere.
And so states in the U.S.
who have tried to do this have had this problem.
Countries in Europe that have tried to do this have had this problem.
How do you solve the lowest common denominator problem in animal welfare standards?
Right.
And potentially reversed by...
Okay.
So if these advocates are able to pass these laws or ballots at the state level, and it's popular enough that they're passing, why is it at the national level they can't make a ruckus about this and prevent this from getting added to the full farm bill?
But shouldn't there be...
some political constituency that's formed by the pork producers who are using more ethical standards and who are themselves being undercut by these Iowa farmers, why aren't they getting flown out to these congressional hearings?
That's exactly right.
But the meat lobbyist also, given that it's a commodity business, you would think that there wouldn't be that much surplus that they can dedicate to political lobbying.
Yeah.
So everybody here is like not doused in cash.
We can't subsidize a couple plane tickets for these family farmers?
Like what's going on?
That might be good for animal welfare in the sense that if they can extract greater surplus, it makes it more possible for them to potentially invest in animal welfare.
Not that they're necessarily doing it, but it would make it possible.
Completely makes it possible.
That's right.
But is it on the order of hundreds of millions, billions?
So if the majority of House members have written this letter apparently saying that this should be taken out of the Farm Bill, why is it still, like, they're the people who are going to vote on this, right?
So why is it still going to pass?
Okay.
I want your guide on how to corrupt the political process in the opposite direction.
What insights do you have on how to actually have an impact on how congresspeople or state legislatures vote?
Yeah.
Right.
That's good.
Yeah.
Ease the foot in the mouth I caused by saying the word corrupt.
That's right.
And what would it actually take to... I don't know, I hear this and I'm like, it's not clear what exactly you would do if you wanted to...
Get this message in front of, abstractly you can give money or whatever, but how does that actually transfer to political influence?
You wrote in one of your recent blog posts that the meat lobby spends on the order of $45 million in any given election cycle.
And they seem to be able to have influence on the topics they care about, which would be astounding and make jealous all of us in tech.
There are probably people listening to this podcast who could spend on the order of that kind of money on politics.
But the ability of tech to have an impact on the kinds of issues that they care about is quite minuscule compared to the meat industry.
So what's going on here?
What's the political economy?
of meat here.
You're telling me that tech bros aren't as politically sympathetic as the salt-of-the-earth farmer?
The front end developers have yet to grace the covers.
That's right.
So you should describe the sort of franchise hierarchy type structure of a lot of these meat companies.
But you would anticipate that, yes, the Perdues and Tysons of the world would want a particular thing to happen in terms of political processes.
But the farmers who are indebted to these companies often have an adversarial relationship.
Why are they able to form an effective political coalition with them?
And what is the reason that these contract farmers are willing to work with these large businesses?
Because people will often say things like, oh, Uber is bad for Uber drivers.
And I'm just like, I trust Uber drivers to know what's best for them.
Why would these small farmers be working with these companies in the first place if it's uneconomical for them?
And what is the alternative use of that land?
So if you didn't work with some centralized processor, is the alternative use of that land for farming?
If you've inherited some land and you want to figure out what to do with it, what can you do with it?
Why is that?
Because there must be enough consumers who, even if it's not a majority of consumers, there must be enough that there's some economic incentive to set up the economies of scale and supply chains that would make it easier to set up such a farm, right?
So why doesn't that exist?
Yeah.
I guess I was wondering, so if you... There's, like, normal bananas and there's organic bananas, and people are willing to pay quite a bit more for organic bananas.
I feel like pasture-raised should be in a similar embedding space as, like, organic, where, like...
Organics is a huge industry, even though it has dubious medical benefits, etc.
So then the problem is not that if there was accurate labeling, you think there might be consumer demand to make this a viable, much larger industry, just that it's very hard for consumers to identify which is which.
Okay.
So this is one thing I wanted to ask you about.
One point you've often made is you have to understand that meat and agriculture generally is a commodity business.
In a commodity business, you'd expect all margins to be computed away.
I think it's in one of your blog posts that for a dozen eggs, it costs 19 cents more to have them be cage-free.
But
often chains will charge on the order of $1.70 more for cage-free eggs.
So if it's a commodity business, why is it possible for supermarkets to extract this extra margin?
Wow.
Interesting.
If these companies are already making these commitments, in many cases following through on them, to move towards more ethical ways of procuring meat, procuring eggs, etc.
I think I learned from you that McDonald's has made these commitments or that Chipotle has made these commitments.
I didn't learn it from McDonald's.
What is the reason that this is not a more prominent part of their own advertising, given how much consumers...
how universally popular animal welfare is.
So the very best companies are advertising this.
It seems like given how fast you're able to secure these commitments from different corporations, from retailers to restaurants, et cetera, it seems like corporate campaigns are even more successful than policy.
It's like corporations are much more receptive.
I mean, there's obviously, I don't know, Perdue and Tyson are corporations as well, but...
The rest of the actual industry of getting food to consumers just seems incredibly receptive to these kinds of pressure campaigns.
And maybe that's a lever of change that's especially salient?
Wait, so what is the reason that the animal welfare movement has gotten so wrapped up with sustainability?
You go to most landing pages for animal welfare stuff, and it'll be like, we're improving animal lives, and we're making farming more sustainable.
We're addressing climate change.
And that just seems really strange to me, like, OK, we're torturing tens of billions of animals a year, but then also we're reducing emissions.
Like, we'll figure out some other way to reduce emissions, right?
Like, how did this become the same issue in the first place?
But then why do animal charities... It's not like just a cynical attempt by the meat industry.
If you go to animal charity websites, they'll often also emphasize sustainability on their landing page.
And I understand other people's psychologies are different, so I don't want to project the way I think about it on...
At least whenever I see that, I'm like, oh wait, are you actually optimizing for the thing that makes this a really salient issue for me?
Or are you just going to optimize for carbon footprint rather than this incredible amount of suffering that this industry produces?
So yeah, why are they doing this?
Why have your friends roped in sustainability into this area?
Lewis, thank you so much for coming on the podcast.
And also, thank you for the work you do.
You are allocating the largest amount of philanthropic funding in this space.
And I'm sure, look, you're a cheery fellow, but I'm sure day in and day out, this is not pleasant work to do to learn about these gruesome details and how we can make the situation better.
But it's awesome that you're doing it.
So thank you for coming on and thank you for your work.
Cool.
YouTube Shorts have been a really great tool for us for bringing new viewers to our main interviews.
For example, if a guest is talking about a historical event, we'll take a snippet of the conversation and then just use the real archival footage to bring it to life.
But when they dive into hypothetical situations, that can be hard to visualize.
For example, I recently interviewed Stephen Kotkin about Joseph Stalin, and Kotkin started this hypothetical dialogue between two Soviet-era co-conspirators.
We wanted to feel like you were really there in the room while this was happening.
So we generated a picture of a Soviet official and we prompted Google's Veo 3 to have him say Kotkin's line.
This Stalin guy.
He's wrecking everything.
This was literally a one shot and it handled everything perfectly, including the pause.
So we edited the video in and lip-synced it to Kotkin's line.
Try Veo 3 on the Gemini app with Google's AI Pro plan, or get the highest level of access with the Ultra plan.
Sign up at gemini.google.
Okay, back to Sarah.
I loved this lecture, and this topic especially, because it's about a conflict that many of us weren't educated on, but it adds context to the other ones, which are obviously super famous, like World War II, and to the development of these powers, which are the most powerful in Asia.
So I'm not going to keep wasting time.
I just want to jump into asking you a bunch of questions about this.
Okay, one way you can explain Russia's loss here is that Japan had better tactics.
And so given the fact that they had deployed similar amounts of men to the front, Japan with their better tactics was able to win in these different battles.
But maybe the more important thing to explain is why Russia didn't deploy more men or didn't deploy its greater resources.
Russia's population is over 130 million people at this point.
Japan's is 47 million or so.
Well, you might say it's because the Trans-Siberian Railway isn't completed by this point.
But in 1905, it actually is complete.
And so Russia...
could have continued the war.
It has all these fresh troops it could send.
So then the question is, why didn't they have the same will to, or the same ability to muster resources, their massive resources that Japan brought to bear?
So I guess you're saying Russia doesn't care about this conflict as much as... Russians.
So then he doesn't have the state capacity to mobilize his army as much?
Oh, that's the whole problem with Russia.
I want to understand World War I, they are able to mobilize many more men.
Obviously, there was incompetence in World War I as well, but the sheer amount of resources that the Tsar was able to bring to bear in World War I was never brought to the Far East during this conflict.
Once the Trans-Siberian Railway is finished in 1905,
It's not completed until World War I.
I'm excited to announce my new sponsor, Hudson River Trading.
They're a quant firm that accounts for around 15% of all U.S.
equities trading volume.
And almost all of this trading goes through their deep learning models.
But predicting market movements is just insanely hard.
You have to deal with imperfect information, all kinds of heterogeneous data with minuscule signal-to-noise ratios and impossibly tight latency constraints.
And if this weren't enough, you're in this arms race against some of the smartest people in the world who are continuously training their models against the data that you output.
Because of this, HRT has built their internal AI team like a frontier lab.
They've aggregated trillions of tokens of market data across every asset class and invested in a massive cluster of B200s.
And they've built up an exceptional team of researchers to do fundamental ML research.
Honestly, in terms of AI, the big differences between them and a frontier lab is that, well, one, HRT is very profitable, but two, their engineers and researchers not only have to make their models performant at scale, but also have to battle test them in this insanely adversarial environment.
If you want to join them, you can learn more at hudsonrivertrading.com.
Okay, back to Sarah.
I want to double-click on what part of the Meiji Restoration exactly helped Japan mobilize for this war.
Because, okay, the population is literate.
You have institutions for having political say, but like how exactly does that help you have more people go to the front?
One really interesting takeaway from this lecture, and I'm curious if you agree with this, is when we think about this period in history, we often think of Japan as the rising power in Asia.
Yeah.
But it seems like the takeaway here is actually Russia was on the path to be the dominant power here.
Their population is so much bigger.
If the efforts to modernize worked, they would just have a 4x bigger population.
But then they could have had the modern industry Japan has.
And so were it not, later on, obviously, for the Bolshevik Revolution and the Civil War and collectivization, Russia would be dominant.
And Japan seeing this acts early to... So I guess I didn't put it in those terms before I heard this lecture.
Would you agree with that?
Let me take another different stab at your thesis.
Go for it.
So it's not Japan's strategic brilliance, which mainly explains why they won the war, but rather the fact that Russia could have kept going even after the Baltic fleet is sunk, even after they've lost in Mukden.
But what changes is in 1905, you have this massive peasant revolution.
You have mass strikes in the cities.
You have Bloody Sunday in January.
So the Tsar's government...
almost collapses to the point that he has to at least give the perception of instituting major reforms.
This is maybe a tangent, but when I had Stephen Kotkin, the Stalin biographer, on... Oh, you might know more about it.
Japan can't claim credit for the fact that there were these long-standing issues in Russia, which almost brought down the government in 1905.
This was a lucky break that they got, which forced Russia and the Tsar to...
reallocate their efforts from the fight.
And to be clear, I'm not sure if that's... I don't want to put words in Kotkin's mouth, so I'll just put them in my mouth.
I won't blame him for any errors I make.
Okay, another thing that I think is super interesting in this period is it potentially helps explain why Japan thought that Pearl Harbor might work.
Because superficially, these two situations seem similar.
In both cases, you begin with...
This imminent threat that an adversary of yours is going to gain massive military leverage over you.
So in the 1904-1905 war with Russia, it's that the Trans-Siberian Railroad is about to be finished.
In World War II, it's the worry that after Japan invades French Indochina, America does an oil embargo on Japan that gives them one to two years of oil runway before their entire empire collapses and grinds to a halt without oil.
So there's this worry that they have to act now, otherwise they're about to lose their leverage.
And then there's this initial surprise attack, so Port Arthur in 1904, and then obviously Pearl Harbor.
And even if you look at other battles, like Midway and Mukden, we'll have this sort of annihilating battle where we will expend our blood and just push them back.
Curious to get your take on this interpretation.
Yeah.
But this is also another very important point.
So the fact that the civilian leader of the Meiji generation dies before the military leader reinforces the contingency of history.
It didn't have to be that he died first, but then that sort of changes potentially the whole trajectory of history.
But going back to the question of, okay, what exactly was...
Different between the Russo-Japanese War and World War II, what was Japan's miscalculation?
One of them has to be that in the case of Russia, they attacked an adversary that was, it took one year of conflict to push them to the point of significant destabilization of the government.
If you thought the same thing would happen with America, we're going to attack Pearl Harbor and there's going to be all kinds of internal conflict.
And they just misjudged the internal level of cohesion and functionality of American institutions, right?
They thought the same thing would happen in Russia.
I want to go back to the Meiji reforms.
So the Iwakura mission, which is sent to the West to understand what Bismarck did right.
I think it's funny because Sergei Witte, who's the finance minister from 1892 to 1902, models himself as Russia's Bismarck.
He wants to do the kind of industrialization that Germany went through in Russia.
And yet the Meiji generation succeeds in Japan in doing this kind of thing.
In both cases, there's an authoritarian government that wants to have this kind of top-down reform.
Why does it work in Japan but not in Russia?
Okay, so this is a quote from Sun Yat-sen, who was the first and short-lived leader of the Chinese Republic in 1912.
He says, quote, "...we regarded the Russian defeat by Japan as a defeat of the West by the East.
We regarded the Japanese victory as our own victory."
Now, we know that just 10 years prior, China and Japan had fought a war against each other.
We know that in 20 years, there's going to be a brutal invasion of China by Japan where millions of Chinese will die.
So help me understand why during this period, at least the Chinese revolutionaries and reformers, it seems like Japan is the power to emulate.
Yeah.
Something that wasn't emphasized in the lecture, but seems quite important to explaining why Japan was able to be so dominant in this period, despite the fact that it has a much smaller population than China:
So China this time is 400, 450 million people.
Japan is under 50 million.
What's going on in Japan with these major reforms?
What specifically are they enabling?
Yeah.
And if you look at the fraction of GDP that China is spending on defense, the country sort of imploded.
So it's like less than 1%.
Russia is also not spending nearly as much of its output on defense.
Japan in the preceding 10 years is spending 5% of output on defense.
And in 1904, spending 10% of output on defense.
And so that ability to marshal resources seems very important to understand.
It's a smaller place, but you can marshal the whole country's resources towards this defense effort.
Talk about the long-run institutional effects of history, what you just mentioned about the provincial governments controlling a large share of Chinese resources.
You go from that period, so under the Qing Dynasty, then you have the Republic, then you have the Warlord Era, then you have the Japanese in control of parts of the country, and then you have the nationalists and then the communists.
Yeah.
But even after five different regime changes, this dynamic persists in China. Today in the U.S., roughly 50% of government spending is done at the federal level and 50% at the state level.
And in China, it's 85% at the local level and 15% at the national level.
And a big problem recently has been the fact that the local government has financed a lot of construction that wasn't economically valuable.
But you can go back more than 100 years to this period, and the same problem of provincial governance still comes up again.
So...
Part of the reason that Russia loses is that this is not the main priority for them.
But if that's the case, before the war, Japan proposes an agreement which would allow Russia to maintain their sphere of control over Manchuria and Japan to maintain their sphere of control over Korea, and then Russia refuses this.
Right.
And that's why Japan decides to do the surprise attack.
Yeah.
Obviously, after the war, Russia is in a much more disadvantageous position.
But do we understand why they were not game to consider this before the war?
It is a pretty cool story, though.
I almost got assassinated by a samurai and here's my scar.
I want to better understand the...
degree to which the level of technology in a country matters for its military capabilities because at this time
countries are just buying armaments from more advanced countries.
So China in the 1894 war, China has bought two German battleships, many other vessels.
Japan has no battleships.
And then in the 1904, 1905 war with Russia, both Japan and Russia have British ships because of the alliance that Japan had with Britain.
And then Russia also has French ships.
So if they're able to procure the most advanced weaponry from the West...
Does it matter what their level of technological development locally is?
Speaking of winning, after the war, as you mentioned, Theodore Roosevelt mediates the peace treaty between Russia and Japan, and he wins the Nobel Peace Prize for this.
And the Treaty of Portsmouth is heavily criticized in Japan in the following years for not giving them the concessions they feel are warranted as a result of their victory.
I don't think there's any reparations.
There's many territorial disputes that don't end up landing in Japan's corner.
So potentially a lesson to learn here is that you should...
if you give too relaxed peace terms, then the person who wins will feel aggrieved and they'll go at it again.
Although the other lesson, you know, if you have a Treaty of Versailles, it seems that you should learn the opposite lesson: that if you have too aggressive peace terms, then the person who lost will come back and have this sense of being aggrieved.
And this is a general problem in history.
It feels like there's an equal and opposite lesson
Is there a lender of last resort today?
Maybe this is a subtle but recurring disagreement between us that has come up in different Q&As.
But I do want to find the crux.
Is that...
It seems like the quality of institutions is central in your thinking of how well a country will do or how central it will be in the future.
And for me, it's much more just like how big is your economy?
How big is your population?
And so let me just finish the thought here.
If you think that the center of, I don't know, the Western alliance shifts to Europe, maybe you think like, oh, their institutions are good and that's why it'll happen.
But it seems to me America is just such a big and powerful country economically that it would be difficult for Europe to displace us, even if we make these mistakes in terms of our alliances and diplomacy.
Its economy sucks.
I think if its economy was better, I think it might be.
That's a result of mistakes in domestic policy with Peronism, not with its mistakes in foreign engagement.
I agree that domestic policies matter a lot.
Agreed.
So this is great because then we're identifying the real crux.
So maybe the crux is we both agree that institutions matter.
I think they really matter because they shape the trajectory of economic growth.
Okay.
But I think like the sort of like effects on, right.
But so I think Europe's institutions are still bad because the rate of economic growth in Europe is quite bad.
Whereas you might think they're better than I think they are because they're like good at diplomacy.
Yeah.
Are you game for some audience questions?
Okay, Anonymous asks...
What is a good grand strategy for minor power that wants the status quo to continue?
For example, President of South Korea probably does not want China and the U.S.
fighting over Taiwan.
Denny Tremkev asks, what made Japan a more fertile ground for Western ideas than China?
Was China just too big for meaningful institutional change in the same time scale?
This leads in very nicely to the next question.
Lydia Dean asks, in the spirit of the Meiji delegates being sent out to explicitly study the West, where would the United States do well to send delegates today?
This is actually a very interesting take you gave to me offline once, which was that the WikiLeaks
leaks of the State Department documents, you were like, oh, wow, they're kind of on it.
These seem like good memos.
All right.
Next month, we're doing a podcast with you and Elon together.
Maybe we should save that for the next lecture.
Welcome to San Francisco, Sarah.
Great place to close.
Thank you for tuning in.
I'll see you on the next one.
My guest today is Stephen Kotkin, who is a senior fellow at the Hoover Institution and the author of two-thirds of his planned three-volume Stalin biography.
The first one is Stalin: Paradoxes of Power; the second one, Stalin: Waiting for Hitler.
Thank you for coming on my podcast.
Let's begin with the Tsarist regime.
So first question, how repressive was the Tsarist regime actually?
Because presumably the motivation behind the revolution is to get rid of this autocracy.
But you just have these examples: Lenin's brother tries to kill the Tsar, and Lenin himself is writing these long manifestos about taking down capitalism and overthrowing the government.
And he and people like Stalin are just in exile in Siberia, living off government money, robbing banks, small shenanigans.
Honestly, it sounds more forgiving than many countries today.
So how bad was it really?
I thought this was one of the most interesting takes in your first volume, that modernization is not this inevitable process, but is instigated by this ruthless geopolitical competition.
Do you think that that still applies in today's world?
Because, yes, there are pockets of conflict in the Middle East or in Ukraine, which would motivate the key powers there to want to have modern militaries and modern technologies.
But through most of the world, the odds that if France falls behind technologically, if their AI is worse, that Germany is going to take over is just sort of unthinkable.
So this dynamic where in order to ward off colonization or other great powers, you need to stay at the cutting edge of technology and also have the up-to-date political processes.
Is that still a drive which moves countries forward?
I was thinking about whether one of the key lessons from your volumes is that you should be tripping over yourself to embrace the lesser of two evils, and whether that applies through all the examples you give.
And this is maybe a general question about how much can you actually learn from history?
Because for every seeming lesson, there's an equal and opposite lesson that you can also learn.
So during the czarist regime, in retrospect, we can say that the liberals and the constitutionalists should have cooperated with Stolypin or Witte.
And even though it was an autocratic regime, they were actually doing these real reforms and there was growth and they should have continued that process.
Or when the government falls in February 1917, the provisional government faction should have united to oppose the Bolsheviks.
But then there's all these other examples.
In Germany, the conservative Weimar establishment allies with Hitler in order to fend off what they think is the greater evil, which is the communists.
And given the events up to that point, it's a reasonable concern to have, given what the Bolsheviks have done in Russia.
So where should we end up on this?
Should you embrace the lesser of two evils whenever you get the chance or no?
It worked in Taiwan and South Korea.
There was an era of industrialization under, not a dictator, but an authoritarian government, and then they were able to transition to rule of law democracy.
But there's a paradox here where if you institute this sort of revolution or changing of the guard during the mass age, then you're going to get this sort of leftist revolution, which is very antithetical to future prosperity and rule of law.
But on the other hand, if you don't have a changing of the regime,
You will fail to be able to... So it's been pointed out that Chiang Kai-shek, when he did have control over China, should have done some amount of land reform.
And you talk about how Stolypin, after 1905, attempted to put in these sorts of agricultural reforms, but their success was mixed because the existing aristocracy obviously didn't favor them.
So there is this paradox of if you don't change the regime, the existing stakeholders will not want the kinds of reforms which would make it possible to have a lower class that's bought into the system.
Yeah, what's the answer to that?
Because you say that in 1917, a leftist revolution of some kind was inevitable, but that it didn't have to be the October Bolshevik Revolution.
So why was leftism inevitable in Russia at that point?
This answers one of the other questions I had for you, which is why did we see these communist revolutions in peasant countries, which is the opposite of Marxist prediction that you would first need capitalism and industrialization before you would see the turn toward socialism?
And I guess the answer is that the private property, which is engendered by capitalism and industrialization,
actually helps the peasants more or helps them somewhat and buys them into the system.
But this raises another question, which is: if it's the case that all of this unrest is caused by the mistreatment of peasants, then in China and in Russia you have mistreatment of them to an extent unimaginable after collectivization in 1928, where literally 100 million peasants are enslaved.
And of course, there's some lack of cooperation with the regime.
They kill half the livestock and so forth.
But it doesn't break the regime, even though it's way more repressive and destructive than anything the czar did.
So if the peasants are the backbone of the regime's stability, why doesn't collectivization in China and Russia break the regime?
Even there, though, like Lenin, Trotsky, and Stalin, they're sent into exile.
But they're not only living off government money, but while in exile, they're writing for Pravda.
They're writing, you know, here's my manifesto on the fall of capitalism.
So even the intellectuals are not really repressed.
Can I ask about that?
How do we explain this surplus of sadism during this period in Russia, where the 25,000ers whom Stalin recruits go out to the countryside and steal from basically starving people?
And they can visibly see, I'm sure, that they're stealing from a family that's going to starve without this grain.
You have tens of thousands, maybe hundreds of thousands of interrogators and torturers in this gulag system.
They must know it's a cynical thing where they're making them confess to a thing that they haven't done and they're employing torture to do it.
It wasn't just Stalin doing all these heinous things.
There were hundreds of thousands, maybe millions of people, including if you include informants, probably millions of people who are implicated in this whole ghoulish regime.
So is this just a latent thing that is true in any society and Stalin was able to exploit it or was some circumstance, uh,
I get if you're a middle bureaucrat in the Communist Party, sure.
Do you think that explains the motivation of an interrogator in a gulag?
They're like, oh, this is part of the end goal of communism.
But in many of these cases, they know that it's they're the ones orchestrating the sort of show trial, the cynical game where they know that they just picked up a random person in the dead of night.
There's ideology, but there's a very specific thing to these Marxist regimes where people
They might believe in class conflict and you need this revolution and so forth.
But there's also this sense of you cannot contradict the party.
You cannot contradict the vanguard.
So even in 1924, when Trotsky is getting condemned by the party or whatever that was, and he gets up to give a speech to the party plenum, right?
And he says, look, for all of my thoughts, let it be no mistake that the party is always right and party discipline is always important.
There's not only the sense where I think Mamdani would say, oh, I want these specific policies implemented, but the sense that
Also, loyalty to the party and eventually to Stalin, even when it seems to contradict my understanding of socialism, is absolutely paramount.
And one way to explain that is that they were just genuinely afraid of Stalin and they thought this was antithetical to their understanding of communism.
Or another is that part of the ideology is this theocratic understanding of the party's always right, even if it seems like a single individual is manipulating it to their ends.
All the anti-Marxism I totally agree with, but I still think from a purely just like analyzing the system, there's many ideologies.
And, you know, you have this line that you often have that, look, you can't explain Stalin by saying that he was beaten by as a kid or he's a Georgian or whatever, because many other people are Georgians or beaten as a kid.
And
They turn out not this way.
There's many different kinds of ideologies.
For sure.
And very few of them end up as amenable to dictatorship as Marxism.
And you also, another thing that's really confusing here is that all of these old Bolsheviks who abet the system and whom Stalin eventually purges, whatever you might say about them, they're not weak men, right?
They were willing to face down the Tsar.
They were able to organize the revolution against the Tsar.
And they're willing to live in exile, to potentially get shot by the Okhr... I guess not shot by the Okhrana, but you know, whatever.
They're willing to go through hardships for their beliefs.
So you might think, well, okay, they might just go along with Stalin's doings because this serves what they think is the end goal of communism.
But we know that after Stalin died, Khrushchev, who was one of the key people in the regime, gives a secret speech where he says that, no, Stalin was going against- he was destroying the building of socialism and the building of Marxism-Leninism.
So people did believe that Stalin is actually going against this end goal that they have.
At least Khrushchev believed that.
And there's also, in many cases, they themselves are being implicated and they know they're innocent.
And in many of these cases, there's this period in between when they're a dead man walking, because Stalin has started putting feelers out that this person is a Trotskyite or something.
But they're still in their positions of power.
They're still the editor of Pravda or in charge of the military or something.
And it's mysterious why these people who – they're not cowards.
They were able to organize a revolution against the Tsar – are not using this period of, you know, a chicken with its head cut off in order to organize some sort of defense of themselves.
Maybe the next time they have the party plenum, instead of just confessing or giving the obligatory self-criticism where you're castigating yourself,
You just say, no, I think Stalin's leading the revolution wrong.
I'm going to die either way, but I might as well say this.
The same thing happens in China.
Liu Shaoqi, when he's a dead man walking, the head of state under Mao during the Cultural Revolution, he doesn't use that opportunity to go up to the... There are very few, but there are some people like that.
But what about the cases where they're implicating themselves?
They know they're not an enemy, right?
What fraction of confessions by high-level party members do you think were not coerced out of a sense of fear of their own lives or their family's lives?
Well, I guess they knew they were going to be executed.
So it would have to be for their family's life or to avoid torture versus in order – it was a very sort of Ozymandian, like, I will sacrifice myself for the – We're in the level of psychology here, DK.
Yeah.
That is to say- We will have real slavery.
Lighthouse is the fastest immigration solution for people building technology.
When I worked with them to help me secure a visa for one of my employees, the difference was clear throughout the whole process, starting even with the initial intake form.
While most immigration firms require you to schedule a meeting and pay hundreds of dollars just to start discussing your visa options,
Lighthouse only needs a LinkedIn profile or a resume to evaluate your eligibility for a ton of different employment visas like the EB-1A or the O-1A.
You take 60 seconds to fill out some basic information and they'll put together a document for you for free, which is fully comprehensive.
They'll list out all the potential paths available to you and how strong of a candidate you are for each one.
Also, if you decide to move forward, Lighthouse can take things all the way to the finish line, building out your story, preparing your application for you, and moving things forward faster than you would have thought possible.
If you've ever even considered working in the United States, I really encourage you to reach out.
You're probably more eligible for a visa than you realize.
Go to lighthousehq.com slash get started to find out.
All right, back to Stephen.
I guess in China's case, they actually did reform the system and didn't just, they didn't just discredit the Cultural Revolution.
They said, no, much of the planning and state-owned enterprises was a mistaken idea.
But I do have a different question.
But if that's the reason why they weren't able to do planning, shouldn't Stalin's purges and then World War II have also had the same effect on the Soviet Union?
I mean, I agree with the mechanism by which the growth happened, but I don't think it's a case that—
It was their inability to have true Marxist communism, which led to liberalization.
I mean, if you look at the...
the creation of these special economic zones, the imperative at a national level that you must have growth.
And then Deng's southern tour.
And so Jiang Zemin, he tries after Tiananmen to clamp down on these protests, clamp down on opening up.
And Deng says, no, we must open up.
If you don't, we'll remove you.
All of that is a sort of positive, maybe positive is the wrong word, but... Policy-driven.
Yeah, it's...
It's a special effort you have to make towards economic liberalization.
It didn't just happen by default.
People really had to push for it because the alternative story, which seems you're saying, is no, it's just that they physically could not enforce communism anymore.
But then the creation of that zone had to be a proactive action.
You could only have three employees at a private corporation.
But that pushback must come from, is coming from within, this is your thesis in Uncivil Society, right?
Yes.
That it is coming from within the system.
Because they could have, I mean, in 1976, they're like North Korea, literally.
And North Korea still exists, right?
There's no reason.
I mean, I agree with their general point that how any nation gets wealthy is not by the government, but because of the thrift and entrepreneurialism and hard work of individuals.
But that's also true in Western capitalist countries.
In those countries, we also have a lot of stupid policies.
As we sit here and speak.
When we say America is a capitalist country, what we say is like,
the government or all the bureaucrats, they'll try to put in all these regulations.
And it's only grudgingly that they will accede to – you know, we could point to a bunch of stupid policies in America where, like, they try to outlaw the potatoes and the onions, but they could only outlaw the potatoes, equivalent things.
So any capitalist society, quote unquote,
is just a case where the government had to accede some amount of control.
And we give credit to the countries in the West for saying, like, at least the government wasn't maximally stupid.
I guess my point is that
In any country in the world today where there's a lot of poverty, the reason the poverty exists is also because of policy.
And the extent the poverty has been removed, it is because some combination of human capital and policy got less stupid.
So if we're going to complain about a country like – there's many poor countries in the world, like Bangladesh being poor.
The country which just does it less – maybe we're going in circles here.
A different question I want to ask is –
Yeah.
Suppose Stalin had lost the succession battle in 1924 and somebody else is in power, but he's still on the Central Committee or the Politburo.
And it's 1930.
And suppose the other person is also in this way ruthless and is one by one getting rid of every single person in the inner circle.
What would a Stalin type figure have done if he found himself on the periphery of somebody else's regime?
And just to put a final point on my question, I mean, not just in the sense of whether collectivization would have happened, but more in the sense of
How would he personally have avoided the fate of Bukharin and Kamenev and Zinoviev in terms of potentially I'm going to get purged someday.
I don't want to be the toady to somebody else.
How would he personally have navigated the sort of power struggle of being what Zinoviev was to Stalin or Bukharin was to Stalin?
Yeah, I don't mean like whether they would have done collectivization.
I mean like how would he personally, because he wants to be in power.
I guess my question is slightly different, which is that even if such a person did not exist,
And suppose, like, Stalin already exists, he did all this stuff, and it's 1934, and, you know, it seems like Stalin's starting to go a little great-terror-y soon.
And another copy of Stalin is in the Politburo, and just out of a sense of self-preservation, they're like...
In a couple of years, I don't want to be writing my own confession and ending up in the gulag.
Is Stalin being the sort of power player that he was and knowing how to align factions against each other to his own advantage in the very end?
If somebody like him was in the Politburo, what would they have done?
Or were they already there and there was nothing they could do by this point?
But sorry, why doesn't this prevent... The Tsar is like... People are trying to kill the Tsar constantly.
They're killing Russian ministers in the Tsarist regime.
But never against Stalin.
Why is that?
But there's a bunch of revolutionaries who try to kill the Tsar and sometimes succeed.
And they're just like random people.
They're not like people in the regime.
They're just random people.
Yes.
Why doesn't the Kulak, one of the hundred million enslaved people... The Tsar has less security than Stalin does.
But didn't you say in the book that in 1928 he had like one bodyguard when he would go to his dacha?
Yeah.
Yeah.
So you have, you've written other books about the collapse of the Soviet Union.
And there's this last-ditch effort in the Eastern Bloc, where, with falling productivity, they borrow more money and invest more into finding this last-ditch technological miracle that can cure all their problems.
How similar is that, in your opinion, to what's happening in China?
Because the dissimilarity is that while Eastern Europe was struggling to export and they had a trade deficit, China, many people argue, is exporting too much.
Do you see any similarity between where Eastern Europe was in 1989 versus where China is today?
Or is it you're not as concerned about China right now?
But do you need it?
Like Stalin didn't have strong growth in the 20s and 30s.
And it seems like you just double down on repression.
Like if you double down on the NKVD.
Yeah.
Like the Tsar actually had, you know, 2% growth up to 1917.
They're dead.
They're in the cemetery.
That's the point, right?
I guess we'd like to think that that's the main thing that matters.
But historically, it just seems like when authoritarians crack down really hard, it kind of just works.
All right, great note to close on.
Thank you so much for coming on the podcast.
It was a real pleasure to talk to you.
Yes.
The long run or the long drawdown is why I want you on the podcast, right?
A lot of these issues are complicated.
So I appreciate you doing it.
Thank you for tuning in.
I'll see you on the next one.
Okay, this is a narration of a blog post I wrote on June 3rd, 2025, titled, Why I Don't Think AGI Is Right Around the Corner.
Quote, Things take longer to happen than you think they will, and then they happen faster than you thought they could.
Rüdiger Dornbusch. I've had a lot of discussions on my podcast where we haggle out our timelines to AGI.
Some guests think it's 20 years away, others two years.
Here's where my thoughts lie as of June 2025.
Continual Learning
Sometimes people say that even if all AI progress totally stopped, the systems of today would still be far more economically transformative than the internet.
I disagree.
I think that the LLMs of today are magical, but the reason that the Fortune 500 aren't using them to transform their workflows isn't because the management is too stodgy.
Rather, I think it's genuinely hard to get normal human-like labor out of LLMs.
And this has to do with some fundamental capabilities that these models lack.
I like to think that I'm AI-forward here at the Dwarkesh Podcast, and I've probably spent on the order of 100 hours trying to build these little LLM tools for my post-production setup.
The experience of trying to get these LLMs to be useful has extended my timelines.
I'll try to get them to rewrite auto-generated transcripts for readability the way a human would, or I'll get them to identify clips from the transcript to tweet out.
Sometimes I'll get them to co-write an essay with me, passage by passage.
Now, these are simple, self-contained, short-horizon, language-in, language-out tasks.
The kinds of assignments that should be dead center in the LLM's repertoire.
And these models are 5 out of 10 at these tasks.
Don't get me wrong, that is impressive.
But the fundamental problem is that LLMs don't get better over time the way a human would.
This lack of continual learning is a huge, huge problem.
The LLM baseline at many tasks might be higher than the average human's, but there's no way to give a model high-level feedback.
You're stuck with the abilities you get out of the box.
You can keep messing around with the system prompt, but in practice, this just does not produce anywhere close to the kind of learning and improvement that human employees actually experience on the job.
The reason that humans are so valuable and useful is not mainly their raw intelligence.
It's their ability to build up context, interrogate their own failures, and pick up small improvements and efficiencies as they practice a task.
How do you teach a kid to play a saxophone?
Well, you have her try to blow into one and listen to how it sounds and then adjust.
Now, imagine if teaching saxophone worked this way instead.
A student takes one attempt, and the moment they make a mistake, you send them away and you write detailed instructions about what went wrong.
Now the next student reads your notes and tries to play Charlie Parker cold.
When they fail, you refine your instructions for the next student.
This just wouldn't work.
No matter how well honed your prompt is, no kid is just going to learn how to play saxophone from reading your instructions.
But this is the only modality that we as users have to teach LLMs anything.
Yes, there's RL fine-tuning, but it's just not a deliberate adaptive process the way human learning is.
My editors have gotten extremely good, and they wouldn't have gotten that way if we had to build bespoke RL environments for every different subtask involved in their work.
They've just noticed a lot of small things themselves and thought hard about what resonates with the audience, what kind of content excites me, and how they can improve their day-to-day workflows.
Now, it's possible to imagine some ways in which a smarter model could build a dedicated RL loop for itself, which just feels super organic from the outside.
I give some high-level feedback, and the model comes up with a bunch of verifiable practice problems to RL on, maybe even a whole environment in which to rehearse the skills it thinks it's lacking.
But this just sounds really hard, and I don't know how well these techniques will generalize to different kinds of tasks and feedback.
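To make this concrete, here's a minimal sketch of what such a self-generated practice loop might look like. Everything here is an assumption for illustration: call_model is a placeholder for whatever LLM API you'd actually wire in, and nothing about this reflects how any lab actually implements the idea.

```python
# Hypothetical sketch of a model building its own practice loop from feedback.
# `call_model` is a stub standing in for a real LLM API call.

def call_model(prompt: str) -> str:
    """Placeholder for a real model call."""
    raise NotImplementedError("Wire this up to an actual LLM API.")

def self_practice(feedback: str, n_problems: int = 20) -> list[dict]:
    # 1. Turn high-level user feedback into practice tasks with objective checks.
    problems_text = call_model(
        f"Given this feedback on my work: {feedback!r}\n"
        f"Write {n_problems} practice tasks, one per line, each with a pass/fail check."
    )
    problems = [p for p in problems_text.splitlines() if p.strip()]

    # 2. Attempt each task, then grade the attempt against its own check.
    results = []
    for problem in problems:
        attempt = call_model(f"Solve this task: {problem}")
        verdict = call_model(
            f"Task: {problem}\nAttempt: {attempt}\n"
            "Does the attempt pass the task's check? Answer yes or no."
        )
        results.append({
            "problem": problem,
            "attempt": attempt,
            "passed": verdict.strip().lower().startswith("yes"),
        })

    # 3. In a real system, the failed attempts would become RL training signal;
    #    here we just return them.
    return [r for r in results if not r["passed"]]
```

All of the hard parts the post worries about are hidden inside those prompts: whether the generated tasks are actually verifiable, and whether practicing on them transfers back to the real job.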
Eventually, the models will be able to learn on the job in the subtle organic way that humans can.
However, it's just hard for me to see how that could happen within the next few years, given that there's no obvious way to slot in online continuous learning into the kinds of models these LLMs are.
Now, LLMs actually do get kind of smart in the middle of a session.
For example, sometimes I'll co-write an essay with an LLM.
I'll give it an outline, and I'll ask it to draft an essay passage by passage.
All its suggestions up to paragraph four will be bad.
And so I'll just rewrite the whole paragraph from scratch and tell it, hey, your shit sucked.
This is what I wrote instead.
And at that point, it can actually start giving good suggestions for the next paragraph.
But this whole subtle understanding of my preferences and style is lost by the end of the session.
Maybe the easy solution to this looks like a long rolling context window, like Claude Code has, which compacts the session memory into a summary every 30 minutes.
I just think that titrating all this rich, tacit experience into a text summary will be brittle in domains outside of software engineering, which is very text-based.
Again, think about the example of trying to teach somebody how to play the saxophone using a long text summary of your learnings.
Even Claude Code will often reverse a hard-earned optimization that we engineered together before I hit slash compact, because the explanation for why it was made didn't make it into the summary.
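For what it's worth, the compaction mechanism itself is simple. This is a made-up sketch of the general shape of a rolling-summary step, not Claude Code's actual implementation; the names and thresholds are invented.

```python
# Rough sketch of rolling-context compaction: when the transcript gets too long,
# older messages get replaced by a lossy text summary. Illustrative only.

def count_tokens(text: str) -> int:
    # Crude stand-in for a real tokenizer.
    return len(text.split())

def maybe_compact(messages: list[str], summarize, max_tokens: int = 8000) -> list[str]:
    """Keep the most recent messages verbatim; squash everything older into a summary."""
    total = sum(count_tokens(m) for m in messages)
    if total <= max_tokens:
        return messages

    old, recent = messages[:-10], messages[-10:]
    summary = summarize("\n".join(old))  # anything not captured here is gone for good
    return ["[Summary of earlier session]\n" + summary] + recent
```

The brittleness lives in that summarize call: whatever rationale doesn't make it into the summary, like the reason behind a hard-earned optimization, simply stops existing for the rest of the session.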
This is why I disagree with something that Sholto and Trenton said on my podcast.
And this quote is from Trenton.
If AI progress totally stalls today, I think less than 25% of white-collar employment goes away.
Sure, many tasks will get automated.
Claude 4 Opus can technically rewrite auto-generated transcripts for me.
But since it's not possible for me to have it improve over time and learn my preferences, I still hire a human for this.
Even if we get more data, without progress in continual learning, I think that we will be in a substantially similar position with all of white-collar work.
Yes, technically, AIs might be able to perform a lot of subtasks somewhat satisfactorily, but their inability to build up context will make it impossible to have them operate as actual employees at your firm.
Now, while this makes me bearish on transformative AI in the next few years, it makes me especially bullish on AI over the next few decades.
When we do solve continuous learning, we'll see a huge discontinuity in the value of these models.
Even if there isn't a software-only singularity with models rapidly building smarter and smarter successor systems, we might still see something that looks like a broadly deployed intelligence explosion.
AIs will be getting broadly deployed through the economy, doing different jobs and learning while doing them in the way that humans can.
But unlike humans, these models can amalgamate their learnings across all their copies.
So one AI is basically learning how to do every single job in the world.
An AI that is capable of online learning might functionally become a superintelligence quite rapidly without any further algorithmic progress.
However, I'm not expecting to see some OpenAI livestream where they announce that continual learning has totally been solved.
Because labs are incentivized to release any innovations quickly, we'll see a somewhat broken early version of continual learning or test time training, whatever you want to call it, before we see something which truly learns like a human.
I expect to get lots of heads up before we see this big bottleneck totally solved.
Computer use.
When I interviewed Anthropic researchers Sholto Douglas and Trenton Bricken on my podcast, they said that they expect reliable computer use agents by the end of next year.
Now, we already have computer use agents right now, but they're pretty bad.
They're imagining something quite different.
Their forecast is that by the end of next year, you should be able to tell an AI, go do my taxes.
It goes through your email, Amazon orders, and Slack messages.
And it emails back and forth to everybody you need invoices from.
It compiles all your receipts.
It decides which things are business expenses, asks for your approval on the edge cases, and then submits Form 1040 to the IRS.
I'm skeptical.
I'm not an AI researcher, so far be it from me to contradict them on the technical details, but from what little I do know, here are three reasons I'd bet against this capability being unlocked within the next year.
One, as horizon lengths increase, rollouts have to become longer.
The AI needs to do two hours worth of agentic computer use tasks before we even see if it did it right.
Not to mention that computer use requires processing images and videos, which is already more compute intensive, even if you don't factor in the longer rollouts.
This seems like it should slow down progress.
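A toy calculation shows why the longer rollouts bite. The episode lengths below are invented purely to illustrate the argument, not measurements of any real system.

```python
# Completed rollouts per worker per day, if each episode must finish before it
# can be scored. Numbers are illustrative only.
seconds_per_day = 24 * 3600

episodes = {
    "short text task": 30,          # e.g. a small reasoning or coding problem
    "computer-use task": 2 * 3600,  # the "two hours of agentic work" case
}

for name, seconds in episodes.items():
    print(f"{name}: {seconds_per_day // seconds} serial rollouts per worker per day")
# short text task: 2880 rollouts; computer-use task: 12 rollouts
```

Roughly two orders of magnitude fewer completed episodes per worker means roughly two orders of magnitude less end-to-end feedback, before you even account for the extra cost of processing the screenshots and video.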
Two, we don't have a large pre-training corpus of multimodal computer use data.
I like this quote from Mechanize's post on automating software engineering.
Quote, for the past decade of scaling, we've been spoiled by the enormous amount of internet data that was freely available to us.
This was enough to crack natural language processing, but not for getting models to become reliable, competent agents.
Imagine trying to train GPT-4 on all the text data available in 1980.
The data would be nowhere near enough, even if you had the necessary compute."
Again, I'm not at the lab, so maybe text-only training already gives you a great prior on how different UIs work and what the relationships are between different components.
Maybe RL fine-tuning is so sample efficient that you don't need that much data.
But I haven't seen any public evidence which makes me think that these models have suddenly become less data hungry, especially in domains where they're substantially less practiced.
Alternatively, maybe these models are such good front-end coders that they can generate millions of toy UIs for themselves to practice on.
For my reaction to this, see the bullet point below.
Three, even algorithmic innovations, which seem quite simple in retrospect, seem to have taken a long time to iron out.
The RL procedure, which DeepSeek explained in their R1 paper, seems simple at a high level.
And yet it took two years from the launch of GPT-4 to the launch of O1.
Now, of course, I know that it's hilariously arrogant to say that R1 or O1 were easy.
I'm sure a ton of engineering, debugging, and pruning of alternative ideas was required to arrive at the solution.
But that's precisely my point.
Seeing how long it took to implement the idea – hey, let's train our model to solve verifiable math and coding problems – makes me think that we're underestimating the difficulty of solving a much gnarlier problem of computer use, where you're operating on a totally different modality with much less data.
Reasoning.
Okay, enough cold water.
I'm not going to be like one of these spoiled children on Hacker News who could be handed a golden egg-laying goose and still spend all their time complaining about how loud its quacks are.
Have you read the reasoning traces of O3 or Gemini 2.5?
It's actually reasoning.
It's breaking down the problem.
It's thinking about what the user wants.
It's reacting to its own internal monologue and correcting itself when it notices that it's pursuing an unproductive direction.
How are we just like, oh yeah, of course machines are going to go think a bunch, come up with a bunch of ideas and come back with a smart answer.
That's what machines do.
Part of the reason some people are too pessimistic is that they haven't played around with the smartest models operating in the domains that they're most competent in.
Giving Claude Code a vague spec and then sitting around for 10 minutes until it zero-shots a working application is a wild experience.
How did it do that?
You could talk about circuits and training distributions and RL and whatever, but the most proximal, concise, and accurate explanation is simply that it's powered by baby artificial intelligence.
At this point, part of you has to be thinking, it's actually working.
We're making machines that are intelligent.
Okay, so what are my predictions?
My probability distribution is super wide.
And I want to emphasize that I do believe in probability distributions, which means that work to prepare for a misaligned 2028 ASI still makes a ton of sense.
I think that's a totally plausible outcome.
But here are the timelines at which I'd make a 50-50 bet.
An AI that can do taxes end-to-end for my small business as well as a competent general manager could in a week, including chasing down all the receipts on different websites and finding all the missing pieces and emailing back and forth with anyone we need to hassle for invoices, filling out the form and sending it to the IRS.
2028.
I think we're in the GPT-2 era for computer use, but we have no pre-training corpus, and the models are optimizing for a much sparser reward over a much longer time horizon using action primitives that they're unfamiliar with.
That being said, the base model is decently smart and might have a good prior over computer use tasks.
Plus, there's a lot more compute and AI researchers in the world, so it might even out.
Preparing taxes for a small business feels like it is for computer use what GPT-4 was for language.
And it took four years to get from GPT-2 to GPT-4.
Just to clarify, I'm not saying that we won't have really cool computer use demos in 2026 and 2027.
GPT-3 was super cool, but not that practically useful.
I'm saying that these models won't be capable of end-to-end handling a week-long and quite involved project, which involves computer use.
Okay, and the other prediction is this.
An AI that learns on the job as easily, organically, seamlessly, and quickly as a human for any white-collar work.
For example, if I hire an AI video editor, after six months, it has as much actionable, deep understanding of my preferences, our channel, and what works for the audience as a human would.
This, I would say, 2032.
Now, while I don't see an obvious way to slot in continuous online learning into current models, seven years is a really long time.
GPT-1 had just come out this time seven years ago.
It doesn't seem implausible to me that over the next seven years, we'll find some way for these models to learn on the job.
Okay, at this point, you might be reacting, look, you made this huge fuss about how continual learning is such a big handicap.
But then your timeline is that we're seven years away from what at a minimum is a broadly deployed intelligence explosion.
And yeah, you're right.
I'm forecasting a pretty wild world within a relatively short amount of time.
AGI timelines are very log-normal.
It's either this decade or bust.
Not really bust, more like lower marginal probability per year, but that's less catchy.
AI progress over the last decade has been driven by scaling training compute for frontier systems by over 4x a year.
This cannot continue beyond this decade, whether you look at chips, power, even the raw fraction of GDP that's used on training.
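To put rough numbers on that compounding, taking the roughly-4x-a-year figure at face value (the exact multiplier and horizon are just illustrative):

```python
# How 4x-per-year growth in training compute compounds. Illustrative only.
growth_per_year = 4
for years in (1, 3, 5, 10):
    print(f"{years:>2} yr: {growth_per_year ** years:,}x more compute")
# 1 yr: 4x, 3 yr: 64x, 5 yr: 1,024x, 10 yr: 1,048,576x
```

Sustaining the trend through the 2030s would mean roughly another million-fold scale-up on top of today's frontier runs, which is exactly what the chip, power, and GDP-share constraints rule out.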
After 2030, AI progress has to mostly come from algorithmic progress.
But even there, the low-hanging fruits will be plucked, at least under the deep learning paradigm.
So the yearly probability of AGI craters after 2030.
This means that if we end up on the longer side of my 50-50 bets, we might well be looking at a relatively normal world up to the 2030s or even the 2040s.
But in all the other worlds, even if we stay sober about the current limitations of AI, we have to expect some truly crazy outcomes.
Many of you might not be aware, but I also have a blog and I wanted to bring content from there to all of you who are mainly podcast subscribers.
If you want to read future blog posts, you should sign up for my newsletter at dwarkesh.com.
Otherwise, thanks for tuning in and I'll see you on the next episode.
Today, I have the pleasure of interviewing George Church.
I don't know how to introduce you.
Honestly, and this is not even an exaggeration, it would be easier to list out the major breakthroughs in biology over the last few decades that you haven't been involved in, from the Human Genome Project to CRISPR, age reversal to de-extinction.
So you weren't exactly an easy prep.
Sorry.
Okay, so let's start here.
By what year would it be the case that if you make it to that year,
Technology in bio will keep progressing to such an extent that your lifespan will increase by a year every year or more.
Right.
Given the number of things you would have to solve to give us a lifespan of humpback whales.
Ship of Theseus kind of thing in the brain?
Is there an existing gene delivery mechanism which could deliver gene therapy to every single cell in the body?
You're one of the co-founders of Colossal, which recently announced that they de-extincted a dire wolf, and now you're working on the woolly mammoth.
Yeah.
Do you really think we're going to bring back a woolly mammoth?
Because the difference between an elephant and a woolly mammoth might be like a million base pairs.
How do you think about the kind of thing we're actually bringing back?
Does this teach us something interesting about phenotypes, which you think are downstream from many genes, are in fact modifiable by very few changes?
Basically, could we do this to other species or to other things you might care about, like intelligence, where you might think like, oh, there must be thousands of genes that are relevant, but there's like 20 edits you need to make really to be in a totally different ballgame.
What implications does this have for gene therapy in general?
What is preventing us from finding the latent knob for every single phenotype we might care about in terms of helping with disabilities or enhancement?
Is it the case that for any phenotype we care about, there will be one thing that is like HGH for height?
And how do you find it?
But just getting everybody to the healthy level, like how many...
How much gene therapy would that take?
It sounds like it wouldn't take that much if you think that there are these couple of knobs which control very high-level functions.
Do you find them through GWAS, genome-wide association studies?
Is it through simulations of these?
Most antifraud solutions focus on detecting and blocking bots.
And that's fine if your product is just meant to be used by humans.
But what if you actually want your product to be used by AI agents?
How do you distinguish between automated traffic that you want to allow and automated traffic that you need to block?
WorkOS Radar is built for this exact problem.
Radar is a powerful, humans-only filter that major companies like Cursor rely on to protect their product from bots.
But it can also handle more complex and granular tasks, like distinguishing desired versus malicious agents, rather than just blocking all automated traffic.
Even if you aren't building for agents yet, Radar helps you future-proof your product roadmap.
If you start using Radar for traditional fraud protection, when you do ship your first feature intended for agents, you can just update Radar's behavior with no engineering required.
Learn more about AI-native fraud prevention at workos.com slash radar.
All right, back to George.
Can I ask you some questions about biodefense?
Yeah.
Because some of the stuff you guys work on, or quite responsibly choose not to work on, can keep one up at night.
Mirror life.
Yes.
Given the fact that it's physically possible,
Why doesn't it just happen at some point?
Like some days it'll get cheap enough or some people care about it enough that somebody just does it.
What's the equilibrium here?
And what does that look like in terms of not just mirror life, but synthetic biology in general?
You know, maybe we're at an elevated period of the ratio of offense to defense, but how do we get to an end state where even if there's lots of
people running around with bad motivations, that somehow there's defenses built up that we would still survive, that we're robust against that kind of thing?
Or is such an equilibrium possible, or will offense always be privileged in this game?
And if we could really solve a lot of that stuff, we could reduce the probability that one person could- This is making me pessimistic, because you're basically saying we got to solve all of society's problems before we don't have to worry about synthetic biology.
Yeah.
Which I'm like, I'm not that optimistic about, like, we'll solve some of them.
You had an interesting scheme for remapping the codons in a genome so that it's impervious to naturally evolved viruses.
Is there a way in which this scheme would also work against synthetically manufactured viruses?
Which would limit the transmissibility?
Biology seems very dual use, right?
So the mere fact that you, like literally you, are making sequencing cheaper will just have this dual use effect in a way that's not necessarily true for nuclear weapons.
Right.
Yeah.
And we want that, right?
We want biotechnology.
I guess I am curious if there is some long-run vision where...
To give another example, in cybersecurity, as time has gone on, I think our systems are more secure today than they were in the past because we found vulnerabilities and we've come up with new encryption schemes and so forth.
Is there such a plausible vision in biology, or are we just stuck in a world where offense will be privileged and so we would just have to limit access to these tools and have better monitoring, but there's not a more robust solution?
I worked for five years with only one defector.
That's quite impressive.
Over the last couple of decades, we've had a million-fold decrease in the cost of sequencing DNA, a thousand-fold in synthesis.
We have gene editing tools like CRISPR, massively parallel experiments through multiplex techniques that have come about.
And of course, much of this work has been led by your lab.
Despite all of this, why is it the case that we don't have some huge industrial revolution, some huge burst of new drugs or some cures for Alzheimer's and cancer that have already come about?
When you look at other trends in other fields, right, like we have Moore's Law and here's my iPhone.
Why don't we have something like that in biology yet?
What exactly are we on the verge of?
What does 2040 look like?
How much more are we talking?
Are we going to have 10x the amount of drugs, 100x?
And what specifically is changing that's enabling this?
Is it just existing cost curves continuing or is it some new –
technique or tool that will come about?
So then what's the reason that over the last many decades, and we do have not atomic, but close to atomic level manufacturing with semiconductors.
40 nanometers.
Right.
It's quite small.
It's a thousand times bigger than biology, linearly.
But the progress we have made hasn't been related to biology so far.
It seems like we've made Moore's law happen before.
People in the 90s were saying, ultimately, we'll have these bio machines that are doing the computing.
But it seems like we've just been using conventional manufacturing processes.
What exactly is it that changes that allows us to use bio to make these things?
Publicly available data is running out, so major AI labs partner with Scale to push the boundaries of what's possible.
Through Scale's data foundry, major labs get access to high-quality data to fuel post-training, including advanced reasoning capabilities.
Scale's research team, SEAL, is creating the foundations for integrating advanced AI into society through practical AI safety frameworks and public leaderboards around safety and alignment.
Their latest leaderboards include Humanity's Last Exam, Enigma Eval, Multi-Challenge, and Vista, which test a range of capabilities from expert-level reasoning to multimodal puzzle solving to performance on multi-turn conversations.
Scale also just released Scale Evaluation, which helps diagnose model limitations.
Leading frontier model developers rely on Scale Evaluation to improve the reasoning capabilities of their best models.
If you're an AI researcher or engineer and you want to learn more about how Scale's data foundry and research lab can help you go beyond the current frontier of capabilities, go to scale.com slash dwarkesh.
Okay, so speaking of protein design...
Another thing you could have thought in the 90s is, I mean, people were writing about nanotechnology, Eric Drexler and so forth.
And now we can go from a function that we want this tiny molecular machine to do back to the sequence that can give it that function.
Why isn't this resulting in some nanotech revolution?
Or will it eventually?
Why didn't AlphaFold cause that?
Do you have a prediction by when we'll see this material science revolution?
What is basically standing between?
Because we've got AlphaFold right now, right?
So what is the thing that we need?
Do we need more data?
If AlphaFold predicting the structure doesn't tell you whether the thing will actually function, then what is needed before I can say I want a nanomachine that does X thing or I want a material that does Y thing and I can just like get that.
Yeah.
It seems like
If I listen to these words, it seems like I should be expecting the world to physically look a lot different.
But then why are we only getting like a couple more drugs by 2040?
Are you more excited about AI, which thinks in protein space or a capsid space or like just...
It's like predicting some biological or DNA sequences.
Or are you more optimistic about just LLMs trained on language, which can write in English and tell you, here's the experiment you should run in English?
Which of those two approaches, or is there some combination that when you think about AI and bio is more promising?
Suppose we did build safe superintelligence.
How much would that speed up bioprogress?
There's a million George Churches in data centers just thinking all the time.
Is it a 10x speed up?
I mean, suppose you get them to care about it.
There's a million copies of you in a data center.
How much faster is bioprogress?
But they can't run experiments directly.
They're just in data centers.
They can just say stuff and think stuff.
But I'm curious what, you've still got to run the experiments, you still need these other things.
So does that bottleneck the impact of the millionth copy of you or do you still get some speed up?
How much faster can biology basically go if there are just more smart people thinking, which is a sort of proxy for what AI might do?
You worked on brain organoids and brain connectome and so forth.
That work, how has it shifted your view on fundamentally how complex intelligence is?
In the sense of, are you more bullish on AI because I realize that organoids are not that complicated, or it's like very little information is required to describe how to grow it, or are you like, no, this is actually much more gnarly than I realized?
Given how little I knew about biology, my prep for this episode basically looked like one minute of trying to read some paper and then chatting with an LLM like Gemini for 30 minutes afterwards, asking it to explain a concept to me using Socratic tutoring.
And the fact that this model has enough theory of mind
to understand what conceptual holes a student is likely to have and ask the exact right questions in the exact right order to clear up these misunderstandings has honestly been one of the most feel-the-AGI moments that I've ever experienced.
This is probably the single biggest change in my research process, honestly, since I started the podcast.
For this episode, I think I probably spent on the order of 70% of my prep time talking with LLMs rather than reading source material directly because it was just more useful to do it that way.
And given how much time I spend with Gemini in prep for these episodes, improvements in style and structure go a really long way towards making the experience more useful for me.
That's why I'm really excited about the newly updated Gemini 2.5 Pro, which you can access in AI Studio at ai.dev.
All right, back to George.
Going back to the engineering stuff,
Often people will argue that, look, you have this existence proof that E. coli can duplicate every 30 minutes.
Insects can duplicate really fast as well.
But then with our ability to manufacture stuff with human engineering, you know, we can do things that nothing in biology can do, like radio communication or fission power or jet engines, right?
So...
How plausible to you is the idea that we could have biobots, which can duplicate at the speed of insects and there could be trillions of them running around, but they also can have access to jet engines and radio communication and so forth.
Are those two things compatible?
What would it take to do
whole genome engineering to such a level that for even a phenotype which doesn't exist in the existing pool of human variation, you could manifest it because your understanding is so high.
For example, if I wanted wings,
Yeah, right.
Is the bottleneck our understanding?
Is the bottleneck our ability to make that many changes to my genome?
What discovery in biology, so not in astronomy or some other field, in biology, would make you convinced that life on Earth is the only life in the galaxy?
And conversely, what might convince you that no, it must have arisen independently thousands of times in this galaxy?
I'm curious, between intelligent life and some sort of primordial RNA thing,
What is the step at which, if there is any, where you say there's a less than 50% chance something like at this level exists elsewhere in the Milky Way?
If in a thousand years we're still using DNA and RNA and proteins for top-end manufacturing, the frontiers of engineering, how surprised would you be?
Would you think like, oh, that makes sense, evolution designed these systems for billions of years?
Or would you think like, oh, it's surprising that these ended up being the systems, whatever evolution found just happened to be the best way to manufacture or to store information or...
So it makes sense why evolution wouldn't have discovered radio technology, right?
But things like more than 20 amino acids or these different bases so that you can store more than two bits per base pair.
Or, for example, the codon remapping scheme, this redundancy, which it seems like, based on your work, you can...
there was this extra information you could have used for other things.
So is there some explanation for why four billion years of evolution didn't already give living organisms these capabilities?
So we've talked about many different technologies you worked on or are working on right now, from gene editing to de-extinction to age reversal.
What is an underhyped technology in your research portfolio, which you think more people should be talking about but gets glossed over?
David Reich was talking about how in India, especially because of the long-running history of caste and endogamous coupling, there have been these small founder populations that have high amounts of recessive diseases. And so, like, there it's an especially valuable intervention.
Do you think genetic counseling is a more important intervention, or even in the future will continue to have a bigger impact than gene therapy for these monogenic diseases?
All right, some final questions to close us off.
If, 20 years from now, there's some scenario in which we all look back and say, you know what, I think on net it was a good thing that the NSF and the NIH and all these budgets were blown up and got DOGE'd and so forth.
I'm not saying you think this is likely, but suppose there ends up being a positive story told in retrospect.
What might it be?
Would it maybe have to come up with a different funding structure?
Basically, like...
Yeah, what is the best case scenario if this post-war system of basic research is upended?
This is a positive story?
What is it about the nature of your work, maybe biology more generally, that makes it possible for one lab to be behind so many advancements?
I don't think there's an analogous thing in computer science, which is a field I'm more familiar with, where you could go to...
One lab and one academic lab.
Yeah, sorry, one academic lab, and then a hundred different companies have been formed out of it, including the ones that are most exciting and doing a bunch of groundbreaking work.
So is it something about the nature of your academic lab?
Is it something about the nature of biology research?
What explains this pattern?
So yesterday I had a dinner with a bunch of biotech founders, and I mentioned that I was going to interview you tomorrow.
And so somebody asked, wait, how many of the people here have worked in George's lab at some point or worked with him at some point?
And I think 70% or 80% of the people raised their hand.
And one of the people suggested, oh, you should ask him, how does he spot talent?
Because it is the case that many of the people who are building these leading companies or doing groundbreaking research have been recruited by you, have worked in your lab.
So how do you spot talent?
Final question.
Given the fast pace of AI progress, your point taken that we should be cautious of the technology, but by default, I expect it to go quite fast and there not being some sort of global moratorium on AI progress.
Given that's the case, what is the vision for... We're going to very plausibly have a world with genuine AGI within the next 20 years.
What is the vision for...
biology, given that fact?
Because if AI was 100 years away, we could say, well, we've got this research we're doing with the brain or with the gene therapies and so forth, which might help us cope or might help us stay on the same page.
Given how fast AI is happening, what is the vision for this bio-AI co-evolution or whatever it might look like?
Well, that's a good vision to end on.
Okay.
George, thank you so much for coming on.
Yeah, thank you.
Thank you for tuning in.
I'll see you on the next one.
Today, I'm interviewing Arthur Kroeber, who is the founder of Gavekal Dragonomics, which is a research consultancy focused on China, and the author of China's Economy: What Everyone Needs to Know.
A friend while I was in China recommended it to me, and it's been the most valuable and useful resource that you can get today on how China works.
So, Arthur, thanks for coming on the podcast and taking the time to chat with me.
It's great to be here.
Thanks.
First question.
What really is the problem if China becomes as wealthy or if its economy grows as big as America's or grows even bigger?
I know maybe it's not your perspective to be a China hawk, but I've never really understood why this is a problem in the first place.
I mean, the trade surplus point, to the extent that it is made possible by the government involvement in industry, which is actually not even clear to me that that's the case.
I mean, just like if you have high savings and not enough investment domestically, just like the accounting identities are just that you will have a trade surplus.
Right.
But suppose that's even the case.
On paper, it just seems like what is happening, the Chinese taxpayer, the Chinese saver is subsidizing foreign importers.
So on paper, it just seems like we're getting a good deal.
I'm sure some people are upset about this, specifically people who manufacture outside of China.
Yeah.
But it's certainly not like something obviously insidious.
Right.
And so it seems like if it wasn't for this, there'd be some other reason that, you know, China can't grow as wealthy as us.
And I am playing a little devil's advocate here, but I just like I don't really understand why this is like such a big issue that there needs to be a great power competition about it.
Okay, so two points on the political system.
I wonder if we've learned a bad example from World War II and the Cold War, which is that the way in which great power conflict culminates is that the other person totally collapses.
Right.
I think that was actually necessary, obviously, for Hitler and even for the Soviet Union.
I think they were evil regimes.
I think China today is an evil regime in a way, but it's just not...
not in the same order as Stalin or Hitler.
And so the end state for any great power competition for America, you know, history is long, right?
So there will be more than just these next three decades. The end state cannot be that, if there's a different political system, it has to collapse the way that the Soviet Union collapsed or the way that Hitler's regime collapsed.
And then there's the question about, regardless of their political system, there's this economic dislocation.
I mean, the first thing to note there is, this could be true of any country, so I think people conflate these two arguments, whereas...
It's worth noticing that if Australia was producing everything the world consumed and had an economy the size of America's, these arguments should apply to them as well.
And I don't think people have the sense of, well, if Australia is producing a bunch of stuff for us, we need to form a coalition against them and have this adversarial attitude.
But suppose we did.
I think there's a question of, OK, how could you prevent this dislocation?
And is it Australia's fault?
The analogy is breaking down.
So let me just go back to China.
So there's low value-add manufacturing where labor cost is a big fraction of cost.
And that kind of stuff was shipped off to China.
But if it wasn't for China, there's many other countries in the world that have much lower labor costs than the U.S.
So if it wasn't for China, it'd be in like Vietnam or Bangladesh or something.
And then there's a high-tech manufacturing.
But there...
I haven't crunched the numbers, but if I were to guess, I don't think TSMC's leading edge, like what it costs to make a five nanometer wafer, I'm guessing very little of that is like the process engineers themselves.
And that is just like, can your country produce it?
When I was in China, I asked somebody I met –
what would happen if there's an election in China tomorrow?
And his answer was that it's possible that the median voter in China is, you know, much more reactionary than the government, right?
Yes, yes.
In fact, this might be much more liberal than the regime you would get out of a sort of democratic election.
Going back to this discussion about, well, what should the grand bargain between the rest of the world and China be?
I live in Silicon Valley.
And as you know, a big topic of conversation is AI, in particular, the race between China and US on AI.
And one idea I've heard is that what we should do is give them free rein on solar and electric vehicles, batteries, all this other...
real-world heavy manufacturing that they seem to have greater proclivity for anyways, that they consider more real, and then say on AI and semiconductors, well, look, this we want to dominate, everything else we'll import.
Is this a plausible deal worth making?
Because I think if you take AI very seriously, this might be just like an amazing bargain.
Publicly available data is running out, so major AI labs like Meta, Google DeepMind, and OpenAI all partner with Scale to push the boundaries of what's possible.
Through Scale's data foundry, major labs get access to high-quality data to fuel post-training, including advanced reasoning capabilities.
Scale's research team, SEAL, is creating the foundations for integrating advanced AI into society through practical AI safety frameworks and public leaderboards around safety and alignment.
Their latest leaderboards include Humanity's Last Exam, Enigma Eval, Multi-Challenge, and Vista, which test a range of capabilities from expert-level reasoning to multimodal puzzle solving to performance on multi-turn conversations.
Scale also just released Scale Evaluation, which helps diagnose model limitations.
Leading frontier model developers rely on Scale Evaluation to improve the reasoning capabilities of their best models.
If you're an AI researcher or engineer and you want to learn more about how Scale's data foundry and research lab can help you go beyond the current frontier of capabilities, go to scale.com slash dwarkesh.
It's funny because one of the main arguments that China hawks will make or one of the big parts of the worldview is that China really cheated us by our companies would invest in China, build up factories in China, the technology would get transferred there.
And this was like a huge arbitrage that China was able to pull.
And so if you actually believe that, it should just make the reverse of that incredibly compelling because you get a chance to do that again.
Yes.
Do you want to tell the story of BYD and Tesla in 2018 and 2019?
Because I think this really illustrates the point or what could be done in reverse very well.
How much was the $200, $300 billion actually relevant, given the fact that BYD, as you mentioned, is a private company?
Yeah.
Was that all – like, how much of that was actually necessary or counterfactually important to creating this outcome?
And why are they able to identify these kinds of sectors in advance?
Because...
Central planning isn't supposed to work, right?
And there's been many cases of countries which have tried to do this.
Germany, Japan, many other countries missed out on the internet because there was a centrally directed effort towards these heavy industries or manufacturing, which actually turned out not to be relevant in the 21st century.
Actually, it's maybe a couple of firms in the world which are able to make this prediction correctly.
It's like a couple of firms in the world plus the Ministry of Information Technology in China are the only people who can predict the future.
I guess I'm still confused on why many other countries have tried similar things.
Like, MITI was a Japanese version of this, right?
Yeah.
And look at their economy now and their high-tech sector.
I want to put some of these bureaucrats in a Factorio speedrun and see how they do.
LLM tokens are expensive.
So if you're building an AI tool, free trials of your product cost you real money.
And this is well worth it if you're serving actual users.
But if it's all going to bot signups, then it's actually a real notable waste.
Costco can hand out free samples, but not if the whole store is overrun with raccoons.
Unfortunately, AI products are the biggest targets for free tier abuse because these AIs can be used for anything.
There's crazy stories about people making free accounts on Cursor and then piggybacking off these API credits to generate fanfic.
So what can you do?
Well, you can't just put your entire product behind a paywall from the very beginning.
This is where WorkOS Radar comes in.
Radar blocks free trial abuse by distinguishing between real prospective users and malicious bots so that you can keep offering free trials without wasting a bunch of money in the process.
Cursor recently turned on Radar to make sure that their free trials weren't being used to generate random fanfic.
Radar currently runs millions of checks for them every single week.
You can learn more at workos.com slash Radar.
All right, back to Arthur.
I just interviewed Ken Rogoff, so Japan's top of mind for me.
That's really interesting that even though Japan had this export discipline, maybe the reason that the growth couldn't continue after the 80s is because you had this... Was it the convoy system where banks were incentivized to lend to people they knew in these conglomerates?
And these conglomerates had been around for decades, even before World War II.
Right.
Now...
A question I have about China is now that you do have these companies which have become national champions.
Right.
And are conglomerates, right?
The same companies producing a phone and a car and everything in between.
And given this sort of like...
intrinsic nature of maybe authoritarian systems or systems with financial repression?
And will we, again, will we return to, or I don't know if return is the right word, will we go to a system that is closer to what Japan had in the 80s where, okay, now you're going to give money to Huawei because Huawei are the people or BYD are the people, or will it remain dynamic, especially in the sectors which are coming up over the next few decades?
Absolutely.
I mean, this is just stuff from your book that I'm citing back to you, but you talk about these local government financing vehicles, which are backed.
So the local government takes out a lot of loans in order to build this infrastructure.
And that is backed by the...
presumed appreciation of the land which should go up as its infrastructure goes up.
Right.
I mean, that sounds actually very similar to the problem you were describing with Japan.
Well, you say, look, this is debt that's on the local government's balance sheets.
Right.
But, I mean, fundamentally, I don't know, it's like one country, and especially if you have this system of
The government can just like hand the debt to somebody.
So how immune is the private economy really from like, will it just take down the local government and then nobody will make a fuss and these other companies will just never have to hear about it again?
So my previous guest, Victor Shih, I think estimated that the local government debt alone is somewhere between 100% and 150% of GDP. And you add that to the central government debt.
And I think he estimated that government debt in China is 200% of GDP.
Do we know what the total debt, private and public, is in the US?
Why is it the case that middle-income countries
are especially in a bad position with having a high debt-to-GDP ratio because you would think naively that they're in a position to have higher growth over coming years.
And so it makes sense for them to take more debt so they can use that to finance the greater growth that will come, at least theoretically, assuming they use it to invest in high-return things, as opposed to the U.S.
or Japan where you're just going to be stuck with this debt load for a long time.
All right, look at this.
This is just part of an O-1 visa application.
Lighthouse will handle all of this for you.
Earlier this year, I wanted to hire somebody who lives in the UK to come work with me full-time in San Francisco.
He's incredibly talented, but he's had a very non-linear career trajectory.
Because of this, we didn't even know where to begin when it came to immigration.
Lighthouse laid out all the options and helped us understand what was possible.
After we decided to pursue the O-1 visa, they navigated this whole insane convoluted process for us.
It was shockingly easy on our end.
This experience also showed me that for people working on ambitious technical problems like AI, robotics, synthetic biology, I guess podcasts,
There is a way.
Lighthouse only needs your resume or even just your LinkedIn to tell you your options.
And they respond to every single person who fills out their contact form.
I really encourage you to reach out.
You may be more eligible for a visa than you realize.
Go to lighthousehq.com slash get started to learn more.
All right, back to Arthur.
I want to keep asking you more questions about the nitty gritty of the situation.
But before we keep deep diving, I want to step back and ask this question.
I think the valence of things you've said so far has been, look, they've been remarkably competent, even at the things which economists criticize them most for.
For the majority of the period of these schemes, they've actually worked out quite well.
For example, this local government financing through land sales on estimated future income and so forth.
And then obviously we were talking about Chinese industry and how that's been successful in many key sectors.
So I guess a big picture question I have is, if that's true, look, I mean, they're still at like a fifth of American national income per capita, a third of similar countries in East Asia, like Japan, Taiwan, and South Korea.
So how do we explain the relative poverty on a per capita basis in China? How do we explain obviously bad decisions like zero COVID?
Right.
Yeah, I guess I'm having trouble squaring the circle of like, you know, they're making all these great calls.
Yes.
Why isn't China more successful?
Is it your quote in the book, something like, at a country of China's scale, what matters is not the efficiency of using resources, but the effectiveness of achieving outcomes?
I have a question just to react really fast to some of the things you said.
Obviously, there are places in China like Shanghai or Guangdong which have per capita incomes approaching those of these really rich East Asian neighbors.
And to the extent that that's possible through this high-tech manufacturing...
And to the extent that it's already a problem that other countries are complaining that China is doing too much high-tech manufacturing.
I guess it makes sense then, like, well, how could you have literally 10x as many people doing the same thing?
I was going to ask, you said the government has decided to prioritize this high-tech development rather than the growth rate.
And you said it as if there's a trade-off.
But naively, it seems like, isn't high-tech...
supposed to lead to growth?
Why are we on this Pareto frontier between technological development and growth?
I guess I still don't understand.
So point taken that maybe in the short run, these other sectors don't generate as much growth as just building or housing or something.
But by that same token, they don't take that much capital in the scheme of a national economy to sustain.
So if China wants to do a $100 billion fund for semiconductors, that shouldn't detract from its ability to grow. I just don't understand why the macro growth number has to go down for it to be able to do that $100 billion semiconductor fund or something.
So it sounds like the potential growth was constrained not because it was cutting off against high-tech development in other more physical fields, but because it was a threat to political power or to perceived social stability.
So to the extent that growth has gone down because of all these actions...
Not to the extent.
It has gone down because of these actions.
And it doesn't seem like it's happened because they needed that to happen in order to, you know, get SMIC up to snuff.
Right, correct.
It just seems like it's because... No, it's a political choice.
And there's another problem with this focus on specific parts of technology development, which is that in any field you understand well enough, you realize how contingent and random the steps were that led up to what we now consider to be this self-contained high-tech thing.
The field I know better than others is AI.
And just, there's just so many weird things, right?
That like people wanted to play video games.
And so we had decades of progress in making these graphics processing units.
Right.
Which contributed to AI development or people were just like posting on Reddit.
Right.
On Twitter and so forth.
And that fossil fuel of data has been powering these AI models.
And so the idea that you could have just like said in 2000, we want to make progress towards AGI.
And somehow that would have like led somewhere as opposed to what actually ended up happening.
Another thing to consider when discussing the virtue of such a system is that you can't just look at the results over the course of 20 or 30 years and say, well, since they picked the key technologies over this period, therefore this system is preferable. It's somewhat similar to looking at one successful dictator and saying, because they made the right calls, dictatorship is the right model.
Right.
I mean, there have been many other countries which, like... I actually don't know that much about what Japan was doing from the 40s to the 70s, but I assume MITI or whoever made some right calls about which electronics and cars and so forth were important. But, you know, the question is how plausible it is that this level of competence and foresight and luck, or whatever mixture of those it is, will continue into AI, what happens after that, and whatever is required as an input into that.
Yeah.
I mean, I guess speaking of which, let me ask you about that.
Yeah, that's interesting.
What is the story in which China doesn't win AI?
Because in terms of talent of AI researchers themselves, it's a big country with lots of smart people.
That will be there.
We've already seen examples with High-Flyer and DeepSeek where they can come close to the frontier. In terms of chips, eventually SMIC will be able to produce the kinds of H100 equivalents that the 5 nanometer process at TSMC can produce. I don't know how many years away it is, and export controls might still have been net good, but it's not as if this won't happen within the next 5 or 10 years.
And then it's a matter of scaling, scaling production of the chips, scaling energy, scaling maybe people and data collection.
Right.
And what is China good at?
It's scaling, right?
And I think the numbers on energy production are just like absolutely staggering.
Yes.
What is the time interval in which China adds an America-sized amount of power generation?
It's like...
Extrapolate that forward.
And why that's important is so far we've been focused especially on training of AI models.
Right.
When these models start becoming super economically productive, basically by the point at which they are at human level, the more important question will be how many can you deploy, right?
Right.
Why is China such a powerful country, or why is America such a powerful country?
A big part of the reason is we just have more people.
Yes.
That's why China could take over Taiwan.
It's just like it has a billion people and Taiwan has on the order of what, 20 million?
20, 23 million or so, yeah.
So now you just have like more AIs, more AI people, if you have more power, which is ultimately what's upstream.
Okay, anyways, that was all just preface for asking, in 2035, what is the story of how China doesn't absolutely dominate AI?
A previous guest I just had on, Victor Shih, said that the Chinese government might be reluctant to let AI development go full speed because of the destabilizing effects it might have on the political system or the inability potentially to control it.
What's your sense on what will happen as AI becomes more economically valuable, a focal point in discussions about technological development and so forth?
Ironically, this is the one place... I mean, to the extent that China's advantages lie in situations where centralization is important and they can do it, this might be a case where centralization might be helpful and they might not do it, because there are these tremendous economies of scale in AI training and you've got to keep increasing the training cost by 4x every single year.
Right.
Given that the availability of these advanced chips is lower in China, it would be even more helpful to have one actor who can use them all for training. This might be the situation where it would make more sense for the government to say, Huawei and ByteDance, you have to give all your chips to High-Flyer.
Interesting.
That's actually a significant statement, and it maybe updates me downwards on the probability of China winning in AI.
Yeah, especially given the availability of compute in China.
I think one thing that will be important as this AI stuff is happening is to have something that is equivalent to the red telephone.
Right.
I do buy some of the crazy or scary stories about what could happen with AI.
And not necessarily because, like, you know, God takes over, but more so just, like, it's happened many times in human history that some more coordinated group has managed to do some sort of coup or slowly take over control.
So the kind of thing Cortez did or Pizarro did.
And I think a key way that can be prevented, or you can reduce the risk of that, is different human groups should be in tight contact with each other and have the same kind of mechanism that vaccination creates with diseases.
So the key advantage that the Spanish had over the New World empires was that the Spanish knew how each previous conquest had gone, but the Incas and the Aztecs didn't know what strategies were used, how horses work, how steel works.
So how that translates here is, I think there might be some crazy things these AIs try, especially if you have this bifurcated world, which I think is actually very valuable from an alignment and safety perspective, because you have this independent experiment or independent lineage.
If their AI tries to do something crazy, I think it's very important that they feel that they can tell us and we do the exact opposite.
Long preamble to ask the question of how do we set up that kind of high trust or prerequisite understanding on these kinds of issues such that this kind of- That is a terrific question.
The red telephone point you made is actually really interesting.
It didn't occur to me before that AI is such a diffuse thing.
It's like saying we're going to have a red telephone out of the Industrial Revolution.
But on the point about the ability of both sides to perceive each other,
It is true, as you say, that because maybe China has become more closed off, their perception of what's happening in America has been diminished.
I would argue that the opposite is even more true, that our understanding of China is more limited than theirs.
And a big part of this is just that every country in the world...
has some understanding of what's happening in America because of the overwhelming cultural significance, the charismatic nature of American politics.
And everybody knows who the main senators and cabinet people are all around the world.
When I visited China,
six months ago, it was shocking to me that you could go to cities with 20 million people, like Chongqing and Chengdu, and I don't think I saw a single white person in Chongqing, which is insane, right?
Just like you're literally seeing seas of people and none of them are from the West.
And in fact, that was what motivated me, on that trip, to do these kinds of episodes.
I think people are right to say, well, look, fundamentally, I read a book like this and I'm learning a lot about how China tangibly works.
I go there, I'm going to chat up taxi drivers.
I'm not going to learn about whether they're going to invade Taiwan by chatting up taxi drivers, right?
The thing I guess you miss is this more... The kind of thing which should be obvious, and probably isn't obvious to these Congress people who are cavalier about what information they get, and certainly wasn't obvious to me, at least on a sort of more subliminal level, is just how big the country is.
I mean, I think people talk about it as if it's like a small thing, like, oh, China's over there and we're going to deal with it.
Especially when you think about like, we're going to change its government, we're going to constrain its global impact or something.
It's big.
It is really big.
I worry that this sort of cycle of escalation will seem in retrospect to people, especially if it leads to a hot conflict, as sort of mystifying as World War I seems to us now.
Okay, if it's instigated by something in Taiwan, that's a different story.
There you can just tell a very direct through line.
But you look back at World War I and it's like, why did Germany do what it did?
Oh, it was worried about encirclement.
Why was it worried about encirclement?
Well, there's this weird thing with like the Russian, whatever, ambassador didn't get back in time.
And it's just like, it would just be this kind of thing where it's like, why did we have to have this adversarial relationship with China again?
And explain it to me when we're 50 years removed and the day-to-day news isn't as salient.
I think people focus on, there is this interesting volatility in how people think China is doing, where some piece of news will come out about electric vehicles, and people will be like, China will obviously dominate us.
And then maybe some economic data will come out and say, oh, China is collapsing.
Right.
And I think people in their own countries have the sense that a lot of things are happening.
Some things are going well, some things are not going well.
But in like 20 years, America is not going to collapse.
That's right.
And it's also not going to like have destroyed every other country.
Right.
It's going to be like.
And it's just like there's like a long run trend.
That's right.
Right.
Whether China has like a 1% higher growth rate or 1% lower, it's just going to be a powerful nation at the technological frontier.
And there's no sort of like very immediate implication of what exact growth trajectory they are on with respect to the most important questions about how we should engage with them and so forth.
You first went to China in 1980 and you've been- 85, 85.
85.
And you've been visiting and living there on and off ever since.
Right.
I mean, we're all aware that obviously China has developed a lot since then and so forth.
But what are some non-obvious changes you've seen or maybe even things that have stayed the same through all this change?
Yeah.
I think the broader point you're making here is important to emphasize, in the sense that I think sometimes, if people are hawkish on China, they have the sense that China is doing everything wrong, that they're this aggressive belligerent power, you know, that this is the new Stalin or Hitler.
But conversely,
when people say that there should be, I don't know what the right word is, a productive relationship, I think there's also this other attractor state where people are just so uncomfortable with cognitive dissonance that they say, and you can see how well their system works, central planning works, you know, authoritarianism works. You see those two patterns of correlation so often that it's worth emphasizing that you can think you can have a productive relationship with the country and still think that not only is authoritarianism morally wrong, but it has actually had a bad impact on growth, or on the day-to-day life of people and their culture, or whatever else.
I think people sometimes have trouble just holding two thoughts into their head at the same time.
Right.
Final question.
Right now, obviously, the US and China are engaged in this negotiation to figure out some somewhat stable arrangement coming back from Liberation Day.
What is the most positive, plausible story of what comes out of this?
All right.
That's a good note to close on.
Arthur, thanks so much for being on the podcast.
Thank you for tuning in.
I'll see you on the next one.
Today, I'm speaking with Ken Rogoff, who is a professor at Harvard, author recently of Our Dollar, Your Problem, former chief economist at the IMF.
Ken, thanks so much for coming on the podcast.
In your book, you have a lot of anecdotes of meeting different Chinese leaders, especially when you were chief economist at the IMF.
And it seems like you had positive experiences.
You met the premier with your family, and he would listen to your advice.
So, one, how does that inform your view about how competent their leadership is?
And two, how do you think they got into this mess, with their big stimulus or whatever else you think went wrong? To the extent that when you were talking to them in the early 2000s it seemed like, you know, you were kind of seeing eye to eye, or they would understand your perspective, do you think something changed in the meantime?
By the way, on that talk, you mentioned in the book that you had to clear your talk beforehand, and so you gave them a sort of watered-down version of what you would otherwise say.
I have to say that would take gusto to go up to the top party leaders.
Were you nervous while you were giving the talk and you're like, oh, it's too centralized?
Yeah.
I mean, I think a lot of people in that situation usually-
Even though they should or the logic makes sense.
But I think people often don't.
So you've said that the seeds of their current crisis were sown in 2010 with their big stimulus.
So is it wrong then to blame Xi Jinping for this?
It was before his time under Hu Jintao that they did this stimulus that's causing all these problems now, right?
I was there six months ago.
So where did you go?
Shanghai, Beijing, Chongqing, Chengdu, Hangzhou, and Emeishan.
Hangzhou?
When I was in China, we visited a town of half a million people outside of Chengdu, so one of these tier three cities.
And arriving there, I mean, the train station is huge.
Compounds are huge.
Even when you're driving around, like, a movie theater is this humongous complex of... And I realized things are bigger in China.
I was used to that because I'd seen these other cities by that point.
But I just thought, I've seen cities of half a million people.
I live in a city of half a million people in San Francisco, right?
Things are, this just doesn't seem proportionate to the size of the population.
And then we visited a Buddhist temple that had been built recently as a tourist site.
And it was like ginormous.
You would go through one like a little shrine and then behind it would be an even bigger like structure.
And then another one, concentrically for, I don't know, eight turns.
like it would take you probably 10 minutes to drive through this thing.
And there was just nobody there.
It was like me and three other white people.
If it hadn't been for financial repression, and suppose all this investment had been done through purely market mechanisms, would things have turned out much better?
I mean, even today, say today China gets rid of all financial repression, they save a lot, right?
So this money has to go somewhere.
Are there enough productive opportunities to soak up all these savings?
Or could there have been in the past, like...
If they get rid of financial repression, is this problem solved or could it have been solved?
Going back to your point about is purchasing power parity the right way to compare or is nominal the right way to compare?
I think in the book you say the nominal comparison of GDP is better because you can't buy Patriot missiles or oil with purchasing power parity dollars.
But if we're trying to compare the strength of the two countries, the relative strength, especially in a military context, if they can build ships for much cheaper and munitions for much cheaper and they can pay their soldiers less—
Isn't that actually more relevant if we were trying to figure out who would win in a war?
So shouldn't we be actually looking at the fact that they have a bigger PPP economy than us as a sign that they're actually stronger?
What is your projection?
So right now, I think their nominal GDP per capita, sorry, GDP is 75% of America's or something like that.
What's your projection by 2030 and by 2040?
The ratio.
Wait, that means you think they will actually never have a bigger economy than us?
I mean, the 1% per year compression is actually an extremely bearish forecast because even people who are pessimistic about China will say, oh, by 2040, they'll be like 150% or 125% of U.S.
nominal.
They think it'll be bigger, but it'll only be slightly bigger.
And so the fact that you think even by 2040, they won't have caught up is actually very bearish.
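To make the arithmetic behind that concrete, here is a rough sketch of how a roughly 1%-per-year compression would compound from today's roughly 75% ratio. The 0.75 starting point and the 1% rate come from the exchange above; the 5- and 15-year horizons (roughly 2030 and 2040) are assumptions for illustration, not anyone's forecast.

```python
# Rough sketch of the "1% per year compression" arithmetic discussed above.
# Assumes China's nominal GDP starts at ~75% of the US's and the ratio
# compounds upward by about 1% per year; the horizons are illustrative.

def projected_ratio(start_ratio: float, annual_compression: float, years: int) -> float:
    """Compound the China/US nominal GDP ratio forward at a fixed annual rate."""
    return start_ratio * (1 + annual_compression) ** years

for years_ahead, label in [(5, "~2030"), (15, "~2040")]:
    print(f"{label}: ratio = {projected_ratio(0.75, 0.01, years_ahead):.2f}")

# Prints roughly 0.79 and 0.87, still short of parity even by 2040,
# which is why a 1%-per-year compression reads as a bearish forecast.
```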
When Vercel first started working with massive enterprises like GitHub, eBay, and IBM, they realized that these customers couldn't even begin using their platform without single sign-on.
So what did Vercel do?
Well, they could decide to build it themselves, but that would take months.
And plus, they'd need a bunch of other features like SCIM, RBAC, audit logs, and much more.
Instead, they reached for WorkOS, the platform that makes apps enterprise-ready.
With WorkOS, Vercel was able to ship SSO super fast and continue their hypergrowth.
And they're not the only one.
Today, WorkOS powers over a thousand top companies, including AI giants like OpenAI, Anthropic, and Perplexity.
If you're a startup, time is your most precious asset.
And WorkOS is like a time machine that lets you get to enterprise ready faster.
Learn more at WorkOS.com.
Going back to the subject of your book,
So people who are trying to predict when and how China will invade or blockade Taiwan will look at satellite photos of different docks and see how many ships there are.
So they'll look at military preparedness.
From a monetary perspective, are there signs that we could be watching for?
For example, if they think that a lot of their American assets will get sanctioned, or that they won't have access to them, could we see them liquidating those assets?
Or would there be any sort of preparations that we could see on the monetary side that would let us know that they're doing something big that they're preparing for?
What would the alternative rails they would build look like?
Are they buying oil from Iran in RMB?
Will other countries that they need things from accept that?
In 2030, what is their goal?
Okay, so let's talk about Japan, which you also cover in the book, and their crisis. And you blame the U.S.'s pressure on the Japanese, in advance of that crisis, to raise the value of their currency, and the actions by the Bank of Japan.
Zooming out, how much of the crisis is not caused by things like that, but just the fact that high-tech manufacturing as a share of world output was becoming less important?
There's demographic factors as well.
And so something like this was sort of bound to happen to Japan, even if there wasn't some big crisis that preceded it.
South Korea's GDP per capita isn't that high either, at least in comparison to the US. So yeah, how much of this is, like, actions taken by specific actors versus- I mean, South Korea has had crises in 1983 and 1997.
And what is the counterfactual?
So suppose that crisis hadn't happened.
How much wealthier is Japan today than it might have ever been?
Asking as somebody who obviously doesn't know the details, at a high level, how would you explain to a novice, basically: how could a country be 50% less wealthy than it otherwise might have been simply from financial crises? Because whatever they could have otherwise produced, why can't they still produce it? A country is producing a bunch of things. Why are they producing 50% fewer things because of a financial crisis a couple of decades ago?
But there are a lot of economic models where, you know, there's Solow catch-up.
Just to put it into context, what do you think the counterfactual wealth of America looks like without 2008 today?
I think this updates me towards the view that financial crises are even worse than I thought. It isn't just this bad thing that happens and then you recover. If there's a 15% hit lingering even after, what, almost 20 years, then it's just like, oh, wow, this is huge.
Publicly available data is running out, so major AI labs like Meta, Google DeepMind, and OpenAI all partner with Scale to push the boundaries of what's possible.
Through Scale's data foundry, major labs get access to high-quality data to fuel post-training, including advanced reasoning capabilities.
Scale's research team, SEAL, is creating the foundations for integrating advanced AI into society through practical AI safety frameworks and public leaderboards around safety and alignment.
Their latest leaderboards include Humanity's Last Exam, Enigma Eval, Multi-Challenge, and Vista, which test a range of capabilities from expert-level reasoning to multimodal puzzle solving to performance on multi-turn conversations.
Scale also just released Scale Evaluation, which helps diagnose model limitations.
Leading frontier model developers rely on Scale Evaluation to improve the reasoning capabilities of their best models.
If you're an AI researcher or engineer and you want to learn more about how Scale's data foundry and research lab can help you go beyond the current frontier of capabilities, go to scale.com slash dwarkesh.
So you say in the book that you expect there to be another spike in inflation within the next decade, and also that the fiscal position of the United States doesn't seem sustainable. If you go forward 10 years, 20 years, when we do hit this point where the piper comes calling, what actually happens?
Is it going to be some acute crisis like happened in Greece?
Are we going to have some sort of lost decade kind of thing like Japan?
What will happen?
Yeah.
So just for the audience, there's four ways we could get out of the debt.
We could default, which you don't think is likely.
But really good for my book.
Well, I mean, already you timed this one so well.
I'm going to be shorting the market when your next one comes out.
Yeah.
Financial repression.
I guess you could actually cut the deficit or inflation.
But Japan was using its own currency and it didn't default.
You say in the book that we didn't outgrow our World War II debt. What happened instead was that financial repression after World War II, and then the inflation of the 70s, meant that our debt-to-GDP ratio, which otherwise would have been 70-something, was 20-something instead.
And of course, we had inflation recently.
So do you think there's some irrationality in the market for U.S.
government debt already, given the fact that we can forecast what's going to happen here?
They can read your book and say that inflation is going to go up and the debt they're holding will be worth less.
They can look through history at what's happened.
So do you think there's just some irrationality in terms of what people are doing?
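To put rough numbers on the financial-repression mechanism mentioned above, here is a toy sketch of how holding the interest rate on government debt below nominal GDP growth erodes a debt-to-GDP ratio. The 100% starting ratio, 2% nominal rate, and 7% nominal growth are hypothetical values chosen for illustration, not the actual postwar US figures.

```python
# Toy illustration of debt erosion under financial repression:
# if the nominal interest rate on government debt is held below nominal GDP
# growth (real growth plus inflation), the debt-to-GDP ratio shrinks even
# with a balanced primary budget. All parameters below are hypothetical.

def debt_ratio_path(start: float, nominal_rate: float, nominal_growth: float, years: int) -> float:
    """Roll the debt-to-GDP ratio forward, assuming no primary deficit or surplus."""
    ratio = start
    for _ in range(years):
        ratio *= (1 + nominal_rate) / (1 + nominal_growth)
    return ratio

print(f"After 25 years: {debt_ratio_path(1.00, 0.02, 0.07, 25):.2f}")
# Roughly 0.30: the same order of fall as the 70-something-to-20-something
# move described above, without the debt ever being repaid in real terms.
```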
Yeah, though I wonder if from the politician's perspective, the independence of the Fed gives them some sort of way to pass the buck that they're actually happy about.
So they'd be like, oh, I'd love to do this irresponsible thing, but I can't because the Fed is out of my hands?
They could.
It does seem like the Fed works really well as it exists now, right?
It's like independent.
There's people, as you say, who criticize its actions.
But...
On the whole, it seems like a reliable institution which makes smart calls.
They can be wrong, of course, but it seems just so much more competent than much of the rest of government.
And if you wanted to replicate how the Fed works, if you wanted other parts of the government to work this way, is there something we could do?
Or is it just like maybe it's more so a human capital problem rather than an independence problem?
Like bankers and economists are really smart.
And I don't know if you could replicate that in the...
education department or the agriculture department or something.
Right.
But if you wanted to do that, suppose you get called by the Pentagon tomorrow and they say, we want to run the Pentagon like the Fed.
What do you tell them to do?
Before Trump, maybe for intrinsic reasons, maybe because of norms, it was really hard to fire people anyways.
And that didn't produce remarkable competence across the government.
So actually, let me see if I can consolidate some of the things you mentioned.
Maybe it's really important that there's... maybe we should structure more of the government so that, you know, if you're running this department, you have this one target that's similar to the Fed's 2% inflation target.
And that's all you have to do.
Don't worry about anything else.
I do think it's impressive, the fact that the Fed has avoided mission creep, because it seems like every institution in the world does fall into mission creep.
Companies, government departments.
Apart from the political pressure problems from the outside, you're watching, as you were mentioning, your younger colleagues, the people writing the working papers at the Fed, the younger economists, writing about these other issues like inequality or climate change. Even from the inside, basically, given what the younger people in this profession care about, do you still expect, whether you want to call it the competence or the focus or whatever word you want to use, that to just decline by default given the new generation?
So why are you optimistic about what happens when they get in charge?
So going back to the future problems, potentially, if we do go the financial repression route and not the inflation route, how bad will that be?
As you were saying, look, after World War II, we had financial repression, but that was when we had the highest growth ever.
On the other hand, China and Japan, it seems like a lot of their problems might be caused by...
the misallocation of capital caused by financial repression.
So do you have some intuition about how much we could screw ourselves over with that route as opposed to inflation?
Does that mean that US growth would have been even higher after World War II if we had just kept the government debt, or figured out some other way to deal with it, but had let financial markets develop earlier?
And just to make sure we've completed the concrete scenario, so basically your prediction is there will be some crisis, some surge of inflation, then there will be austerity, and then what happens?
Is growth really slow afterwards because government can't spend as much?
What do the next few decades look like in your world?
On the growth thing, Europe's growth has been pretty bad after 2010.
Japan obviously has had pretty bad growth after their crisis.
So why will we be in a different position if we do have this kind of crisis?
Why will growth continue apace?
Last week, I was trying to fix some admin settings for an important software tool that we use here at my podcast.
And I just couldn't figure it out.
I tried talking to different LLMs.
Of course, I tried checking their docs online.
It's one of those tasks that should take two minutes, but then you check back 30 minutes later and you still haven't finished it or made any progress.
I finally figured out what to do by chatting with Gemini's live API.
I started a voice chat with Gemini, I shared my screen, and I talked through what I was trying to do.
It showed me what to click, guided me through a few pages of settings, and 90 seconds later, I was done.
It was like having a remote coworker who knows a lot about everything, and you can just tap them on the shoulder and get them to help you with your task.
Often, LLMs aren't helpful in day-to-day tasks because it's clunky to give them all the context around your problems.
Gemini Live's combination of voice, video, and screen sharing made it helpful to me right away.
Gemini Live API is now available in preview for developers to build real-time conversational experiences.
You can try it out today by going to ai.dev, click stream, and start talking.
All right, back to Kenneth.
Is it possible to believe both that AGI is near and that America's fiscal position is untenable?
Like, what do you mean by saying AGI is coming?
Any job that can be done purely through computers is automated.
So white-collar work, the work we do even, is automated within 20 years.
If AI is going to be massively deflationary, if it makes all these goods so much cheaper, should we be printing a bunch of money to still stick to 2% inflation?
Or does that not matter anymore?
Right, but should they be trying to fight the deflation at all in that world?
Because, you know, we need inflation to, like, root out the rentiers.
We've got to fight downward wage rigidity.
But now the AIs have all the jobs, so you don't need to worry about that.
There are a bunch of biases that humans have, and to get rid of them, we need inflation.
Do we even need that in the world with AI?
Do you expect interest rates to go up?
Because one factor, obviously, is you want to invest in the future.
The future has so much more potential.
Another is maybe you want to consume more now because you know you're going to be wealthy in the future anyways.
You might as well start spending as much as you can right now.
Well, let's talk about it a little bit.
I mean, if we expect interest rates to go up because of AI, what should the government be doing right now to –
be ready for that?
Should they be locking in 100-year bonds at the current interest rates because they're only going up from here?
You say in the book that you expect a rebalancing from U.S.
equities to foreign equities.
U.S.
equities have been outpacing foreign stocks for the last couple of decades.
And you say you expect there's a change or there's some rebalancing.
What is it that causes that?
Is it because you're predicting that the S&P keeps growing at 8% but foreign equities do even better? So not because growth in US equities slows down, just that foreign equities do even better? Or is it that US equities slow down?
If you look internationally, if you had been betting on catch-up, I wonder how that would backtest.
Because there's some intuition, well, if you're poorer than the frontier, it makes sense that it would be easy for you to catch up.
There's another intuition that if you've been persistently behind the frontier, there must be some deep endogenous reason why you... Yeah, no, you're absolutely right.
Is there some institutional reform we could make that would get us out of this political equilibrium we're stuck in, where both parties, when they're in power, are incentivized to increase the debt and there's no institutional check on that proclivity?
But you think that would help?
I mean, I think if anything, if you are longer in office, you might have a sort of more long-term incentive.
I mean, to the extent that a lot of the deficit problems are caused by populism, I don't know how much campaign finance will help.
I don't know.
You're the former chief economist of the IMF.
Well, all right.
It was not your job.
If you think people are underrating how big the debt issue is, are you especially long countries which have a low debt-to-GDP ratio, like Australia or Norway?
Do you think, with this exorbitant privilege that you talk about, is it possible that one way in which it's bad for us is that it allows or incentivizes us to take on more debt than it's wise to?
And especially if it's not a permanent advantage we have that, you know, like...
When you're at the top, you take out this cheap debt, and then over time you lose your reserve status, or it weakens at least, and you have to refinance that debt at higher interest rates.
And so in the short term, you're incentivizing this behavior, which is not sustainable in the long run.
So...
Is there some political economy explanation for why this is bad for us?
This is a very naive question, and I know you address it at length in your book, but just to get a very like...
I'll ask the question in the most straightforward way, and then you can explain what's going on.
How should I think about the fact that we are basically giving the rest of the world pieces of paper, and we're getting real goods and services in exchange?
Sure, at the high level, you can say that they're getting this liquidity or they're getting this network, and that's what makes it worth it.
But I don't know.
Are we fundamentally getting away with something?
There's a really interesting book by Charles Mann called 1493 about how during Ming Dynasty China in the 17th century, they kept issuing different paper currencies and it was super unstable.
And so people in China wanted a reliable medium of exchange.
And so, like, tens of thousands of tons of silver from the New World, from the Spanish, would be exchanged for, you know, enormous amounts of real goods that were exported from China.
And so from the Spanish perspective, they're getting, like,
you know, shiploads and shiploads of real goods.
All they're giving up is this medium of exchange.
I don't know.
I don't know how analogous it is to this situation.
Final question.
A big part of your book discusses the different countries which seemed at different times to be real competitors to America.
You talk about the Soviet Union, Japan, China today.
And we've discussed why they didn't pan out.
And we can go into the details on any one of those examples.
But big picture...
Is there some explanation that generalizes across all these examples of why America has been so competitive or why it's been so hard to displace?
Well, it's a very scary kind of luck, because if it's so easy for these other countries to make some mistake that causes them to totally fall behind, it should update you toward the view that, in general, it's easy for a country to get itself into a rut. It's sort of like the Fermi paradox thing: the fact that you don't see other alien civilizations is actually very scary, because it suggests that there's some kind of filter which makes it really hard to keep intelligent civilization going.
When I was in China, I met up with some venture capitalists there, and they were quite depressed in general.
And even founders say it's hard to raise money, and I was asking them why.
And they said that...
Investors don't want to invest because even if you invest in the next Alibaba, who's to say the government doesn't cancel the IPO?
Okay.
Thank you so much for sitting down with me and also answering all my – I know I'm sure there are many misconceptions and naive questions and so forth, so I appreciate your patience in educating me on this topic.
Yeah.
The honor is mine.
It was great to be able to travel here and speak with you.
Thank you for tuning in.
I'll see you on the next one.