Richard Sutton
Well, yes, I think it's really quite a different point of view.
And the two camps can easily get separated and lose the ability to talk to each other.
And yeah, large language models have become such a big thing.
Generative AI in general, a big thing.
And our field is subject to bandwagons and fashions.
So we lose track of the basic, basic things.
Because I consider reinforcement learning to be basic AI.
And what is intelligence?
The problem is to understand your world.
And reinforcement learning is about understanding your world.
Whereas large language models are about mimicking people, doing what people say you should do.
They're not about figuring out what to do.
I would disagree with most of the things you just said.
Just to mimic what people say is not really to build a model of the world at all, I don't think.
You know, you're mimicking things that have a model of the world, the people.
But I don't want to approach the question in an adversarial way.
But I would question the idea that they have a world model.
So a world model would enable you to predict what would happen.
They have the ability to predict what a person would say.
They don't have the ability to predict what will happen.
What we want, I think, to quote Alan Turing, is a machine that can learn from experience.
Right.
Where experience is the things that actually happen in your life.
You do things, you see what happens, and that's what you learn from.
The large language models learn from something else.
They learn from here's a situation and here's what a person did.
And implicitly, the suggestion is you should do what the person did.
No, I agree that it's the large language model perspective.
I don't think it's a good perspective.
Yeah, curious why.
So, to be a prior for something, there has to be a real thing.
I mean, a prior bit of knowledge should be the basis for actual knowledge.
What is actual knowledge?
There's no definition of actual knowledge in that large language framework.
What makes an action a good action to take?
You recognize the value, the need for continual learning.
Right.
So if you need to learn continually, continually means learning during normal interaction with the world.
Yeah.
And so then there must be some way during the normal interaction to tell what's right.
Yep.
Okay, so...
Is there any way for it to tell, in the large language model setup, what's the right thing to say?
You will say something and you will not get feedback about what the right thing to say is because there's no definition of what the right thing to say is.
There's no goal.
And if there's no goal, then one thing to say is as good as another.
There's no right thing to say.
So there's no ground truth.
You can't have prior knowledge if you don't have ground truth.
Because the prior knowledge is supposed to be a hint or an initial belief about what the truth is.
But there isn't any truth.
There's no right thing to say.
Now, in reinforcement learning, there is a right thing to say or a right thing to do because the right thing to do is the thing that gets you reward.
So we have a definition of what the right thing to do is.
And so we can have prior knowledge or knowledge provided by people about what the right thing to do is.
And then we can check it to see because we have a definition of what the actual right thing to do is.
Now, an even simpler case is when you're trying to make a model of the world.
When you predict what will happen, you predict and then you see what happens.
Okay, so there's ground truth.
There's no ground truth in large language models because you don't have a prediction about what will happen next.
If you say something in your conversation, the large language models have no prediction about what the person will say in response to that or what the response will be.
Oh, no, they will respond to that question, right?
But they have no prediction in the substantive sense that they won't be surprised by what happens.
And if something happens that isn't what you might say they predicted, they will not change because an unexpected thing has happened.
And to learn that, they'd have to make an adjustment.
I'm just saying they don't have, in any meaningful sense, they don't have a prediction of what will happen next.
They will not be surprised by what happens next.
They won't make any changes based on what happens.
What they predict is not what the world will give them in response to what they do.
Let's go back to their lack of goal.
For me, having a goal is the essence of intelligence.
Right.
Something is intelligent if it can achieve goals.
I like John McCarthy's definition that intelligence is the computational part of the ability to achieve goals.
Yeah.
So you have to have goals.
Otherwise, you're just a behaving system.
You're not anything special.
You're not intelligent.
Right.
And you agree that large language models don't have goals.
I think they have a goal.
What's the goal?
Next token prediction.
That's not a goal.
It doesn't change the world.
You know, tokens come at you, and if you predict them, you don't influence them.
Yeah, it's not a goal.
It's not a substantive goal.
You can't look at a system and say, oh, it has a goal if it's just sitting there predicting and being happy with itself that it's predicting accurately.
Well, the math problems are different.
Making a model of the physical world and carrying out the consequences of mathematical assumptions or operations, those are very different things.
The empirical world has to be learned.
You have to learn the consequences.
Whereas the math is more just computational.
It's more like standard planning.
So there they can have a goal to find the proof.
And they are in some way given that goal to find the proof.
It's an interesting question whether large language models are a case of the bitter lesson.
Because they are clearly a way of using massive computation, things that will scale with computation up to the limits of the internet.
But they're also a way of putting in lots of knowledge.
And so this is an interesting question.
It's a sociological or industry question.
Will they reach the limits of the data and be superseded by things that can get more data just from experience rather than from people?
In some ways, it's a classic case of the bitter lesson.
The more human knowledge we put into the large language models, the better they can do.
And so it feels good.
And yet, one, well, I in particular expect there to be systems that can learn from experience, which could well perform much, much better and be much more scalable, in which case it will be another instance of the bitter lesson that the things that used human knowledge were eventually superseded by things that just trained from experience and computation.
Well, in every case of the bitter lesson, you know, you could start with human knowledge.
Right.
And then do the scalable things.
Yeah.
That's always the case.
And there's never any reason why that has to be bad.
Right.
But in fact, and in practice, it has always turned out to be bad.
Because people get locked into the human knowledge approach, and psychologically... you know, now I'm speculating about why it is, but this is what has always happened.
Yeah.
Yeah, their lunch gets eaten by the methods that are truly scalable.
Yeah, give me a sense of what the scalable method is.
The scalable method is you learn from experience.
You try things, you see what works.
No one has to tell you.
First of all, you have a goal.
So without a goal, there's no sense of right or wrong or better or worse.
So large language models are trying to get by without having a goal or a sense of better or worse.
That's just, you know, it's exactly starting in the wrong place.
How old are these kids?
It's surprising.
You can have such a different point of view.
When I see kids, I see kids just trying things and waving their hands around and moving their eyes around.
And no one tells them... There's no imitation for how they move their eyes around or even the sounds they make.
They may want to create the same sounds, but the actions, the thing that the...
The large language model is learning from training data.
It's not learning from experience.
It's learning from something that will never be available during its normal life.
There's never any training data that says you should do this action in normal life.
Okay, I shouldn't have said never.
But I don't know.
I think I would even say it about school.
But formal schooling is the exception.
Don't be difficult.
I mean, this is obvious.
So I don't think learning is really about training.
I think learning is about learning.
It's about an active process.
The child tries things and sees what happens.
Right.
Yeah, it does not.
We don't think about training when we think of an infant growing up.
These things are actually rather well understood.
If you go and look at how psychologists think about learning, there's nothing like imitation.
Maybe there are some extreme cases where humans might do that or appear to do that, but there is no basic animal learning process called imitation.
The basic animal learning process is for prediction and for trial and error control.
I mean, it's really interesting how sometimes the hardest things to see are the obvious ones.
It's obvious if you just look at animals and how they learn, and you look at psychology and our theories of them, it's obvious that supervised learning is not part of the way animals learn.
We don't have examples of desired behavior.
What we have is examples of things that happened, one thing that followed another.
And we have examples of we did something and there were consequences.
But there are no examples of supervised learning.
Supervised learning is not something that happens in nature.
And, you know...
Even if that were the case for school, we should forget about it, because that's some special thing that happens in people.
It doesn't happen broadly in nature.
Squirrels don't go to school.
Squirrels can learn all about the world.
It's absolutely obvious, I would say, that supervised learning doesn't happen in animals.
Why are you trying to distinguish humans?
Humans are animals.
What we have in common is more interesting.
What we have, what distinguishes us, we should be paying less attention to.
So I like the way you consider that obvious because I consider the opposite obvious.
Yeah, I think we have to understand how we are animals.
And if we understood a squirrel, I think we'd be almost all the way there to understanding human intelligence.
The language part is just a small veneer on the surface.
Okay, so this is great.
You know, we're finding out the very different ways that we're thinking.
We're not arguing.
We're trying to share our different ways of thinking with each other.
No, I think about it the same way.
But still, it's a small thing on top of basic trial and error learning, prediction learning.
And that's what distinguishes us, perhaps, from many animals.
But we're an animal first.
And we were an animal before we had language and all those other things.
Morphics.
Let's lay out a little bit about what it is.
It says that experience, sensation, action, reward, and then this happening on and on and on, makes up your life.
It says that this is the foundation and the focus of intelligence.
Intelligence is about taking that stream and altering the actions to increase the rewards in the stream.
Right.
So learning, then, is from the stream, and learning is about the stream.
So that second part is particularly telling.
What you learn, your knowledge, your knowledge is about the stream.
Your knowledge is about if you do some action, what will happen?
Or it's about which events will follow other events.
It's about the stream.
The content of the knowledge is statements about the stream.
And so because it's a statement about the stream, you can test it by comparing it to the stream and you can learn it continually.
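To make that concrete, here is a minimal sketch (my own illustration, nothing from the conversation) of the stream being described: sensation, action, reward, repeating on and on, with knowledge being whatever the agent can state and test about that stream. The Environment and Agent classes and all their names are hypothetical placeholders.

```python
# A minimal sketch of the experience stream: sensation, action, reward, on and on.
# Everything here is a placeholder for illustration.
import random

class Environment:
    def step(self, action):
        # The world responds to the action with a new sensation and a reward.
        sensation = random.random()
        reward = 1.0 if action == "explore" else 0.0
        return sensation, reward

class Agent:
    def act(self, sensation):
        # Placeholder policy; a real agent would adapt this to increase reward.
        return random.choice(["explore", "exploit"])

    def learn(self, sensation, action, reward, next_sensation):
        # Knowledge is statements about the stream, tested against the stream.
        pass

env, agent = Environment(), Agent()
sensation = 0.0
for _ in range(10):  # the stream: sensation, action, reward, on and on
    action = agent.act(sensation)
    next_sensation, reward = env.step(action)
    agent.learn(sensation, action, reward, next_sensation)
    sensation = next_sensation
```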
So when you're imagining this future continual learning agent.
They're not future.
Of course, they exist all the time.
This is what the reinforcement learning paradigm is: learning from experience.
The reward function is arbitrary.
And so if you're playing chess, it's to win the game of chess.
If you're a squirrel, maybe the reward has to do with getting nuts.
Right.
In general, for an animal, you would say the reward is to avoid pain and to acquire pleasure.
Right.
And I think there should also be a component having to do with your increasing understanding of your environment.
That would be sort of an intrinsic motivation.
I don't like the word model when used the way you just did.
I think a better word would be the network.
So I think you mean the network.
Maybe there's many networks.
So anyway, things would be learned and then you'd have copies and many instances.
And sure, you'd want to share knowledge across all the instances.
And there would be lots of possibilities for doing that.
In a way there is not today: one child can't grow up and learn about the world and then hand that over; every new child has to repeat that process.
Whereas with AIs, with the digital intelligence, you could hope to do it once and then copy it into the next one as a starting place.
So this would be a huge savings and I think actually it would be much more important than trying to learn from people.
So this is something we know very well, and the basis of it is temporal difference learning, where the same thing happens on a less grandiose scale, like when you learn to play chess.
The long-term goal is winning the game, and yet you want to be able to learn from shorter-term things, like taking your opponent's pieces.
And so you do that by having a value function, which predicts the long-term outcome.
And then if you take the guy's pieces, well, your prediction about the long-term outcome is changed.
It goes up.
You think you're going to win.
And then that change, that increase in your belief, immediately, quote, reinforces the move that led to taking the piece. Okay, so we have this long-term, ten-year goal of making a startup and making a lot of money, and so when we make progress, we say, oh, I'm more likely to achieve the long-term goal, and that rewards the steps along the way.
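As a concrete illustration of the idea just described, here is a minimal tabular TD(0) sketch (my own hedged example; the tiny chess-like chain, its transition probabilities, and the constants are invented). The value function predicts the long-term outcome, and the change in that prediction, the TD error, is what "reinforces" the step that produced it.

```python
# Minimal tabular TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
import random

V = {"start": 0.0, "took_piece": 0.0, "win": 0.0}  # predicted long-term outcome per state
alpha, gamma = 0.1, 0.9                            # step size and discount

def step(state):
    # Hypothetical dynamics: taking a piece usually leads to winning (reward 1).
    if state == "start":
        return "took_piece", 0.0
    return ("win", 1.0) if random.random() < 0.8 else ("start", 0.0)

for episode in range(2000):
    s = "start"
    while s != "win":
        s_next, r = step(s)
        td_error = r + gamma * V[s_next] - V[s]  # did the predicted outcome go up or down?
        V[s] += alpha * td_error                 # a positive error immediately credits this step
        s = s_next

print(V)  # V["took_piece"] settles above V["start"]: taking the piece raised the prediction
```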
I think the crux of this, and I'm not sure, but...
The big world hypothesis seems very relevant, and the reason why humans become useful on the job is because they are encountering their particular part of the world, which can't have been anticipated and can't all have been put in in advance.
The world is so huge that you can't... The dream, as I see it, the dream of large language models is you can teach the agent everything and it will know everything and it won't have to learn anything online.
Right, during its life. Okay. And your examples are all, well, really saying that there's a lot you can teach it, but there are all the little idiosyncrasies of the particular life they're leading and the particular people they're working with and what they like, as opposed to what average people like. And so that's just saying the world is really big, and so you're going to have to learn it along the way.
And I'm- So I would say you're just doing regular learning.
Maybe using context, because in large language models, all that information has to go into the context window.
Right.
But in a continual learning setup, it just goes into the weights.
Maybe, yeah, so maybe context is the wrong word to use, because I mean a more general thing.
You learn a policy that's specific to the environment that you're finding yourself in.
So maybe the question we're trying to ask is: it seems like the reward is too small a thing to do all the learning that we need to do.
But, of course, we have the sensations, right?
We have all the other information we can learn from.
Right.
We don't just learn from the reward.
We learn from all the data.
So now I want to talk about the common model of the agent, with its four parts.
Right.
So we need a policy.
The policy says...
In the situation I'm in, what should I do?
We need a value function.
The value function is the thing that is learned with TD learning.
And the value function produces a number.
The number says, how well is it going?
And then you watch if that's going up and down and use that to adjust your policy.
Okay, so those two things.
And then there's also the perception component, which is the construction of your state representation, your sense of where you are now.
And the fourth one is what we're really getting at, most transparently anyway.
The fourth one is the transition model of the world.
That's why I am uncomfortable just calling everything models, because I want to talk about the model of the world.
The transition model of the world: your belief that if you do this, what will happen, what will be the consequences of what you do. So, your physics of the world. But it's not just physics, it's also abstract models, like, you know, your model of how you traveled from California up to Edmonton for this podcast. That was a model, and that's a transition model, and that would be learned. And it's not learned from reward; it's learned from... you did things, you saw what happened, and you made that model of the world.
That will be learned very richly from all the sensation that you receive, not just from the reward.
It has to include the reward as well, but that's a small part of the whole model, small crucial part of the whole model.
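A minimal skeleton of the four-part common model of the agent described above, policy, value function, perception, and transition model, might look like the following. This is my own illustrative sketch, not Sutton's code; every name in it is a placeholder.

```python
# A minimal skeleton of the four-part common model of the agent.
class Agent:
    def __init__(self):
        self.policy = {}      # state -> action: in the situation I'm in, what should I do?
        self.values = {}      # state -> number: how well is it going?
        self.transition = {}  # (state, action) -> predicted next state: if I do this, what happens?

    def perceive(self, observation, last_state):
        # Perception: construct the state representation, the sense of where you are now.
        # (A real agent would combine the new observation with its history; this is trivial.)
        return observation

    def act(self, state):
        # Policy: pick an action for the current state.
        return self.policy.get(state, "default_action")

    def evaluate(self, state):
        # Value function: the number that says how well it's going (learned with TD).
        return self.values.get(state, 0.0)

    def predict(self, state, action):
        # Transition model: belief about the consequences of what you do.
        return self.transition.get((state, action))

    def update_model(self, state, action, next_state):
        # Learned not from reward, but from doing things and seeing what happened.
        self.transition[(state, action)] = next_state
```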
The idea is totally general.
I do use, all the time, as my canonical example, the idea that an AI agent is like a person.
And people, in some sense, they have just one world they live in.
And that world may involve chess and it may involve Atari games.
But those are not a different task or a different world.
Those are different states that they encounter.
Right.
And so the general idea is not limited at all.
They just set it up.
It was not their ambition to have one agent across those games.
If we want to talk about transfer, we should talk about transfer, not across games or across tasks, but transfer between states.
We're not seeing transfer anywhere.
We're not seeing general... Critical to good performance is that you can generalize well from one state to another state.
We don't have any methods that are good at that.
What we have is people trying different things, and they settle on a representation that transfers well or generalizes well.
But we have very few automated techniques to promote transfer.
And none of them are used in modern deep learning.
The researchers did it.
Because there's no other explanation.
Gradient descent will not make you generalize well.
It will make you solve the problem.
It will not make you generalize in a good way on new data.
Generalization means train on one thing that affects what you do on the other things.
So we know deep learning is really bad at this.
For example, we know that if you train on some new thing, it will often catastrophically interfere with all the old things that you knew.
So this is exactly bad generalization.
Right.
Generalization, as I said, is some kind of influence of training on one state on other states.
And generalization is not necessarily good or bad.
Just the fact that you generalize is not necessarily good or bad.
You can generalize poorly, you can generalize well.
So generalization always will happen, but we need algorithms that will cause the generalization to be good rather than bad.
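The interference point above can be demonstrated in a few lines. The following is a small NumPy sketch of my own (the network size, target function, and learning settings are arbitrary choices): a tiny network is fit on one region of a function, then trained only on a second region, and its error on the first region typically gets much worse.

```python
# Catastrophic interference: training on a new region degrades the old one.
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(0, 0.5, (1, 32)), np.zeros(32)
W2, b2 = rng.normal(0, 0.5, (32, 1)), np.zeros(1)

def forward(x):
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

def train(x, y, steps=4000, lr=0.05):
    global W1, b1, W2, b2
    for _ in range(steps):
        h, pred = forward(x)
        err = (pred - y) / len(x)            # gradient of 0.5 * mean squared error
        gW2, gb2 = h.T @ err, err.sum(0)
        dh = (err @ W2.T) * (1 - h ** 2)
        gW1, gb1 = x.T @ dh, dh.sum(0)
        W1 -= lr * gW1; b1 -= lr * gb1; W2 -= lr * gW2; b2 -= lr * gb2

def error(x, y):
    return float(((forward(x)[1] - y) ** 2).mean())

xA = np.linspace(-2, 0, 100)[:, None]; yA = np.sin(3 * xA)   # the "old" region
xB = np.linspace(0, 2, 100)[:, None];  yB = np.sin(3 * xB)   # the "new" region

train(xA, yA)
print("error on old region after learning it:", error(xA, yA))
train(xB, yB)                                                 # train only on the new region
print("error on old region after learning the new one:", error(xA, yA))
```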
Well, large language models are so complex.
We don't really know what information they had prior.
We have to guess because they've been fed so much.
This is one reason why they're not a good way to do science.
It's just so uncontrolled, so unknown.
But if you come up with an entirely new... They're getting a bunch of things right, perhaps.
And so the question is why?
Well, it may be that they don't need to generalize to get them right because the only way to get some of them right is to form something which gets all of them right.
Right.
So if there's only one answer and you find it, that's not called generalization.
It's the only way to solve it, and so they find the only way to solve it.
Generalization is when it could be this way, it could be that way, and they do it the good way.
Well, there's nothing in them which will cause it to generalize well.
Gradient descent will cause them to find a solution to the problems they've seen.
And if there's only one way to solve them, they'll do that.
But there may be many ways to solve it, some of which generalize well, some of which generalize poorly.
There's nothing in them, in the algorithms, that will cause them to generalize well.
But people, of course, are involved.
And if it's not working out, they fiddle with it.
Right.
until they find a way, perhaps until they find a way which it generalizes well.
Okay.
So, yeah, I...
thought a little bit about this.
There are many things, or a handful of things.
First, the large language models are surprising.
It's surprising how effective artificial neural networks are at language tasks.
That was a surprise.
It wasn't expected.
Language seemed different.
So that's impressive.
There's a long-standing controversy in AI between the simple, basic-principle methods, the general-purpose methods like search and learning, and the systems imbued with human knowledge, like the symbolic methods.
So in the old days, it was interesting because things like search and learning were called weak methods because they just use general principles.
They're not using the power that comes from imbuing a system with human knowledge.
So those were called strong and weak.
And so I think the weak methods have just totally won.
That's the biggest question from the old days of AI.
What would happen?
Learning and search have just won the day.
But there's a sense in which that was not surprising to me, because I was always voting for, or hoping for, or rooting for, the simple basic principles.
And so even with the large language models, it's surprising how well it worked, but it was all good and gratifying.
And things like AlphaGo, it's sort of surprising how well that was able to work.
And AlphaZero in particular, how well it was able to work.
But it's all very gratifying because, again, it's simple basic principles are winning the day.
So the whole AlphaGo thing has a precursor, which is TD Gammon.
Gerry Tesauro did exactly that: used reinforcement learning, temporal difference learning methods, to play backgammon.
And it beat the world's best players.
And it worked really well.
And so in some sense, AlphaGo was merely a scaling up of that process.
It was quite a bit of scaling up, and there was also an additional innovation in how the search was done.
But it made sense.
It wasn't surprising in that sense.
AlphaGo actually didn't use TD Learning.
It waited to see the final outcomes.
But AlphaZero used TD and AlphaZero was applied to all the other games and did extremely well.
I've always been very impressed by the way AlphaZero plays chess, because I'm a chess player, and it just sacrifices material for sort of positional advantages.
And it's just content and patient to sacrifice that material for a long period of time.
And so that was surprising that it worked so well, but also gratifying and fitting into my worldview.
Yeah.
So this has led me where I am.
Where I am is, I'm in some sense a contrarian, thinking differently from the way the field is thinking.
And I am personally just kind of content being out of sync with my field for a long period of time, perhaps decades, because occasionally I have been proved right in the past.
And the other thing I do to help me not feel I'm out of sync and thinking in a strange way is to look not at my local environment or my local field, but to look back in time and into history and to see what people have thought classically about the mind in many different fields.
And I don't feel I'm out of sync with the larger traditions.
I really view myself as a classicist rather than as a contrarian.
I go to what the larger community of thinkers about the mind have always thought.
You want to presume that it's been done.
Well, but you're using it to get AGI again.
So these AGIs, if they're not superhuman already, then the knowledge that they might impart would be not superhuman.
I'm not sure your idea makes sense because it seems to presume the existence of AGI.
And then we've already worked that out.
And the way AlphaZero was an improvement was it did not use the human knowledge, but just went from experience.
Right.
So why do you say bring in other agents' expertise to teach it when it's worked so well from experience and not by help from another agent?
Right.
I think more interesting is just think about that case.
Which when you have many AIs, will they help each other?
the way cultural evolution works in people.
Let's just, maybe we should talk about that.
The bitter lesson? Oh, who cares about that? That's an empirical observation about a particular period in history, seventy years, no longer; it doesn't necessarily have to apply to the next seventy years.
So the interesting question is, you're an AI, you get some more computer power.
Should you use it to make yourself more computationally capable?
Or should you use it to spawn off a copy of yourself to go learn something interesting on the other side of the planet or on some other topic and then report back to you?
Yep.
I think that's a really interesting question that will only arise in the age of digital intelligences.
I'm not sure what the answer is, but I think it will... More questions.
Will it be possible to really spawn it off, send it out, learn something new, something perhaps very new, and then will it be able to be reincorporated into the original?
Or will it have changed so much that...
It can't really be done.
Is that possible or is it not?
And you can carry this to its limit; I saw one of your videos the other night that suggested you could, where you spawn off many, many copies, doing different things, highly decentralized, but reporting back to the central master.
And that this will be such a powerful thing.
Well, I think one thing that, so this is my attempt to add something to this view, is that a big question, a big issue will become corruption.
You know, if you really could just get information from anywhere and bring it into your central mind, you could become more and more powerful.
And it's all digital and they all speak some internal digital language.
Maybe it'll be easy and possible, but it will not be as easy as you're imagining, because you can lose your mind this way.
If you pull in something from the outside and build it into your inner thinking, it could take over you.
It could change you.
It could be your destruction rather than your increment in knowledge.
I think this will become a big concern, particularly when, oh, this one has figured out all about how to play some new game, or has studied Indonesia, and you want to incorporate that into your mind.
Yeah.
So you think, oh, just read it all in.
And that'll be fine.
But no, you've just read a whole bunch of bits into your mind.
And they could have viruses in them.
They could have hidden goals.
They can warp you and change you.
And this will become a big thing.
How do you have cybersecurity in the age of digital spawning and reforming again?
Yeah, so I do think succession to digital or digital intelligence or augmented humans is inevitable.
So the argument... I have a four-part argument.
Step one is, there's no government or organization that gives humanity a unified point of view, one that dominates and can arrange things.
There's no consensus about how the world should be run.
And number two, we will figure out how intelligence works.
Researchers will figure it out eventually.
And number three, we won't stop just with human-level intelligence; we will reach superintelligence.
And number four is that it's inevitable over time that the most intelligent things around would gain resources and power.
And so, put all that together, and it's sort of inevitable that you're going to have succession to AI or to AI-enabled augmented humans.
Those four things seem clear and sure to happen.
But within that set of possibilities, there can be good outcomes as well as less good outcomes, bad outcomes.
So I'm just trying to be realistic about where we are and ask how we should feel about it.
Right.
And so then I do encourage people to think positively about it, first of all, because it's something we humans have always tried to do, for thousands of years: tried to understand ourselves, tried to make ourselves think better.
And, you know, just to understand ourselves.
So this is a great success of science, of the humanities.
We're finding out what this essential part of humanness is, what it means to be intelligent.
And then what I usually say is that this is all kind of human-centric.
What if you step aside from being a human and just take the point of view of the universe?
And this is, I think, a major stage in the universe, a major transition: a transition from replicators... humans and animals and plants, we're all replicators, and that gives us some strengths and some limitations... to the age of design. Our AIs are designed, all of our physical objects are designed, our buildings are designed, our technology is designed, and we're now designing AIs, things that can be intelligent themselves and that are themselves capable of design. And so this is a key step in the world and in the universe, I think.
So it's the transition from a world in which most of the interesting things are replicated.
Replicated means you can make copies of them, but you don't really understand them.
Like right now we can make more intelligent beings, more children, but we don't really understand how intelligence works.
Right.
Whereas we're now reaching the point of having designed intelligence, intelligence where we do understand how it works, and therefore we can change it in different ways and at different speeds than otherwise.
And in our future, they might not be replicated at all.
We may just design AIs, and those AIs will design other AIs, and everything will be done by design and construction rather than by replication.
Yeah, I mark this as one of the four great stages of the universe.
First there's dust, and then stars, and the stars make planets, and the planets give rise to life, and now life is giving rise to designed entities.
And so I think we should be proud that we are giving rise to this great transition in the universe.
Yeah, so it's an interesting thing.
Should we consider them part of humanity or different from humanity?
It's our choice.
It's our choice whether we say, oh, they are our offspring and we should be proud of them and we should celebrate their achievements.
Or we could say, oh, no, they're not us and we should be horrified.
It's interesting that it feels to me like a choice, and yet it's such a strongly held thing that, how can it be a choice?
I like these sort of contradictory implications of thought.
So are you thinking, like, maybe we are like the Neanderthals who gave rise to Homo sapiens?
Maybe Homo sapiens will give rise to a new group of people.
Well, I think it's relevant to point out that for most of humanity, they don't have much influence on what happens.
Most of humanity doesn't influence who can control the atom bomb or who controls the nation states.
Even as a citizen, I often feel that we don't control the nation states very much.
They're out of control.
A lot of it has to do with just how you feel about change.
And if you think the current situation is really, really good, then you're more likely to be suspicious of change and averse to change than if you think...
it's imperfect.
And I think it's imperfect.
In fact, I think it's pretty bad.
So I'm open to change.
And I think humanity has had a super good track record.
And maybe it's the best thing that there's been, but it's far from perfect.
We should be concerned about our future, the future.
We should try to make it good.
We also, though, should recognize the limits, our limits.
And I think we want to avoid the feeling of entitlement, avoid the feeling, oh, we're here first.
We should always have it in a good way.
How should we think about the future and how much control a particular species on a particular planet should have over it?
And how much control do we have?
You know, a counterbalance to our limited control over the long-term future of humanity should be how much control we do have over our own lives.
Like we have our own goals and we have our families and those things are much more controllable than like trying to control the whole universe.
So I think it's appropriate for us to really work towards our own local goals.
And it's kind of aggressive for us to say, oh, the future has to evolve this way that I want it to.
Sure.
Because then we'll have arguments.
Different people think the future, the global future should evolve in different ways.
And they have conflict.
So you're saying we're trying to design the future and the principles by which it will evolve and come into being.
Right.
And so the first thing you're saying is, well, we try to teach our children general principles which will make the better evolutions more likely.
Maybe we should also seek for things being voluntary.
If there is change, we want it to be voluntary rather than imposed on people.
I think that's a very important point.
And yeah, that's all good.
I think this is, you know, one of the really big human enterprises: to design society.
And that's been ongoing for thousands of years again.
And so it's like, the more things change, the more they stay the same.
We still have to figure out how to be.
The children will still come up with different values that seem strange to their parents and their grandparents and things will evolve.
Okay.
Thank you very much.