Lex Fridman Podcast

#426 – Edward Gibson: Human Language, Psycholinguistics, Syntax, Grammar & LLMs

Wed, 17 Apr 2024

Description

Edward Gibson is a psycholinguistics professor at MIT and heads the MIT Language Lab. Please support this podcast by checking out our sponsors:
- Yahoo Finance: https://yahoofinance.com
- Listening: https://listening.com/lex and use code LEX to get one month free
- Policygenius: https://policygenius.com/lex
- Shopify: https://shopify.com/lex to get $1 per month trial
- Eight Sleep: https://eightsleep.com/lex to get special savings

Transcript: https://lexfridman.com/edward-gibson-transcript

EPISODE LINKS:
Edward's X: https://x.com/LanguageMIT
TedLab: https://tedlab.mit.edu/
Edward's Google Scholar: https://scholar.google.com/citations?user=4FsWE64AAAAJ
TedLab's YouTube: https://youtube.com/@Tedlab-MIT

PODCAST INFO:
Podcast website: https://lexfridman.com/podcast
Apple Podcasts: https://apple.co/2lwqZIr
Spotify: https://spoti.fi/2nEwCF8
RSS: https://lexfridman.com/feed/podcast/
YouTube Full Episodes: https://youtube.com/lexfridman
YouTube Clips: https://youtube.com/lexclips

SUPPORT & CONNECT:
- Check out the sponsors above, it's the best way to support this podcast
- Support on Patreon: https://www.patreon.com/lexfridman
- Twitter: https://twitter.com/lexfridman
- Instagram: https://www.instagram.com/lexfridman
- LinkedIn: https://www.linkedin.com/in/lexfridman
- Facebook: https://www.facebook.com/lexfridman
- Medium: https://medium.com/@lexfridman

OUTLINE:
Here's the timestamps for the episode. On some podcast players you should be able to click the timestamp to jump to that time.
(00:00) - Introduction
(10:53) - Human language
(14:59) - Generalizations in language
(20:46) - Dependency grammar
(30:45) - Morphology
(39:20) - Evolution of languages
(42:40) - Noam Chomsky
(1:26:46) - Thinking and language
(1:40:16) - LLMs
(1:53:14) - Center embedding
(2:19:42) - Learning a new language
(2:23:34) - Nature vs nurture
(2:30:10) - Culture and language
(2:44:38) - Universal language
(2:49:01) - Language translation
(2:52:16) - Animal communication

Transcription

0.169 - 20.61 Lex Fridman

The following is a conversation with Edward Gibson, or Ted, as everybody calls him. He is a psycholinguistics professor at MIT. He heads the MIT Language Lab that investigates why human languages look the way they do, the relationship between culture and language, and how people represent, process, and learn language.

21.431 - 47.311 Lex Fridman

Also, he should have a book titled Syntax, A Cognitive Approach, published by MIT Press, coming out this fall. So look out for that. And now a quick few second mention of each sponsor. Check them out in the description. It's the best way to support this podcast. We got Yahoo Finance for basically everything you've ever needed if you're an investor. Listening for listening to research papers.

47.411 - 70.056 Lex Fridman

Policy Genius for insurance. Shopify for selling stuff online. And Eight Sleep for naps. Choose wisely, my friends. Also, if you want to work with our amazing team or just get in touch with me, go to lexfridman.com slash contact. And now, onto the full ad reads. As always, no ads in the middle. I try to make this interesting, but if you must skip them, friends, please still check out the sponsors.

70.256 - 93.643 Lex Fridman

I enjoyed their stuff, maybe you will too. This episode is brought to you by Yahoo Finance, a new sponsor. And they got a new website that you should check out. It's a website that provides financial management reports, information, and news for investors. Yahoo itself has been around forever. Yahoo Finance has been around forever. I don't know how long, but it must be over 20 years.

94.383 - 117.881 Lex Fridman

It survived so much. It evolved rapidly and quickly, adjusting, evolving, improving, all of that. The thing I use it for now is there's a portfolio that you can add your account to. Ever since I had zero money, I used, boy, I think it's called TD Ameritrade. I still use that same thing, just getting a basic mutual fund account.

118.361 - 140.812 Lex Fridman

And I think TD Ameritrade got bought by Charles Schwab or acquired or merged. I don't know. I don't know how these things work. All I know is that Yahoo Finance can integrate that and just show me everything I need to know about my quote-unquote portfolio. I don't have anything interesting going on, but it is still good to kind of monitor it, to stay in touch.

142.699 - 157.65 Lex Fridman

Now, a lot of people I know have a lot more interesting stuff going on investment-wise, so all of that could be easily integrated into Yahoo Finance, and you can look at all that stuff, the charts, blah, blah, blah. It looks beautiful and sexy and just helps you be informed.

158.23 - 177.864 Lex Fridman

Now, that's about your own portfolio, but then also for the entirety of the finance information for the entirety of the world. That's all there: the big news, the analysis of everything that's going on, everything like that. And I should also mention that I would like to do more and more financial episodes. I've done a couple of conversations with Ray Dalio.

178.625 - 197.88 Lex Fridman

A lot of that is about finance, but some of that is about sort of geopolitics and the bigger context of finance. I just recently did a conversation with Bill Ackman, very much about finance. And I did a series of conversations on cryptocurrency. Lots and lots of brilliant people, Michael Saylor, so on.

198.741 - 221.927 Lex Fridman

Charles Hoskinson, Vitalik, I mean just lots of brilliant people in that space thinking about the future of money, future of finance. Anyway, you can keep track of all of that with Yahoo Finance. For comprehensive financial news and analysis, go to yahoofinance.com. That's yahoofinance.com. This episode is also brought to you by Listening, an app that allows you to listen to academic papers.

222.507 - 244.087 Lex Fridman

It's a thing I've always wished existed, and I always kind of suspected it's very difficult to pull off, but these guys pulled it off. Basically, it's any kind of formatted text brought to life through audio. Now for me, the thing I care about most, and I think that's at the foundation of Listening, is academic papers.

244.668 - 265.561 Lex Fridman

So I love to read academic papers, and there's several levels of rigor in the actual reading process, but listening to them, especially after I skimmed it, or after I did a deep dive, listening to them is just such a beautiful experience. It solidifies the understanding. It brings to life all kinds of thoughts.

265.941 - 289.209 Lex Fridman

And I'm doing this while I'm cooking, while I'm running, while I'm going to grab a coffee, all that kind of stuff. It does require an elevated level of focus, especially the kind of papers I listen to, which are computer science papers. But you can load in all kinds of stuff. You can do philosophy papers. You can do psychology papers. Like this very topic of linguistics.

289.249 - 310.735 Lex Fridman

I've listened to a few papers on linguistics. I went back to Chomsky and listened to papers. It's great. Papers, books, PDFs, webpages, articles, all that kind of stuff. Even email newsletters. And the voices they got are pretty sexy. It's great. It's pleasant to listen to. I think that's what's ultimately most important is it shouldn't feel like a chore to listen to it. Like I really enjoy it.

311.735 - 335.589 Lex Fridman

Normally you'd get a two week free trial, but listeners of this podcast get one month free. So go to listening.com slash Lex. That's listening.com slash Lex. This episode is brought to you by Policy Genius, a marketplace for insurance, life, auto, home, disability, all kinds of insurance. There's really nice tools for comparison. I'm a big fan of nice tools for comparison.

336.089 - 363.999 Lex Fridman

Like I have to travel to harsh conditions soon, and I have to figure out how I need to update my equipment to make sure it's weatherproof, waterproof even. It's just resilient to harsh conditions. And it would be nice to have sort of comparisons. I have to resort to like Reddit posts or forum posts kind of debating different audio recorders and cabling and microphones and...

365.18 - 389.729 Lex Fridman

and waterproof containers, all that kind of stuff. I would love to be able to do a rigorous comparison of them. Of course, going to Amazon, you get the reviews, and those are actually really, really solid. And so I think Amazon's been the giant gift to society in that way, that you kind of can lay out all the different options and get a lot of structured analysis of how good

390.049 - 414.312 Lex Fridman

this thing is, so Amazon's been great at that. Now, what Policy Genius did is did the Amazon thing, but for insurance, so the tools for comparison is really my favorite thing. It's just really easy to understand. The full marketplace of insurance. With Policy Genius, you can find life insurance policies that start at just $292 per year for $1 million of coverage.

414.633 - 434.556 Lex Fridman

Head to policygenius.com slash Lex or click the link in the description to get your free life insurance quotes and see how much you can save. That's policygenius.com slash Lex. This episode is also brought to you by Shopify, a platform designed for anyone to sell anywhere with a great looking online store.

435.277 - 455.613 Lex Fridman

I'm not name dropping here, but I recently went on a hike with the CEO of Shopify, Toby, he's brilliant. I've been a fan of his for a long time, long before Shopify was a sponsor. I don't even know if he knows that Shopify sponsors this podcast. Now, just to clarify, it really doesn't matter.

456.113 - 476.954 Lex Fridman

Nobody in this world can put pressure on me to have a sponsor or not to have a sponsor or for a sponsor to put pressure on me what I can and can't say. I, when I wake up in the morning, feel completely free to say what I want to say and to think what I want to think. I've been very fortunate in that way in many dimensions in my life.

477.655 - 494.933 Lex Fridman

And I also have always lived a frugal life and a life of discipline, which is where the freedom of speech and the freedom of thought truly comes from. So I don't need anybody. I don't need a boss. I don't need money. I'm free to exist in this world in the way I see is right.

495.033 - 514.535 Lex Fridman

Now, on top of that, of course, I'm surrounded by incredible people, many of whom I disagree with and have arguments, so I'm influenced by those conversations and those arguments and I'm always learning, always challenging myself, always humbling myself. I have kind of intellectual humility. I kind of suspect I'm kind of an idiot.

516.397 - 543.949 Lex Fridman

I start my approach to the world of ideas from that place, assuming I'm an idiot and everybody has a lesson to teach me. Anyway, not sure why I got off that tangent, but the hike was beautiful. Nature, friends, is beautiful. Anyway, I have a Shopify store, lexfridman.com slash store. It's very minimal, which is how I like, I think, most things. If you want to set up a store, it's super easy.

543.969 - 564.727 Lex Fridman

It takes a few minutes. Even I figured out how to do it. Sign up for a $1 per month trial period at shopify.com slash lex. That's all lowercase. Go to shopify.com slash lex to take your business to the next level today. This episode is also brought to you by Eight Sleep and its Pod 3 Cover. The source of my escape.

565.167 - 596.542 Lex Fridman

The door, when opened, allows me to travel away from the troubles of the world into this ethereal universe of calmness. A cold bed surface with a warm blanket, a perfect 20 minute nap, and it doesn't matter how dark the place my mind is in, a nap will pull me out, and I see the beauty of the world again. Technologically speaking, Eight Sleep is just really cool. You can control temperature with an app.

596.942 - 621.542 Lex Fridman

It's become such an integral part of my life that I've begun to take it for granted. Typical human. So the app controls the temperature. I set it, currently I'm setting it to a negative five. And it's just super nice, cool surface. It's something I really look forward to, especially when I'm traveling. I don't have one of those. It really makes me feel like home.

622.623 - 656.563 Lex Fridman

Check it out and get special savings when you go to eightsleep.com slash Lex. This is the Lex Fridman Podcast. To support it, please check out our sponsors in the description. And now, dear friends, here's Edward Gibson. When did you first become fascinated with human language?

657.418 - 674.573 Edward Gibson

As a kid in school, when we had to structure sentences in English grammar, I found that process interesting. I found it confusing as to what it was I was told to do. I didn't understand what the theory was behind it, but I found it very interesting.

674.713 - 678.996 Lex Fridman

So when you look at grammar, you're almost thinking about it like a puzzle, almost like a mathematical puzzle?

679.136 - 697.435 Edward Gibson

Yeah, I think that's right. I didn't know I was going to work on this at all at that point. I was really just... I was kind of a math geek person, computer scientist. I really liked computer science. And then I found... Language is a neat puzzle to work on from an engineering perspective, actually.

698.015 - 721.499 Edward Gibson

I sort of accidentally decided after I finished my undergraduate degree, which was computer science and math in Canada at Queen's University, I decided to go to grad school. That's what I always thought I would do. And I went to Cambridge, where they had a master's program in computational linguistics. And I hadn't taken a single language class before.

721.539 - 735.69 Edward Gibson

All I'd taken was CS, computer science, math classes, pretty much, mostly, as an undergrad. And I just thought, oh, this was an interesting thing to do for a year, because it was a single-year program. And then I end up spending my whole life doing it.

735.79 - 753.269 Lex Fridman

So fundamentally, your journey through life was one of a mathematician and a computer scientist, and then you kind of discovered the puzzle, the problem of language, and approached it from that angle. To try to understand it from that angle, almost like a mathematician or maybe even an engineer.

753.609 - 777.568 Edward Gibson

As an engineer, I'd say, I mean, to be frank, I had taken an AI class, I guess it was 83 or 84, 85, somewhere 84 in there a long time ago. And there was a natural language section in there. And it didn't impress me. I thought there must be more interesting things we can do. It didn't seem very, it seemed just a bunch of... to me. It didn't seem like a real theory of things in any way.

777.628 - 783.612 Edward Gibson

And so I just thought this seemed like an interesting area where there wasn't enough good work.

783.772 - 807.51 Lex Fridman

Did you ever come across the philosophy angle of logic? So if you think about the 80s with AI, the expert systems where you try to kind of maybe sidestep the poetry of language and some of the syntax and the grammar and all that kind of stuff and go to the underlying meaning that language is trying to communicate and try to somehow compress that in a computer-representable way.

808.151 - 809.912 Lex Fridman

Do you ever come across that in your studies?

810.052 - 831.18 Edward Gibson

I mean, I probably did, but I wasn't as interested in it. I was trying to do the easier problems first, the ones I could, thought maybe were handleable, which seems like the syntax is easier, which is just the forms as opposed to the meaning. When you're starting to talk about the meaning, that's a very hard problem, and it still is a really, really hard problem. But the forms is easier.

831.261 - 839.026 Edward Gibson

And so I thought at least figuring out the forms of human language, which sounds really hard, but is actually maybe more tractable.

839.226 - 853.255 Lex Fridman

So it's interesting. You think there is a big divide. There's a gap. There's a distance between form and meaning. Because that's a question you discuss a lot with LLMs, because they're damn good at form.

853.375 - 858.599 Edward Gibson

Yeah. I think that's what they're good at, is form. Exactly. And that's why they're good, because they can do form. Meaning's hard.

859.761 - 884.846 Lex Fridman

Do you think there's, oh, wow. I mean, it's an open question, right? How close form and meaning are. We'll discuss it, but to me, studying form, maybe it's a romantic notion, gives you, form is like the shadow of the bigger meaning thing underlying language. Language is how we communicate ideas. We communicate with each other using language.

885.406 - 898.349 Lex Fridman

So in understanding the structure of that communication, I think you start to understand the structure of thought and the structure of meaning behind those thoughts and communication to me. But to you, big gap.

899.935 - 907.142 Lex Fridman

What do you find most beautiful about human language? Maybe the form of human language, the expression of human language.

907.802 - 931.122 Edward Gibson

What I find beautiful about human language is some of the generalizations that happen across the human languages, within and across a language. So let me give you an example of something which I find kind of remarkable. That is if a language, if it has... a word order such that the verbs tend to come before their objects. And so that's like English does that.

931.202 - 953.717 Edward Gibson

So we have the first, the subject comes first in a simple sentence. So I say, you know, the dog chased the cat or Mary kicked the ball. So the subject's first. And then after the subject, there's the verb. And then we have objects. All these things come after in English. So it's generally a verb. And most of the stuff that we want to say comes after the subject. It's the objects.

953.737 - 976.393 Edward Gibson

There's a lot of things we want to say to come after. And there's a lot of languages like that. About 40% of the languages of the world look like that. They're subject-verb-object languages. And then these languages tend to have prepositions, these little markers on the nouns that connect words. Nouns to other nouns or nouns to verbs.

977.174 - 1000.409 Edward Gibson

So a preposition like in or on or of or about, I say I talk about something. The something is the object of that preposition. These little markers come, just like verbs, they come before their nouns. So now we look at other languages like Japanese or Hindi. These are so-called verb final languages. Those...

1001.289 - 1022.283 Edward Gibson

are about maybe a little more than 40%, maybe 45% of the world's languages or more, I mean, 50% of the world's languages are verb final. Those tend to be postpositions. Those markers, they have the same kinds of markers as we do in English, but they put them after. So, sorry, they put them first, the markers come first.

1022.303 - 1042.097 Edward Gibson

So you say, instead of, you know, talk about a book, you say a book about, the opposite order there in Japanese or in Hindi, you do the opposite. And the talk comes at the end. So the verb will come at the end as well. So instead of Mary kicked the ball, it's Mary ball kicked.

1042.797 - 1061.12 Edward Gibson

And then if it says Mary kicked the ball to John, it's John to, the to, the marker there, the preposition, it's a postposition in these languages. And so the interesting thing, a fascinating thing to me is that within a language that this order aligns. It's harmonic.

1062.18 - 1086.692 Edward Gibson

And so if it's one or the other, it's either verb initial or verb final, but then you'll have prepositions or postpositions. And that's across the languages that we can look at. We've got around 1,000 languages. There's around 7,000 languages on the earth right now. But we have information about, say, word order on around 1,000 of those, a pretty decent amount of information.

1086.932 - 1101.223 Edward Gibson

And for those 1,000 which we know about, about 95% fit that pattern. So they will have either verb, it's about half and half, half are verb initial, like English, and half are verb final, like Japanese.

1101.323 - 1109.848 Lex Fridman

So just to clarify, verb initial is subject, verb, object. That's correct. Verb final is still subject, object, verb.

1109.888 - 1111.869 Edward Gibson

That's correct. Yeah, the subject is generally first.

1111.969 - 1122.293 Lex Fridman

That's so fascinating. I ate an apple or I apple ate. Yes. Okay, and it's fascinating that there's a pretty even division in the world amongst those, 40, 45%.

1122.573 - 1148.228 Edward Gibson

Yeah, it's pretty even. And those two are the most common by far. Those two word orders, the subject tends to be first. There's so many interesting things, but the thing I find so fascinating is there are these generalizations within and across a language. And there's actually a simple explanation, I think, for a lot of that. And that is you're trying to minimize dependencies between words.

1148.409 - 1166.761 Edward Gibson

That's basically the story, I think, behind a lot of why word order looks the way it is, is we're always connecting. What is the thing I'm telling you? I'm talking to you in sentences. You're talking to me in sentences. These are sequences of words which are connected, and the connections are dependencies between the words.

1167.681 - 1189.932 Edward Gibson

And it turns out that what we're trying to do in a language is actually minimize those dependency links. It's easier for me to say things if the words that are connecting for their meaning are close together. It's easier for you in understanding if that's also true. If they're far away, it's hard to produce that, and it's hard for you to understand. And the languages of the world,

1190.572 - 1215.172 Edward Gibson

within a language and across languages fit that generalization. It turns out that having verbs initial and then having prepositions ends up making dependencies shorter. And having verbs final and having postpositions ends up making dependencies shorter than if you cross them. If you cross them, it's possible. You can do it. You mean within a language? Within a language, you can do it.

1215.292 - 1239.011 Edward Gibson

It just ends up with longer dependencies than if you didn't. So languages tend to go that way. They call it harmonic. So it was observed a long time ago, without the explanation, by a guy called Joseph Greenberg, who's a famous typologist from Stanford. He observed a lot of generalizations about how word order works, and these are some of the harmonic generalizations that he observed.
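
The point about harmonic orders giving shorter dependencies can be checked on a toy phrase. What follows is only a rough sketch, not something worked through in the conversation: the function name `total_dependency_length`, the `head_of` mapping, and the three-word phrase are all made up for illustration, and it assumes the adposition is the head of its noun (the noun is "the object of that preposition"). Other dependency conventions attach the adposition to the noun instead, which would change the numbers.

```python
def total_dependency_length(order, head_of):
    """Sum of linear distances between each dependent word and its head."""
    position = {word: index for index, word in enumerate(order)}
    return sum(abs(position[dep] - position[head]) for dep, head in head_of.items())


# Toy phrase "talk about (a) book" with two dependency links.
# Assumption: the adposition depends on the verb, and the noun depends on the adposition.
head_of = {"about": "talk", "book": "about"}

orders = {
    "verb-initial + preposition (English-like)": ["talk", "about", "book"],
    "verb-final + postposition (Japanese-like)": ["book", "about", "talk"],
    "verb-initial + postposition (crossed)": ["talk", "book", "about"],
    "verb-final + preposition (crossed)": ["about", "book", "talk"],
}

lengths = {name: total_dependency_length(order, head_of) for name, order in orders.items()}

for name, length in lengths.items():
    print(f"{name}: total dependency length = {length}")
```

Under these assumptions the two harmonic orders each total 2, while the two crossed orders total 3, because the noun ends up sitting between the verb and the adposition.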

1239.71 - 1249.581 Lex Fridman

Harmonic generalizations about word order. There's so many things I want to ask you. Okay, good. Let me just, sometimes basics. You mentioned dependencies a few times.

1249.601 - 1249.922 Edward Gibson

Yeah, yeah.

1250.142 - 1251.423 Lex Fridman

What do you mean by dependencies?

1251.904 - 1273.635 Edward Gibson

Well, what I mean is in... In language, there's kind of three structures to, three components to the structure of language. One is the sounds. So cat is k, a, and t in English. I'm not talking about that part. I'm talking, then there's two meaning parts. And those are the words. And you were talking about meaning earlier. So words have a form and they have a meaning associated with them.

1273.995 - 1296.723 Edward Gibson

And so cat is a full form in English and it has a meaning associated with whatever a cat is. And then the combinations of words, that's what I'll call grammar or syntax. And that's like when I have a combination like the cat or two cats, okay? So where I take two different words there and put them together and I get a compositional meaning from putting those two different words together.

1297.503 - 1319.108 Edward Gibson

And so that's the syntax. And in any sentence or utterance, whatever I'm talking to you, you're talking to me, we have a bunch of words and we're putting together in a sequence. It turns out they are... connected so that every word is connected to just one other word in that sentence. And so you end up with what's called technically a tree.

1319.148 - 1334.061 Edward Gibson

It's a tree structure where there's a root of that utterance of that sentence. And then there's a bunch of dependents, like branches from that root that go down to the words. The words are the leaves in this metaphor for a tree.

1334.54 - 1336.902 Lex Fridman

So a tree is also sort of a mathematical construct.

1336.923 - 1339.345 Edward Gibson

Yeah, yeah. It's a graph theoretical thing. It's a graph theory thing.

1339.465 - 1347.353 Lex Fridman

Yeah, yeah. So it's fascinating that you can break down a sentence into a tree, and then every word is hanging on to another. It's depending on it.

1347.373 - 1353.119 Edward Gibson

That's right. And everyone agrees on that. So all linguists will agree with that. Oh, so this is not a controversial thing? That is not controversial.

1353.139 - 1358.524 Lex Fridman

There's nobody sitting here listening mad at you. I do not think so. I don't think so. Okay. There's no linguist sitting there mad at this.

1358.564 - 1366.386 Edward Gibson

No, I think in every language, I think everyone agrees that all sentences are trees at some level. Can I pause on that? Sure.

1366.406 - 1377.61 Lex Fridman

Because to me, just as a layman, it's surprising that you can break down sentences in all languages into a tree.

1377.77 - 1385.193 Edward Gibson

I think so. I've never heard of anyone disagreeing with that. That's weird. The details of the trees are what people disagree about.

1385.871 - 1392.757 Lex Fridman

Well, okay, so what's at the root of a tree? How do you construct? How hard is it? What is the process of constructing a tree from a sentence?

1393.998 - 1410.163 Edward Gibson

Well, this is where, you know, depending on what your... There's different theoretical notions. I'm going to say the simplest thing, dependency grammar. It's like a bunch of people invented this. Tesnière was the first, French guy back in... I mean, the paper was published in 1959, but he was working on it in the '30s and stuff.

1410.503 - 1435.138 Edward Gibson

And it goes back to, you know, the philologist Pāṇini was doing this in ancient India, okay? And so, you know, doing something like this. The simplest thing we can think of is that there's just connections between the words to make the utterance. And so let's just say I have like two dogs entered a room. Okay, here's a sentence. And so we're connecting two and dogs together.

1435.478 - 1455.632 Edward Gibson

That's like, there's some dependency between those words to make some bigger meaning. And then we're connecting dogs now to entered, right? And we connect a room somehow to entered. And so I'm going to connect a to room and then room back to entered. That's the tree. The root is entered. The thing is like an entering event. That's what we're saying here.

1456.012 - 1478.203 Edward Gibson

And the subject, which is whatever that dog is, is two dogs, it was. And the connection goes back to dogs, which goes back to, then that goes back to two. I'm just, that's my tree. It starts at entered, goes to dogs, down to two. And on the other side, after the verb, the object, it goes to room, and then that goes back to the determiner or article, whatever you want to call that word.
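
The tree just described for "two dogs entered a room" can be written down as a small data structure. This is an illustrative sketch only: the `head_of` mapping and the helper names `find_root`, `children_of`, and `render` are assumptions made for this example and do not come from the conversation or from any parsing library. It encodes each word's single connection to its head and prints the tree starting from the root.

```python
sentence = ["two", "dogs", "entered", "a", "room"]

# Each word connects to exactly one other word (its head); the root has no head.
head_of = {
    "two": "dogs",
    "dogs": "entered",
    "entered": None,  # root: the entering event
    "a": "room",
    "room": "entered",
}


def find_root(head_of):
    """Return the single word that has no head."""
    roots = [word for word, head in head_of.items() if head is None]
    if len(roots) != 1:
        raise ValueError("a dependency tree needs exactly one root")
    return roots[0]


def children_of(head_of, sentence):
    """Map each word to its dependents, kept in sentence order."""
    children = {word: [] for word in sentence}
    for word in sentence:
        head = head_of[word]
        if head is not None:
            children[head].append(word)
    return children


def render(word, children, depth=0):
    """Return indented lines showing the tree below `word`."""
    lines = ["  " * depth + word]
    for child in children[word]:
        lines.extend(render(child, children, depth + 1))
    return lines


root = find_root(head_of)
children = children_of(head_of, sentence)
tree_lines = render(root, children)
print("\n".join(tree_lines))
```

The printed tree starts at "entered", goes to "dogs" and down to "two" on one side, and to "room" and down to "a" on the other, matching the spoken description.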

1478.223 - 1497.657 Edward Gibson

So there's a bunch of categories of words here we're noticing. So there are verbs. Those are these things that typically mark... They refer to events and states in the world. And there are nouns, which typically refer to people, places, and things is what people say. But they can refer to other more... They can refer to events themselves as well. They're marked by...

1500.899 - 1511.862 Edward Gibson

The category, the part of speech of a word is how it gets used in language. That's how you decide what the category of a word is. Not by the meaning, but how it gets used. How it's used.

1512.222 - 1516.403 Lex Fridman

What's usually the root? Is it going to be the verb that defines the event?

1516.643 - 1517.904 Edward Gibson

Usually. Yes, yes.

1518.364 - 1518.784 Lex Fridman

Okay.

1518.864 - 1522.385 Edward Gibson

I mean, if I don't say a verb, then there won't be a verb, and so it'll be something else.

1522.445 - 1532.909 Lex Fridman

What if you're messing... Are we talking about language that's like correct language? What if you're doing poetry and messing with stuff? Then rules go out the window, right?

1533.15 - 1554.962 Edward Gibson

No. You're constrained by whatever language you're dealing with. Probably you have other constraints in poetry. Usually in poetry, there's multiple constraints that you want to... You want to usually convey multiple meanings is the idea, and maybe you have a rhythm or a rhyming structure as well. But you usually are constrained by the rules of your language for the most part. So you don't...

1555.682 - 1582.08 Edward Gibson

violate those too much. You can violate them somewhat, but not too much. So it has to be recognizable as your language. Like in English, I can't say, dogs two entered room a. I mean, I meant, you know, two dogs entered a room. And I can't mess with the order of the articles and the nouns. You just can't do that. In some languages, you can mess around with the order of words much more.

1582.1 - 1597.694 Edward Gibson

I mean, you speak Russian. Mm-hmm. Russian has a much freer word order than English. And so, in fact, you can move around words in, you know, I told you that English has the subject, verb, object word order. So does Russian. But Russian is much freer than English. And so you can actually mess around with the word order.

1597.754 - 1604.761 Edward Gibson

So probably Russian poetry is going to be quite different from English poetry because the word order is much less constrained.

1604.921 - 1625.651 Lex Fridman

Yeah, there's a much more extensive culture of poetry throughout the history of the last hundred years in Russia. And I always wondered why that is. But it seems that there's more flexibility in the way the language is used. You're morphing the language easier by altering the words, altering the order of the words, messing with it.

1625.984 - 1645.692 Edward Gibson

Well, you can just mess with different things in each language. And so in Russian, you have case markers, which are these endings on the nouns, which tell you how it connects, each noun connects to the verb, right? We don't have that in English. And so when I say, Mary kissed John, I don't know who the agent or the patient is, except by the order of the words, right?

1646.072 - 1664.585 Edward Gibson

In Russian, you actually have a marker on the end. If you're using a Russian name and each of those names, you'll also say, is it, you know, agent, it'll be the, you know, nominative, which is marking the subject, or an accusative will mark the object. And you could put them in the reverse order. You could put accusative first. You could put subject, you could put...

1665.566 - 1680.414 Edward Gibson

the patient first, and then the verb, and then the subject. And that would be a perfectly good Russian sentence. And it would still mean, I could say John kissed Mary, meaning Mary kissed John, as long as I use the case markers in the right way. You can't do that in English.
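
The contrast described here, roles read off case markers versus roles read off word order, can be sketched in a few lines. This is a toy illustration only: the `NOM`/`ACC`/`V` tags, the function names, and the English stand-in tokens are assumptions made for the example, not real Russian morphology.

```python
# Toy illustration: recovering "who did what to whom" from case tags vs. word order.
# The tags NOM/ACC/V and the tokens are hypothetical stand-ins, not real Russian forms.

def roles_by_case(tokens):
    """tokens: list of (word, tag) pairs with tag in {"NOM", "ACC", "V"}; order is free."""
    roles = {}
    for word, tag in tokens:
        if tag == "NOM":      # nominative marks the subject (here: the agent)
            roles["agent"] = word
        elif tag == "ACC":    # accusative marks the object (here: the patient)
            roles["patient"] = word
        elif tag == "V":
            roles["verb"] = word
    return roles


def roles_by_position(words):
    """English-style reading: assume a fixed subject-verb-object order."""
    subject, verb, obj = words
    return {"agent": subject, "verb": verb, "patient": obj}


svo = [("Mary", "NOM"), ("kissed", "V"), ("John", "ACC")]
ovs = [("John", "ACC"), ("kissed", "V"), ("Mary", "NOM")]

# With case marking, both orders yield the same roles.
print(roles_by_case(svo) == roles_by_case(ovs))
# Without case marking, swapping the names swaps the meaning.
print(roles_by_position(["John", "kissed", "Mary"]))
```

In the case-marked version the order of the list never enters the computation, which is the sense in which word order is free to do other work.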

1680.814 - 1689.019 Lex Fridman

And so... I love the terminology of agent and patient and the other ones you used. Those are sort of linguistic terms, correct? Correct.

1689.079 - 1707.854 Edward Gibson

Those are for kind of meaning. Those are meaning. And subject and object are generally used for position. So subject is just like the thing that comes before the verb, and the object is the one that comes after the verb. The agent is kind of like the thing doing it. That's kind of what that means, right? The subject is often the person doing the action, right? The thing.

1708.454 - 1720.803 Lex Fridman

Okay, this is fascinating. So how hard is it to form a tree in general? Is there... Is there a procedure to it? Like, if you look at different languages, is it supposed to be a very natural, like, is it automatable, or is there some human genius involved?

1721.143 - 1735.47 Edward Gibson

I think it's pretty automatable at this point. People can figure out what the words are. They can figure out the morphemes, which are the, technically, morphemes are the minimal meaning units within a language, okay? And so, when you say eats or drinks, it actually has two morphemes in English.

1735.531 - 1752.654 Edward Gibson

There's the root, which is the verb, and then there's some ending on it which tells you, you know, that's the third person singular. Can you say what morphemes are? Morphemes are just the minimal meaning units within a language. And a word is just kind of the things we put spaces between in English. And they have a little bit more. They have the morphology as well.

1752.674 - 1756.754 Edward Gibson

They have the endings, this inflectional morphology on the endings, on the roots.

1757.094 - 1759.855 Lex Fridman

It modifies something about the word that adds additional meaning.

1759.935 - 1778.774 Edward Gibson

Yeah, yeah, yeah. And so we have a little bit of that in English, very little. You have much more in Russian, for instance. But we have a little bit in English. And so we have a little on the nouns. You can say it's either singular or plural. And you can say the same thing for verbs. Like simple past tense, for example. So, you know, notice in English we say drinks.

1779.575 - 1797.227 Edward Gibson

you know, he drinks, but everyone else is, I drink, you drink, we drink. It's unmarked in a way. And then, but in the past tense, it's just drank. For everyone, there's no morphology at all for past tense. There is morphology, it's marking past tense, but it's kind of, it's an irregular now. So we don't even, you know, drink to drank, you know, it's not even a regular word.

1797.327 - 1815.595 Edward Gibson

So in most verbs, many verbs, there's an -ed we kind of add. So walk to walked, we add that to say it's the past tense. I just happened to choose an irregular because it's a high-frequency word, and the high-frequency words tend to have irregulars in English. What's an irregular? Irregular is just, there isn't a rule. So drink to drank is an irregular.
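
The regular versus irregular split can be pictured as a rule plus a table of exceptions. The following is a minimal sketch under that assumption; the `IRREGULAR_PAST` sample and the `past_tense` helper are made up for illustration, and English spelling adjustments (such as doubling a final consonant) are deliberately ignored.

```python
# A rule-plus-exceptions sketch of English past tense.
# IRREGULAR_PAST is a tiny hand-picked sample, not a complete list.
IRREGULAR_PAST = {"drink": "drank", "eat": "ate", "go": "went"}


def past_tense(verb):
    """Look the verb up among the memorized irregulars, else apply the -ed rule."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    # Regular case: add the -ed suffix (spelling changes are not handled here).
    return verb + "ed"


for verb in ["walk", "talk", "drink"]:
    print(verb, "->", past_tense(verb))
```

The table has to be memorized entry by entry, which fits the observation that irregular forms survive mainly on words frequent enough to be heard often.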

1815.855 - 1816.715 Lex Fridman

Drink, drank, okay.

1816.735 - 1819.395 Edward Gibson

As opposed to walk, walked, talk, talked.

1819.815 - 1822.176 Lex Fridman

And there's a lot of irregulars in English.

1822.196 - 1832.778 Edward Gibson

There's a lot of irregulars in English. The frequent ones, the common words, tend to be irregular. There's many, many more low-frequency words, and those tend to be, those are regular ones.

1832.958 - 1849.919 Lex Fridman

The evolution of the irregulars are fascinating, because it's essentially slang that's sticky, because you're breaking the rules, and then everybody uses it and doesn't follow the rules, and they say screw it to the rules. It's fascinating. So you said morphemes, lots of questions. So morphology is what, the study of morphemes?

1850.518 - 1876.146 Edward Gibson

Morphology is the connections between the morphemes onto the roots. So in English, we mostly have suffixes. We have endings on the words, not very much, but a little bit, as opposed to prefixes. Some words, depending on your language, can have mostly prefixes, mostly suffixes, or both. And then even languages, several languages have things called infixes, where you have some kind of a general...

1877.607 - 1883.352 Edward Gibson

form for the root, and you put stuff in the middle. You change the vowels. That's fascinating.

1883.532 - 1891.499 Lex Fridman

That is fascinating. So in general, there's, what, two morphemes per word? Usually one or two? Or three?

1891.539 - 1909.232 Edward Gibson

Well, in English, it's one or two. In English, it tends to be one or two. There can be more. In other languages, a language like Finnish, which has a very elaborate morphology, there may be 10 morphemes on the end of a root. And so there may be millions of forms of a given word.

1909.592 - 1930.001 Lex Fridman

Okay, I'll ask the same question over and over. But... how does the, just sometimes to understand things like morphemes, it's nice to just ask the question, how does these kinds of things evolve? So you have a great book studying sort of the

1933.332 - 1957.197 Lex Fridman

how the cognitive processing, how language is used for communication, so the mathematical notion of how effective language is for communication, what role that plays in the evolution of language, but just high level, like how does a language evolve where English is two morphemes or one or two morphemes per word and then Finnish has infinity per word? So how does that happen? Is it just...

1958.477 - 1982.533 Edward Gibson

That's a really good question. That's a very good question. Why do languages have more morphology versus less morphology? I don't think we know the answer to this. I think there's just a lot of good solutions to the problem of communication. I believe, as you hinted, that language is an invented system by humans for communicating their ideas.

1982.573 - 2003.757 Edward Gibson

And I think it comes down to we label the things we want to talk about. Those are the morphemes and words. Those are the things we want to talk about in the world. And we invent those things. And then we put them together in ways that are easy for us to convey, to process. But that's like a naive view. And I don't, I mean, I think it's probably right, right? It's naive and probably right.

2003.777 - 2029.191 Lex Fridman

Well, I don't know if it's naive. I think it's simple. Simple. I think naive is an indication that it's incorrect somehow. It's a trivial, too simple. I think it could very well be correct. But it's interesting how sticky. It feels like two people got together. It just feels like once you figure out certain aspects of a language, that just becomes sticky and the tribe forms around that language.

2029.532 - 2035.896 Lex Fridman

Maybe the language, maybe the tribe forms first and then the language evolves. And then you just kind of agree and you stick to whatever that is.

2036.137 - 2065.42 Edward Gibson

I mean, these are very interesting questions. We don't know really about how words, even words, get invented very much. Assuming they get invented, we don't really know how that process works and how these things evolve. What we have is... kind of a current picture of a few thousand languages, a few thousand instances. We don't have any pictures of really how these things are evolving, really.

2065.76 - 2088.983 Edward Gibson

And then the evolution is massively confused by contact, right? So as soon as one language group, one group runs into another, we are smart. Humans are smart. And they take on whatever is useful in the other group. And so any kind of contrast which you're talking about, which I find useful, I'm going to start using as well.

2089.023 - 2115.259 Edward Gibson

So I worked a little bit in specific areas of words, in number words and in color words. And in color... So we have, in English, we have around 11 words that everyone knows for colors. And many more if you happen to be interested in color for some reason or other. If you're a fashion designer or an artist or something, you may have many, many more words. But we can see millions.

2115.359 - 2133.204 Edward Gibson

Like if you have normal color vision, normal trichromatic vision, you can see millions of distinctions in color. So we don't have millions of words. The most efficient, no, the most detailed color vocabulary would have over a million terms to distinguish all the different colors that we can see, but of course we don't have that.

2133.605 - 2160.816 Edward Gibson

So it's somehow, it's been, it's kind of useful for English to have evolved in some way to, so there's 11 terms that people find useful to talk about, black, white, red, blue, green, yellow, purple, gray, pink, and I probably missed something there. Anyway, there's 11 that everyone knows. But you go to different cultures, especially the non-industrialized cultures, and there'll be many fewer.

2160.916 - 2182.653 Edward Gibson

So some cultures will have only two, believe it or not. The Dani in Papua New Guinea have only two labels that the group uses for color. Those are roughly black and white. They are very, very dark and very, very light, which are roughly black and white. And you might think, oh, they're dividing the whole color space into light and dark or something. And that's not really true.

2182.693 - 2203.43 Edward Gibson

They mostly just only label the black and the white things. They just don't talk about the colors for the other ones. And then there's other groups. I've worked with a group called the Tsimane' down in Bolivia in South America. And they have... three words that everyone knows, but there's a few others that several people, that many people know.

2203.971 - 2222.968 Edward Gibson

And so they have, it's kind of depending on how you count, between three and seven words that the group knows, okay? And again, they're black and white. Everyone knows those. And red, red is, you know, like that tends to be the third word that everyone, that cultures bring in. If there's a word, it's always red, the third one.

2223.268 - 2247.337 Edward Gibson

And then after that, it's kind of all bets are off about what they bring in. And so after that, they bring in a sort of a big blue-green group. They have one for that. And then different people have different words that they'll use for other parts of the space. And so anyway, it's probably related to what they want to talk... Not what they see, because they see the same colors as we see.

2247.757 - 2268.121 Edward Gibson

So it's not like they have a weak... a low color palette in the things they're looking at. They're looking at a lot of beautiful scenery, okay? A lot of different colored flowers and berries and things. And so there's lots of things of very bright colors, but they just don't label the color in those cases.

2268.221 - 2279.463 Edward Gibson

And the reason probably, we don't know this, but we think probably what's going on here is that what you do, why you label something is you need to talk to someone else about it. And why do I need to talk about a color?

2280.003 - 2296.113 Edward Gibson

Well, if I have two things which are identical and I want you to give me the one that's different and the only way it varies is color, then I invent a word which tells you, you know, this is the one I want. So I want the red sweater off the rack, not the green sweater, right? There's two.

2296.233 - 2316.11 Edward Gibson

And so those things will be identical because these are things we made and they're dyed and there's nothing different about them. And so in industrialized society, we have, you know, everything we've got is pretty much arbitrarily colored. But if you go to a non-industrialized group, that's not true. And so they don't... it's not only that they're not interested in color.

2316.431 - 2326.862 Edward Gibson

If you bring bright-colored things to them, they like them just like we like them. Bright colors are great. They're beautiful. But they just don't need to talk about them. They don't have—

2327.102 - 2344.435 Lex Fridman

So probably color words is a good example of how language evolves from sort of function. When you need to communicate the use of something, then you kind of invent different variations. And basically, you can imagine that the evolution of a language has to do with what the early tribes were doing.

2345.875 - 2358.238 Lex Fridman

What kind of problems are facing them, and they're quickly figuring out how to efficiently communicate the solution to those problems, whether it's aesthetic or functional, all that kind of stuff, running away from a mammoth or whatever.

2358.258 - 2369.78 Lex Fridman

I think what you're pointing to is that we don't have data on the evolution of language, because many languages were formed a long time ago, so you don't get the chatter anymore.

2370.02 - 2391.89 Edward Gibson

We have a little bit of Old English to Modern English because there was a writing system, and we can see how Old English looked. So the word order changed, for instance, in Old English to Middle English to Modern English. And so we could see things like that, but most languages don't even have a writing system. So of the 7,000, only a small subset of those have a writing system.

2392.15 - 2411.939 Edward Gibson

And even if they have a writing system, it's not a very modern writing system. And so they don't have it. So we just basically have for Mandarin, for Chinese, we have a lot of evidence for a long time and for English and not for much else. Not for German a little bit, but not for a whole lot of long-term language evolution. We don't have a lot.

2412.04 - 2414.641 Edward Gibson

We just have snapshots is what we've got of current languages.

2414.701 - 2437.902 Lex Fridman

Yeah, you get an inkling of that from the rapid communication on certain platforms, like on Reddit, there's different communities, and they'll come up with different slang, usually from my perspective, driven by a little bit of humor, or maybe mockery or whatever, just talking shit in different kinds of ways. And you could see the evolution of language there.

2438.763 - 2461.284 Lex Fridman

because I think a lot of things on the internet, you don't want to be the boring mainstream. So you like want to deviate from the proper way of talking. And so you get a lot of deviation, like rapid deviation. Then when communities collide, you get like, just like you said, humans adapt to it. And you can see it through the lens of humor.

2461.704 - 2469.495 Lex Fridman

I mean, it's very difficult to study, but you can imagine like 100 years from now, well, if there's a new language born, for example, we'll get really high resolution data.

2469.969 - 2490.769 Edward Gibson

I mean, English is changing. English changes all the time. All languages change all the time. So, you know, there's a famous result about the Queen's English. So if you look at the Queen's vowels, the Queen's English is supposed to be, you know, originally the proper way to talk was sort of defined by however the Queen talked, or the King, whoever was in charge.

2491.229 - 2514.624 Edward Gibson

And so if you look at how her vowels changed, from when she first became queen in 1952 or 53, when she was coronated, the first, I mean, that's Queen Elizabeth who died recently, of course, until, you know, 50 years later, her vowels changed, her vowels shifted a lot. And so that, you know, even in the sounds of British English, in her, the way she was talking was changing.

2515.045 - 2529.873 Edward Gibson

The vowels were changing slightly. So that's just, in the sounds, there's change. I don't know what's, you know, we're, we're, I'm interested. We're all interested in what's driving any of these changes. The word order of English changed a lot over a thousand years, right? So it used to look like German.

2530.473 - 2550.029 Edward Gibson

You know, it used to be a verb final language with case marking, and it shifted to a verb medial language. A lot of contact. So a lot of contact with French. And it became a verb medial language with no case marking. And so it became this, you know, verb initially thing. And so that's... It's evolving. It totally evolved. And so it may very well... I mean...

2550.609 - 2557.554 Edward Gibson

You know, it doesn't evolve maybe very much in 20 years is maybe what you're talking about. But over 50 and 100 years, things change a lot, I think.

2557.674 - 2559.775 Lex Fridman

We'll now have good data on it, which is great.

2559.795 - 2560.295 Edward Gibson

That's for sure.

2561.076 - 2565.559 Lex Fridman

Can you talk to what is syntax and what is grammar? So you wrote a book on syntax.

2565.879 - 2580.86 Edward Gibson

I did. You were asking me before about how do I figure out what a dependency structure is. I'd say the dependency structures aren't that hard generally. I think there's a lot of agreement of what they are for almost any sentence in most languages. I think people will agree on a lot of that.

2582.421 - 2606.678 Edward Gibson

There are other parameters in the mix such that some people think there's a more complicated grammar than just a dependency structure. And so, you know, like Noam Chomsky, he's the most famous linguist ever. And he is famous for proposing a slightly more complicated syntax. And so he invented phrase structure grammar. So he's... well-known for many, many things.

2606.739 - 2627.758 Edward Gibson

But in the 50s, in the early 60s, like the late 50s, he was basically figuring out what's called formal language theory. And he figured out sort of a framework for figuring out how complicated a certain type of language might be, so-called phrase-structured grammars of language might be. And so his idea was that maybe

2630.66 - 2651.906 Edward Gibson

We can think about the complexity of a language by how complicated the rules are. And the rules will look like this. They will have a left-hand side and they'll have a right-hand side. Something on the left-hand side will expand to the thing on the right-hand side. So say we'll start with an S, which is like the root, which is a sentence. And then we're going to expand to things.

2652.906 - 2674.603 Edward Gibson

like a noun phrase and a verb phrase is what he would say, for instance, okay? An S goes to an NP and a VP is a kind of a phrase structure rule. And then we figure out what an NP is. An NP is a determiner and a noun, for instance. And a verb phrase is something else, is a verb and another noun phrase and another NP, for instance. Those are the rules of a very simple phrase structure, okay?
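
The three rules named here (S goes to NP VP, NP goes to a determiner and a noun, VP goes to a verb and an NP) can be written down directly and expanded mechanically. A minimal sketch follows; the lexicon (`the`, `a`, `dog`, `room`, `entered`) and the `expand` helper are assumptions added so the grammar has something to generate.

```python
from itertools import product

# Left-hand side -> list of possible right-hand sides.
# The three phrase-level rules follow the description above;
# the word-level entries are a made-up toy lexicon.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"]],
    "Det": [["the"], ["a"]],
    "N": [["dog"], ["room"]],
    "V": [["entered"]],
}


def expand(symbol):
    """Return every word sequence a symbol can expand to (finite grammars only)."""
    if symbol not in RULES:          # a plain word: nothing left to expand
        return [[symbol]]
    results = []
    for right_hand_side in RULES[symbol]:
        parts = [expand(s) for s in right_hand_side]
        for combo in product(*parts):
            results.append([word for part in combo for word in part])
    return results


sentences = [" ".join(words) for words in expand("S")]
print(len(sentences))                              # 16 sentences from this toy grammar
print("the dog entered the room" in sentences)
```

The grammar also produces oddities such as "the room entered a dog": phrase structure rules of this kind constrain word order, not plausibility.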

2675.263 - 2688.012 Edward Gibson

And so he proposed phrase structure grammar, right, as a way to sort of cover human languages. And then he actually figured out that, well, depending on the formalization of those grammars, you might get more complicated or less complicated languages.

2688.032 - 2708.662 Edward Gibson

And so he said, well, these are things called, you know, context-free languages, that rule that he thought, you know, human languages tend to be what he calls context-free languages. But there are simpler languages, which are so-called regular languages, and they have a more constrained form to the rules of the phrase structure of these particular rules.
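
One standard way to state the "more constrained form" for regular languages is the right-linear grammar: every rule rewrites a nonterminal as a single word, optionally followed by one nonterminal. The sketch below checks rule shape only, and its names (`is_right_linear`, the two example grammars) are assumptions for illustration.

```python
def is_right_linear(rules, nonterminals):
    """True if every rule has the shape A -> word or A -> word B."""
    for left, right in rules:
        if left not in nonterminals:
            return False
        if len(right) == 1:
            ok = right[0] not in nonterminals
        elif len(right) == 2:
            ok = right[0] not in nonterminals and right[1] in nonterminals
        else:
            ok = False
        if not ok:
            return False
    return True


NONTERMINALS = {"S", "NP", "VP", "A", "B"}

# Context-free style: a nonterminal may expand to several nonterminals.
phrase_structure = [("S", ["NP", "VP"]), ("NP", ["the", "dog"]), ("VP", ["barked"])]

# Right-linear style: at most one nonterminal, and only at the right edge.
right_linear = [("A", ["the", "B"]), ("B", ["dog", "B"]), ("B", ["barked"])]

print(is_right_linear(phrase_structure, NONTERMINALS))  # False
print(is_right_linear(right_linear, NONTERMINALS))      # True
```

The check is about the form of the rules, not the language itself: a grammar that fails it may still happen to generate a regular language.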

2708.682 - 2722.526 Edward Gibson

So he basically discovered and kind of invented ways to describe the language. And those are phrase structure, a human language. And he was mostly interested in English initially in his work in the 50s.

2723.086 - 2729.209 Lex Fridman

So quick questions around all this. So formal language theory is the big field of just studying language formally.

2729.509 - 2753.902 Edward Gibson

Yes, and it doesn't have to be human language there. We can have computer languages, any kind of system which is generating some set of expressions in a language. And those could be like the... The statements in a computer language, for example. It could be that or it could be human language. So technically you can study programming languages. Yes, and have been.

2754.002 - 2760.846 Edward Gibson

I mean, heavily studied using this formalism. There's a big field of programming languages within the formal language. Okay.

2761.186 - 2768.63 Lex Fridman

And then phrase structure grammar is this idea that you can break down language into this S, NP, VP type of thing?

2768.73 - 2791.541 Edward Gibson

It's a particular... formalism for describing language. And Chomsky was the first one. He's the one who figured that stuff out back in the 50s. And that's equivalent, actually. The context-free grammar is actually kind of equivalent in the sense that it generates the same sentences as a dependency grammar would. The dependency grammar is a little simpler in some way.

2791.561 - 2807.435 Edward Gibson

You just have a root and it goes, like, we don't have any of these, the rules are implicit, I guess. And we just have connections between words. The phrase structure grammar is kind of a different way to think about the dependency grammar. It's slightly more complicated, but it's kind of the same in some ways.

2807.595 - 2833.211 Lex Fridman

So to clarify, dependency grammar is the framework under which you see language and you make the case that this is a good way to describe language. And Noam Chomsky is watching this, he's very upset right now, so let's, just kidding, but what's the difference between, where's the place of disagreement? Between phrase structure grammar and dependency grammar.

2833.251 - 2855.395 Edward Gibson

They're very close. So phrase structure grammar and dependency grammar aren't that far apart. I like dependency grammar because it's more perspicuous, it's more transparent about representing the connections between the words. It's just a little harder to see in phrase structure grammar. The place where Chomsky sort of devolved or went off from this is he also thought there was...

2856.395 - 2873.814 Edward Gibson

Um, something called movement. Okay? And so that's where we disagree. Okay, that's the place where I would say we disagree. And, I mean, well, maybe we'll get into that later, but the idea is, if you want to... Do you want me to explain that? No, I would love, can you explain movement? Movement, okay. So you're saying so many interesting things. Yeah, yeah. Okay, so here's the movement is...

2874.335 - 2895.605 Edward Gibson

Chomsky basically sees English and he says, okay, I said, you know, we had that sentence earlier, like it was like two dogs entered the room. Let's change it a little bit, say two dogs will enter the room. And he notices that, hey, English, if I want to make a question, a yes, no question from that same sentence, I say, instead of two dogs will enter the room, I say, will two dogs enter the room?

2896.085 - 2916.041 Edward Gibson

Okay. There's a different way to say the same idea. And it's like, well, the auxiliary verb, that will thing, it's at the front as opposed to in the middle. Okay. And so, and he looked, you know, if you look at English, you see that that's true for all those modal verbs and for other kinds of auxiliary verbs in English. You always do that. You always put an auxiliary verb at the front.

2916.841 - 2939.269 Edward Gibson

And when he saw that, so if I say, I can win this bet, can I win this bet, right? So I move a can to the front. So actually, that's a theory. I just gave you a theory there. He talks about it as movement. That word in the declarative is the root, is the sort of default way to think about the sentence, and you move the auxiliary verb to the front. That's a movement theory, okay?

2939.329 - 2956.475 Edward Gibson

And he just thought that was just so obvious that it must be true. That there's nothing more to say about that, that this is how auxiliary verbs work in English. There's a movement rule such that to get from the declarative to the interrogative, you're moving the auxiliary to the front.

2956.515 - 2976.349 Edward Gibson

And it's a little more complicated as soon as you go to simple present and simple past, because if I say, you know, John slept, you have to say, did John sleep, not slept John, right? And so you have to somehow get an auxiliary verb. And I guess underlyingly, it's like slept is, it's a little more complicated than that, but that's his idea. There's a movement, okay?

2977.07 - 2997.258 Edward Gibson

And so a different way to think about that, that isn't, I mean, then he ended up showing later, right? So he proposed this theory of grammar, which has movement. There's other places where he thought there's movement, not just auxiliary verbs, but things like the passive in English and things like questions, WH questions, a bunch of places where he thought there's also movement going on.

2997.778 - 3013.465 Edward Gibson

And in each one of those, he thinks there's words, well, phrases and words are moving around from one structure to another, which he called deep structure to surface structure. I mean, there's like two different structures in his theory, okay? There's a different way to think about this, which is there's no movement at all.

3014.006 - 3032.924 Edward Gibson

There's a lexical copying rule such that the word will or the word can, these auxiliary verbs, they just have two forms. And one of them is the declarative and one of them is the interrogative. And you basically have the declarative one and, oh, I form the interrogative or I can form one from the other. It doesn't matter which direction you go.

3033.445 - 3052.358 Edward Gibson

And I just have a new entry, which has the same meaning, which has a slightly different argument structure. Argument structure is just a fancy word for the ordering of the words. And so if I say, you know, it was the dogs, two dogs can or will enter the room, there's two forms of will.

3052.759 - 3069.79 Edward Gibson

One is will declarative, and then, okay, I've got my subject to the left, it comes before me, and the verb comes after me in that one. And then the will interrogative, it's like, oh, I go first. Interrogative, will is first, and then I have the subject immediately after, and then the verb after that.

3070.07 - 3077.255 Edward Gibson

And so you can just generate from one of those words another word with a slightly different argument structure, with different ordering.
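
The lexical copying idea, two entries for "will" that share a meaning and differ only in ordering, can be sketched as a small lexicon of ordering templates. This is a loose illustration, not an implementation of any published formalism; the entry names, the `SUBJ`/`VERB` slot labels, and the `realize` helper are all assumptions.

```python
# Each entry lists where the auxiliary's subject and verb go relative to it.
# Entry names and slot labels are hypothetical, chosen for this sketch.
LEXICON = {
    "will_declarative": ["SUBJ", "will", "VERB"],    # subject before, verb after
    "will_interrogative": ["will", "SUBJ", "VERB"],  # auxiliary first
}


def realize(entry, subj, verb):
    """Fill an entry's ordering template with a subject and a verb phrase."""
    slots = {"SUBJ": subj, "VERB": verb}
    return " ".join(slots.get(item, item) for item in LEXICON[entry])


print(realize("will_declarative", "two dogs", "enter the room"))
print(realize("will_interrogative", "two dogs", "enter the room"))
```

Nothing moves in this picture: each entry is used as stored, so a word could have one of the two entries without the other.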

3077.635 - 3092.78 Lex Fridman

And these are just lexical copies. They're not necessarily moving from one to another. There's no movement. There's a romantic notion that you have like one main way to use a word and then you could move it around. Right, right. Which is essentially what movement is implying.

3092.8 - 3108.666 Edward Gibson

Yeah, but that's the lexical copying is similar. So then we do lexical copying for that same idea that maybe the declarative is the source and then we can copy it. And so an advantage is, well, there's multiple advantages of the lexical copying story. It's not my story.

3108.726 - 3128.578 Edward Gibson

This is like Ivan Sag, linguists, a bunch of linguists have been proposing these stories as well, you know, in tandem with the movement story. Okay, you know, Ivan Sag died a while ago, but he was one of the proponents of the non-movement of the lexical copying story. And so that is that a great advantage is, well...

3129.378 - 3153.168 Edward Gibson

Chomsky, really famously in 1971, showed that the movement story leads to learnability problems. It leads to problems for how language is learned. It's really, really hard to figure out what the underlying structure of a language is if you have both phrase structure and movement. It's really hard to figure out what came from what. There's a lot of possibilities there.

3153.329 - 3156.87 Edward Gibson

If you don't have that problem, the learning problem gets a lot easier.

3156.97 - 3158.291 Lex Fridman

Just say there's lexical copies.

3158.311 - 3159.592 Edward Gibson

Yeah. Yeah, yeah.

3159.752 - 3162.936 Lex Fridman

When we say the learning problem, do you mean humans learning a new language?

3162.956 - 3186.939 Edward Gibson

Yeah, just learning English. So a baby is lying around listening to the crib, listening to me talk, and how are they learning English? Or maybe it's a two-year-old who's learning interrogatives and stuff. How are they doing that? Are they doing it vocally? So Chomsky said it's impossible to figure it out, actually. He said it's actually impossible, not hard, but impossible.

3187.5 - 3209.49 Edward Gibson

And therefore, that's where universal grammar comes from, is that it has to be built in. And so what they're learning is that there's some built-in movement that's built in in his story. It's absolutely part of your language module. And then you are... you're just setting parameters. You're said, depending on English, it's just sort of a variant of the universal grammar.

3209.65 - 3233.184 Edward Gibson

And you're figuring out, oh, which orders does English do these things? The non-movement story doesn't have this. It's like much more bottom-up. You're learning rules. You're learning rules one by one. And, oh, this word is connected to that word. Another advantage, it's learnable. Another advantage of it is that it predicts that not all auxiliaries might move.

3233.444 - 3259.432 Edward Gibson

It might depend on the word, depending on whether you... And that turns out to be true. So there's words that don't really work as auxiliary. They work in declarative and not in interrogative. So I can say, I'll give you the opposite first. I can say, aren't I invited to the party? And that's an interrogative form. But it's not from, I aren't invited to the party. There is no I aren't, right?

3259.472 - 3282.16 Edward Gibson

So that's interrogative only. And then we also have forms like ought. I ought to do this. And I guess some old British people can say— Ought I. Exactly. It doesn't sound right, does it? For me, it sounds ridiculous. I don't even think ought is great, but I mean, I totally recognize I ought to do it. It's not too bad, actually. I can say I ought to do this. That sounds pretty good.

3282.2 - 3283.781 Lex Fridman

If I'm trying to sound sophisticated, maybe.

3284.321 - 3306.552 Edward Gibson

I don't know. It just sounds completely out to me. Ought I. Anyway, so there are variants here. And a lot of these words just work in one versus the other. And that's fine under the lexical copying story. It's like, well, you just learn the usage. Whatever the usage is, is what you do with this word. But it's a little bit harder in the movement story.

3306.612 - 3319.661 Edward Gibson

The movement story... That's an advantage, I think, of lexical copying. In all these different places, there's all these usage... which make the movement story a little bit harder to work.

3319.801 - 3330.05 Lex Fridman

So one of the main divisions here is the movement story versus the lexical copy story. That has to do with the auxiliary words and so on. But if you rewind to the phrase structure grammar...

3330.07 - 3355.057 Edward Gibson

Yeah, versus dependency grammar. Those are equivalent in some sense in that for any dependency grammar, I can generate a phrase structure grammar which generates exactly the same sentences. I just like the dependency grammar formalism because it makes something really salient, which is the lengths of dependencies between words, which isn't so obvious in the phrase structure.

3355.097 - 3360.438 Edward Gibson

In the phrase structure, it's just kind of hard to see. It's in there. It's just very, very, it's opaque.

3360.878 - 3381.763 Edward Gibson

Technically, I think phrase structure grammar is mappable to dependency grammar. And vice versa. And vice versa, yeah. But there's like these little labels, S, NP, VP. Yeah. For a particular dependency grammar, you can make a phrase structure grammar which generates exactly those same sentences, and vice versa. But there are many phrase structure grammars for which you can't really make a dependency grammar.

3381.943 - 3395.069 Edward Gibson

I mean, you can do a lot more in a phrase structure grammar, but you get many more of these extra nodes, basically. You can have more structure in there. And some people like that, and maybe there's value to that. I don't like it.
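The mapping described above can be sketched for a single parse. This is a minimal illustration rather than anything from the episode: the sentence, its head assignments, and the helper name `to_brackets` are all assumptions, and the sketch only handles trees whose dependencies do not cross. Each head is grouped with everything that depends on it, which yields a bracketed phrase structure without extra category labels.

```python
# Toy dependency parse: each word stores the index of its head (None = root).
# The sentence and its analysis are illustrative assumptions.
WORDS = ["the", "dog", "entered", "the", "room"]
HEADS = [1, 2, None, 4, 2]


def to_brackets(words, heads):
    """Group every head with its dependents, giving a bracketed phrase structure."""
    children = {i: [] for i in range(len(words))}
    root = None
    for i, h in enumerate(heads):
        if h is None:
            root = i
        else:
            children[h].append(i)

    def phrase(i):
        if not children[i]:
            return words[i]
        # Keep surface order: the head and its dependents sorted by position.
        parts = sorted(children[i] + [i])
        inner = " ".join(words[j] if j == i else phrase(j) for j in parts)
        return "[" + inner + "]"

    return phrase(root)


print(to_brackets(WORDS, HEADS))  # [[the dog] entered [the room]]
```

Going the other way needs a choice of head for every phrase, which is one reason the phrase structure side can carry nodes that have no dependency counterpart.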

3395.809 - 3414.657 Lex Fridman

Well, for you, so we should clarify, so dependency grammar is just, well, one word depends on only one other word, and you form these trees, and it really puts priority on those dependencies, just as a tree in which you can then measure the distance of the dependency from one word to the other.

3414.997 - 3433.485 Lex Fridman

That can then map to the cognitive processing of the sentences, how easy it is to understand and all that kind of stuff. So it just puts the focus on the mathematical distance of dependence between words. So it's just a different focus.

3434.346 - 3434.786 Edward Gibson

Absolutely.

3435.206 - 3454.035 Lex Fridman

Just continue on the thread of Chomsky because it's really interesting. Because as you're... discussing disagreement, to the degree there's disagreement, you're also telling the history of the study of language, which is really awesome. So you mentioned context-free versus regular. Does that distinction come into play for dependency grammars?

3454.376 - 3481.553 Edward Gibson

No. Okay. Not at all. I mean, regular languages are too simple for human languages. It's a part of the hierarchy, but human languages in the phrase structure world are at least context-free, maybe a little bit more, a little bit harder than that. So there's something called context-sensitive as well, where you can have, like this is just the formal language description,

3482.753 - 3488.718 Edward Gibson

In a context-free grammar, you have one... This is like a bunch of formal language theory we're doing here.

3488.978 - 3489.458 Lex Fridman

I love it.

3489.558 - 3509.931 Edward Gibson

Okay. So you have a left-hand side category, and you're expanding to anything on the right. That's context-free. The idea is that that category on the left expands independent of context to those things, whatever they are on the right. It doesn't matter what. And a context-sensitive... says, okay, I actually have more than one thing on the left.

3510.132 - 3528.999 Edward Gibson

I can tell you only in this context, you know, maybe you have like a left and a right context or just a left context or a right context. Having two or more things on the left tells you how to expand those things in that way. Okay, so it's context-sensitive. A regular language is just more constrained. And so it doesn't allow anything on the right.

3529.439 - 3548.628 Edward Gibson

It allows very... Basically, it's one very complicated rule is kind of what a regular language is. And so it doesn't have any... I was going to say long-distance dependencies. It doesn't allow recursion, for instance. There's no recursion. Yeah, recursion is where you... Human languages have recursion. They have embedding.
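The three kinds of rule just described can be told apart by their shape alone. The sketch below is a hedged illustration, not a full treatment of the hierarchy: the function names, the uppercase-means-category convention, and the choice of right-linear rules to stand in for "regular" are all assumptions made here.

```python
def is_nonterminal(symbol):
    # Assumed convention: categories are uppercase (S, NP), words are lowercase.
    return symbol.isupper()


def classify_rule(lhs, rhs):
    """Classify one rewrite rule lhs -> rhs, where both sides are lists of symbols."""
    # More than one thing on the left means the expansion depends on context.
    if not (len(lhs) == 1 and is_nonterminal(lhs[0])):
        return "context-sensitive"
    # Right-linear shape: only words, optionally followed by one category at the end.
    if all(not is_nonterminal(s) for s in rhs[:-1]):
        return "regular"
    # A single category on the left expanding to anything at all on the right.
    return "context-free"


print(classify_rule(["S"], ["a", "S"]))       # regular
print(classify_rule(["S"], ["a", "S", "b"]))  # context-free
print(classify_rule(["a", "B"], ["a", "c"]))  # context-sensitive
```

A whole grammar counts as regular only if every one of its rules has the restricted shape, so this per-rule check is just the first step.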

3549.129 - 3555.613 Edward Gibson

And you can't... Well, it doesn't allow center-embedded recursion, which human languages have, which is what... Center-embedded recursion.

3555.633 - 3556.954 Lex Fridman

So within a sentence? Within a sentence.
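Center-embedded recursion is often illustrated with the pattern of n opening items followed by exactly n closing items. The sketch below assumes that abstraction (reading each `a` as a subject and each `b` as its verb is only an analogy introduced here) and uses the recursive rule S -> a S b. A plain regular expression such as `a*b*` can check the order of the symbols but cannot require the two counts to match.

```python
import re


def center_embedded(s):
    """Recognize n a's followed by n b's with the recursive rule S -> a S b | empty."""
    if s == "":
        return True
    # Peel one matching pair off the outside and recurse on the middle.
    return s[0] == "a" and s[-1] == "b" and center_embedded(s[1:-1])


print(center_embedded("aabb"))                    # True
print(center_embedded("aab"))                     # False
print(re.fullmatch(r"a*b*", "aab") is not None)   # True: order checked, counts not
```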

3556.974 - 3573.75 Edward Gibson

Yeah, within a sentence. So here we're going to get to that. But the formal language stuff is a little aside. Chomsky wasn't proposing it for human languages even. He was just pointing out that human languages are context-free. Because that was kind of stuff we did for formal languages. And what he was most interested in was

3574.744 - 3589.552 Edward Gibson

human language, and that's like, the movement is where we, where he sort of set off on the, I would say, a very interesting, but wrong foot. It was kind of interesting, it's a very, I agree, it's a very interesting history. So he proposed this,

3590.092 - 3616.995 Edward Gibson

multiple theories in '57 and then '65. They all have this framework, though, which was phrase structure plus movement, different versions of the phrase structure and the movement. The '57, these are the most famous original bits of Chomsky's work. And then '71 is when he figured out that those lead to learning problems, that there's cases where a kid could never figure out which rule, which set of rules was intended. And so then he said, well, that means it's innate.

3617.475 - 3638.022 Edward Gibson

It's kind of interesting. He just really thought the movement was just so obviously true that he couldn't... He didn't even entertain giving it up. It's just obvious. That's obviously right. And it was later where people figured out that there's all these subtle ways in which things which look like generalizations aren't generalizations across the category.

3638.262 - 3659.812 Edward Gibson

They're word-specific, and they kind of work, but they don't work across various other words in the category. And so it's easier to just think of these things as lexical copies. And I think he was very obsessed. I don't know. I'm just guessing. He really wanted this story to be simple in some sense. And language is a little more complicated in some sense. He didn't like words.

3660.792 - 3681.219 Edward Gibson

He never talks about words. He likes to talk about combinations of words. And words are... You know, if you look up a dictionary, there's 50 senses for a common word, right? The word take will have 30 or 40 senses in it. So there'll be many different senses for common words. And he just doesn't think about that. He doesn't think that's language. I think he doesn't think that's language.

3681.259 - 3699.045 Edward Gibson

He thinks that words are distinct from combinations of words. I think they're the same. If you look at my brain in the scanner while I'm listening to a language I understand, And you compare, I can localize my language network in a few minutes, in like 15 minutes.

3699.145 - 3715.795 Edward Gibson

And what you do is I listen to a language I know, I listen to, you know, maybe some language I don't know, or I listen to muffled speech, or I read sentences and I read non-words. Like I can do anything like this, anything that's sort of really like English and anything that's not very like English. So I've got something like it and not, and I got to control.

3716.175 - 3743.701 Edward Gibson

And the voxels, which is just, you know, the 3D pixels in my brain that are responding most, is a language area. And that's this left lateralized area in my head. And wherever I look in that network, if you look for the combinations versus the words, it's everywhere. It's the same. That's fascinating. And so it's like hard to find, there are no areas that we know. I mean, that's,

3745.342 - 3761.436 Edward Gibson

It's a little overstated right now. At this point, the technology isn't great. It's not bad. But we have the best way to figure out what's going on in my brain when I'm listening or reading language is to use fMRI, functional magnetic resonance imaging. And that's a very good localization technique.

3761.596 - 3787.65 Edward Gibson

method, so I can figure out where exactly these signals are coming from, pretty, you know, down to millimeters, you know, cubic millimeters or smaller. Okay, very small. We can figure those out very well. The problem is the when. Okay, it's measuring oxygen, okay? And oxygen takes a little while to get to those cells, so it takes on the order of seconds. So I talk fast, I probably listen fast, and I can probably understand things really fast. So a lot of stuff happens in two seconds.

3787.89 - 3807.948 Edward Gibson

And so to say that we know what's going on, that the words right now in that network, our best guess is that whole network is doing something similar, but maybe different parts of that network are doing different things. And that's probably the case. We just don't have very good methods to figure that out right at this moment. And so...

3809.031 - 3823.804 Lex Fridman

Since we're kind of talking about the history of the study of language, what other interesting disagreements, and you're both at MIT, or were for a long time, what kind of interesting disagreements there, tension of ideas are there between you and Noam Chomsky?

3823.824 - 3849.078 Lex Fridman

And we should say that Noam was in the linguistics department, and you, I guess for a time, were affiliated there, but primarily the brain and cognitive science department, which is another way of studying language, and you've been talking about fMRI. Is there something else interesting to bring to the surface about the disagreement between the two of you, or other people in the discipline?

3849.098 - 3877.271 Edward Gibson

Yeah, I mean, I've been at MIT for 31 years, since 1993, and Chomsky's been there much longer. So I met him, I knew him, I met when I first got there, I guess, and we would interact every now and then. I'd say our biggest difference is our methods. And so that's the biggest difference between me and Noam, is that I gather data from people.

3877.871 - 3901.007 Edward Gibson

I do experiments with people and I gather corpus data, whatever, whatever corpus data is available. And we do quantitative methods to evaluate any kind of hypothesis we have. He just doesn't do that. So, you know, you, you know, he has never once been associated with any experiment or corpus work ever. And so it's all thought experiments. It's his own intuitions.

3901.147 - 3918.299 Edward Gibson

So I just don't think that's the way to do things. Yeah. That's an across-the-street-there-across-the-street-from-us kind of difference between Brain and CogSci and linguistics. I mean, some of the linguists, depending on what you do, more speech-oriented, they do more quantitative stuff.

3918.439 - 3929.676 Edward Gibson

But in the meaning, words and, well, it's combinations of words, syntax, semantics, they tend not to do experiments and... and corpus analyses.

3929.776 - 3943.136 Lex Fridman

That's the biggest method. But the method is a symptom of a bigger approach, which is sort of a psychology philosophy side on Noam, and for you, it's more sort of data-driven, sort of almost like a mathematical approach.

3943.576 - 3964.084 Edward Gibson

Yeah, I mean, I'm a psychologist. So I would say we're in psychology. Brain and Cognitive Science is MIT's old psychology department. It was a psychology department up until 1985, and it became the Brain and Cognitive Science department. And so, I mean, my training is math and computer science, but I'm a psychologist. I mean, I don't know what I am.

3964.324 - 3966.767 Lex Fridman

So data-driven psychologists, well, you are.

3966.787 - 3973.415 Edward Gibson

I am what I am, but I'm happy to be called a linguist, I'm happy to be called a computer scientist, I'm happy to be called a psychologist, any of those things.

3973.755 - 3983.167 Lex Fridman

But in the actual, like how that manifests itself outside of the methodology is like these differences, these subtle differences about the movement story versus the lexical copy story.

3984.067 - 3999.171 Edward Gibson

Those are theories. But I think the reason we differ in part is because of how we evaluate the theories. And so I evaluate theories quantitatively, and Noam doesn't. Got it.

3999.591 - 4015.989 Lex Fridman

Okay, well, let's explore the theories that you explore in your book. Let's return to this dependency grammar framework of looking at language. What's a good justification why the dependency grammar framework is a good way to explain language? What's your intuition?

4016.621 - 4032.12 Edward Gibson

So the reason I like dependency grammar, as I've said before, is that it's very transparent about its representation of distance between words. So it's like, all it is, is you've got a bunch of words, you're connecting together to make a sentence. And...

4033.895 - 4053.66 Edward Gibson

a really neat insight, which turns out to be true, is that the further apart the pair of words are that you're connecting, the harder it is to do the production, the harder it is to do the comprehension. It's harder to produce, harder to understand when the words are far apart. When they're close together, it's easy to produce and it's easy to comprehend. Let me give you an example. Okay, so,

4054.744 - 4081.908 Edward Gibson

We have, in any language, we have mostly local connections between words, but they're abstract. The connections are abstract, they're between categories of words. And so you can always make things further apart if you add modification, for example, after a noun, so a noun in English comes before a verb, the subject noun comes before a verb, and then there's an object after, for example.

4081.948 - 4090.092 Edward Gibson

So I can say what I said before, you know, the dog entered the room or something like that. So I can modify dog. If I say something more about dog after it, then what I'm doing is,

4090.992 - 4117.323 Edward Gibson

indirectly, I'm lengthening the dependence between dog and entered by adding more stuff to it. So, just to make it explicit here, if I say, the boy who the cat scratched cried, we're going to have a mean cat here. And so what I've got here is the boy cried. It would be a very short, simple sentence. And I just told you something about the boy.
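The lengthening effect can be counted directly. In the sketch below, the head assignments for both sentences are assumptions made for illustration (one plausible dependency analysis, not one given in the conversation), and `dependency_lengths` is a hypothetical helper that measures how many word positions separate each word from its head.

```python
def dependency_lengths(words, heads):
    """Distance in word positions between each word and its head (the root is skipped)."""
    return {
        (words[i], words[h]): abs(i - h)
        for i, h in enumerate(heads)
        if h is not None
    }


# Assumed analysis: "the" -> "boy", "boy" -> "cried", "cried" is the root.
short = dependency_lengths(["the", "boy", "cried"], [1, 2, None])

# Assumed analysis: the relative clause "who the cat scratched" hangs off "boy".
long = dependency_lengths(
    ["the", "boy", "who", "the", "cat", "scratched", "cried"],
    [1, 6, 5, 4, 5, 1, None],
)

print(short[("boy", "cried")])  # 1
print(long[("boy", "cried")])   # 5
```

The connection between boy and cried is the same in both sentences; only the material inserted between the two words changes its length.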
