
Sean Carroll's Mindscape: Science, Society, Philosophy, Culture, Arts, and Ideas
301 | Tina Eliassi-Rad on AI, Networks, and Epistemic Instability
Mon, 13 Jan 2025
Big data is ruling, or at least deeply infiltrating, all of modern existence. Unprecedented capacity for collecting and analyzing large amounts of data has given us a new generation of artificial intelligence models, but also everything from medical procedures to recommendation systems that guide our purchases and romantic lives. I talk with computer scientist Tina Eliassi-Rad about how we can sift through all this data, make sure it is deployed in ways that align with our values, and how to deal with the political and social dangers associated with systems that are not always guided by the truth.

Support Mindscape on Patreon.

Blog post with transcript: https://www.preposterousuniverse.com/podcast/2025/01/13/301-tina-eliassi-rad-on-al-networks-and-epistemic-instability/

Tina Eliassi-Rad received her Ph.D. in computer science from the University of Wisconsin-Madison. She is currently Joseph E. Aoun Chair of Computer Sciences and Core Faculty of the Network Science Institute at Northeastern University, External Faculty at the Santa Fe Institute, and External Faculty at the Vermont Complex Systems Center. She is a fellow of the Network Science Society, recipient of the Lagrange Prize, and was named one of the 100 Brilliant Women in AI Ethics.

Web site
Northeastern web page
Google Scholar publications
Wikipedia

See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.
Hello, everyone, and welcome to the Mindscape Podcast. I'm your host, Sean Carroll. There's a kind of history myth that sometimes gets promulgated in, I don't know, elementary schools, maybe, or just folk tales we tell each other, according to which, when the first European explorers landed in the New World, the indigenous people saw them and thought, oh my goodness, these are gods coming to visit us, and we need to worship them, and they're too powerful to deal with. It turns out nothing like that is actually true. This is a story that the Europeans made up after the fact to make themselves look good and to justify some of the things that happened.
Nowadays, we are being faced with a new set of visitors from another world, namely artificial intelligences, whether it's large language models or some other kind of constructed program that in many ways can act human, but has a different set of capacities and we're learning to deal with them.
And unlike the myth of the European explorers landing in the Western Hemisphere, today there are a bunch of people who quite literally are very willing to say that these are gods coming to deal with us. I know there's also plenty of skepticism out there, but there are people who think that AIs are not only going to reach human-level intelligence and agency, but go well beyond that: superhuman, godlike creatures that we're going to have to deal with. I am myself not of that opinion. I do not think that that is actually what is going on. But just like the landing explorers, AIs do have different capacities than we do. They're trained, of course. They're designed.
They're made to, in many ways, act very human. But they're really not. They're thinking in a different way. They're capable of some things much better than we are and other things not nearly as good as we are. So how do we think about this?
This world in which interacting with AIs, interacting with computerized systems more broadly, is going to be a crucially important part of how we live our lives. Today's guest is Tina Eliassi-Rad, who is a computer scientist whose work spans the space, and this is why I really like it, from very technical stuff, you know, how do you better detect certain nodes or communities in an abstract network that you have embedded as some sort of data, but then also the human side of how you deal with this stuff, how these computer systems, how these AIs are going to affect our lives and how we're going to affect them, all the way up to human-AI co-evolution.
Once we build these systems and then we interact with them and then we use them to decide how to go shopping or decide how to find a romantic partner, guess what? That affects who we are, how we live our lives, and the survival strategies we're going to have to move forward in this very brave new world. Again, there are many positive aspects here. There are things that, you know, we don't want to do, we don't want to bother doing, or that are hard for us to do as human beings, that we can outsource to the AIs. There are other ways in which it's very dangerous. The biases, the bad things that we have in our own brains can be inherited by the AIs, and they can have new failure modes that we human beings don't have.
It's a world that is changing super duper rapidly, obviously, as a lot of research is coming in and a lot of influences are out there. It's not necessarily all about writing the best program. Some people who are very good at writing programs want to optimize for making the most money, right?
And we have to take that into consideration when we consider what to do, how to regulate, how to control, how to optimize for our own actual goals, rather than just seeing what happens next and living with the consequences. So the more informed we are about what the possibilities are and how to deal with them, the more we'll be able to do that. So let's go.
Tina Eliassi-Rad, welcome to the Mindscape Podcast.
Thank you. Thank you for having me.
Normally, I like to start the conversation with someone talking about the most basic stuff, the things everyone knows about. For your stuff, I kind of feel like going in reverse order. We'll end with the fun stuff about AI and democracy and things like that, but let's start with...
understanding graphs and networks and things like that, especially using neural networks to understand things that human brains can't quite wrap their minds around. So what is the most general way of stating what it is that you're trying to understand when it comes to thinking about graphs and networks?
Well, when you're trying to understand the phenomena, usually you have multiple entities, like multiple people, and they have relationships with each other, right?
And so when we're looking at graph, like machine learning with graphs or graph mining, we're trying to find those, what we're calling relational dependencies, that like the probability of you and me being friends, given that we both like Apple products, is greater than the probability of you and me just being friends.
Or the probability of me liking Apple products, given that we're friends, is greater than the prior probability of each of us liking an Apple product. So the second one is that we are friends, you influence me, and so I like Apple products and I buy Apple products, or I buy this headphone, right? Headset. And the first one is that because we like similar things, we become friends.

This is the notion of homophily, or birds of a feather flock together. But in a nutshell, for people who work on machine learning on graphs, and network scientists who are interested in understanding phenomena, network science is an interdisciplinary discipline, it is about these relational dependencies. Like, what can we find? What are the patterns?
What are the anomalies in the relationships that get formed?
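To make the relational-dependency idea concrete, here is a minimal Python sketch that compares the probability of a friendship between two people who share a preference against the baseline probability of a friendship. The toy graph and the likes_apple attribute are entirely made up for illustration.

```python
import itertools
import networkx as nx

# Toy social network with a made-up binary attribute (hypothetical data).
G = nx.Graph()
G.add_edges_from([("ana", "bo"), ("bo", "cy"), ("ana", "cy"),
                  ("cy", "dee"), ("dee", "ed"), ("ed", "fay")])
likes_apple = {"ana": True, "bo": True, "cy": True,
               "dee": False, "ed": False, "fay": True}

pairs = list(itertools.combinations(G.nodes, 2))
edges = {frozenset(e) for e in G.edges}

# P(edge): fraction of all node pairs that are connected.
p_edge = sum(frozenset(p) in edges for p in pairs) / len(pairs)

# P(edge | both like Apple): same fraction, restricted to pairs sharing the attribute.
shared = [p for p in pairs if likes_apple[p[0]] and likes_apple[p[1]]]
p_edge_given_shared = sum(frozenset(p) in edges for p in shared) / len(shared)

print(f"P(edge) = {p_edge:.2f}")
print(f"P(edge | both like Apple) = {p_edge_given_shared:.2f}")
```

If the conditional probability clearly exceeds the baseline, that is the relational dependency; distinguishing homophily (similar people become friends) from influence (friends become similar) would additionally require data over time.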
So for the audience who wasn't there, what Tina is not telling you is that we spent 10 minutes before the podcast struggling with our Apple products to make the recording work, but we still use them. So, you know, I guess take whatever lessons from that. Okay, but I guess in the current era, the issue is you have...
Too much data, or at least in principle, one would like to imagine having too much data. There's like so much stuff, right? Is a large part of the worry like how to pick and choose what to pay attention to, what to draw connections between?
Yeah, there's some of that. I would say that, so I have this thing I call the paradox of big data, which is like there's a lot of data, but to predict specifically for what Tina wants, it's difficult, right? You don't have maybe as much information about Tina.
Now, if Tina belongs to some majority group, then maybe you can aggregate from the majority and say, well, Tina is part of this flock, and so Tina will like whatever this flock likes, right? But really, I feel like the problem these days is more about exploitation, going with things that are popular, than exploration, right?
Like in the past, we would go to the library or the bookstore looking for a book, and you would find other things. And those were, you know, the cherry on top of the cake, right? Like, oh yeah, I found this. And now we're really not getting that, right?

So when you use all these recommendation systems, whether it's Google or Amazon, et cetera, they oftentimes show you what is popular or what they believe you would like, right? So in a past life, I worked at Lawrence Livermore National Laboratory, which is a physics laboratory.

And when I would do searches there, and this is many years ago, I would get more physics books than when I lived elsewhere. They wouldn't show me as many physics books, right, just based on the location, the zip code. And so there's some of that that's going on. And I feel like that is more of the problem, of not really serving the individual or exploring
as much as possible.
So thinking though, like purely like a mathematician or a computer scientist, faced with these big networks, how should we think about them? What are the tools that we use to tease out what are the important relationships?
Yeah, so, you know, it depends on what kind of network it is, right? So in social networks, for example, we know that there are two dominant processes that form social networks. One is closing of what we're calling wedges. So if I am friends with you and you are friends with Jennifer, then I will become friends with Jennifer, right? We close that triangle.
And in fact, if you and I have, for example, many common friends, or let's say me and Jennifer in my example, we have many common friends and we are not friends, then there is something going on: there were lots of opportunities for us to become friends, but we chose not to become friends, right?
Now, there's also, of course, partial observability in that, like, maybe I didn't observe it, right? However big your data is, you're not omniscient, you don't see things, right? But we do expect that friend of a friend is also a friend. That's one. The other one is this notion of preferential attachment, right? That everybody wants to connect to a star.
So basically those are the two big patterns, and then you look at deviations from that. There's a piece of work that was done a while back by Jon Kleinberg at Cornell, a very well-known computer science professor. Think Facebook, for example. Who is your romantic partner on Facebook?
And he and his colleagues showed that basically you are the center of a flower and you have petals around you. These petals could be your high school buddies or college buddies, etc. They have just more triangles in them. And people who fall outside of these petals and have a lot of connections to these petals are either your sibling or your romantic partner.
That is, you are introducing them to other facets of your life. And they showed that when the establishment of those connections stops, it's a leading indicator that you will break up.
Uh-oh.
Yeah. So you were talking about which connections to pay attention to, right? It's like, so those are some of the things that are fun when you look at social networks. I mean, biological networks are totally different. So in biological networks, it's a whole other ball of wax. There's not like, you're not looking for common friends.
You're looking more for like complementarity between different proteins that serve some function.
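As a rough illustration of the wedge-closing idea, here is a short networkx sketch that ranks unconnected pairs by how many friends they share. This is plain common-neighbor counting on a standard toy graph, not Kleinberg's actual dispersion measure; pairs with many shared friends and no edge are either candidates for the triangle to close or, as Tina says, a sign that something else is going on.

```python
import networkx as nx

# A classic small social network that ships with networkx, used here as a stand-in.
G = nx.karate_club_graph()

# Score every non-adjacent pair by the number of common neighbors (open wedges).
scores = []
for u in G:
    for v in G:
        if u < v and not G.has_edge(u, v):
            common = len(list(nx.common_neighbors(G, u, v)))
            scores.append((common, u, v))

# The pairs with the most shared friends but no edge between them.
for common, u, v in sorted(scores, reverse=True)[:5]:
    print(f"nodes {u} and {v}: {common} common friends, not connected")
```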
So it's interesting because it seems like an attempt to go from syntax to semantics in some sense, right? You're going from structure to meaning, broadly speaking.
You're trying to understand what is going on, what is the underlying process that is happening in this network and why these links exist. Now, the one thing that makes studying of graphs and networks really interesting is that it is not a closed world. So just because you didn't see a link between me and Jennifer doesn't mean that we're not friends.
And so for machine learning, where you need both positive examples and negative examples, which negative examples you pick becomes difficult, because the edges or the links or the friendships that don't exist may not exist because they don't want to be friends, or for other reasons. And so what the negative examples are becomes an important aspect of things.
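Here is a minimal sketch of the usual workaround for that open-world problem: sample random non-edges and label them as negatives for link prediction. The graph is again the built-in karate-club toy, and the key assumption is flagged in the comments, since an absent edge may simply be an unobserved friendship.

```python
import random
import networkx as nx

G = nx.karate_club_graph()
rng = random.Random(0)

positives = list(G.edges)
nodes = list(G.nodes)

# Sample as many non-edges as there are edges and *assume* they are negatives.
# This is only a modeling choice: a missing edge may be an unobserved friendship.
negatives = set()
while len(negatives) < len(positives):
    u, v = rng.sample(nodes, 2)
    if not G.has_edge(u, v):
        negatives.add((min(u, v), max(u, v)))

print(f"{len(positives)} positive pairs, {len(negatives)} presumed-negative pairs")
```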
Well, or as you were giving the example, I was thinking, I don't interact with my romantic partner on social media that much because we interact in real world. Like we don't need that.
Indeed, indeed. So there are lots of assumptions being made, obviously, in terms of how the network is being observed. And in fact, this is one of the big differences between the computer scientists that study graphs and the network scientists, who are typically physicists or social scientists. The network scientists are like, well, there's a distribution and this graph fell from it, versus the machine learning, graph mining folks, who typically don't question where the graph came from. They're like, oh, here's data, and they run with it, right?
And it just boggles the mind, because you should think about where this data came from, how it was collected, what were maybe the errors in collecting it. And in fact, this touches on a sore point for me, because what happens is they don't question the data, right? They just feed it into their machine learning, AI models. And then on the other end, they don't measure any uncertainty.
So if you have something like, let's say, a social network that you've observed, there's all this stuff about representation learning, right? Where basically I take Tina in the social network and I represent her as a vector in a Euclidean space, right? Maybe a vector with 16,000 elements in it. So the cardinality is 16,000, and there's no uncertainty.

They're like, no, Tina falls exactly here, and it just doesn't make sense at all, right? And so with those kinds of models, given that you didn't start with, okay, my data could have some noise in it, some uncertainty in it, you don't even capture the uncertainty of the model at the end.

There are lots of problems that can occur, including, for example, adversarial attacks. Or your model is just not going to be robust. Let's put it that way.
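Tina's point about missing uncertainty can be seen with a tiny experiment: a standard spectral embedding hands back one fixed vector per node with no error bars, yet if the observed graph is perturbed slightly, as if edge collection were noisy, the embedded geometry moves. This is a hedged sketch using a simple adjacency-eigenvector embedding, not any particular representation-learning method from the conversation.

```python
import numpy as np
import networkx as nx

def spectral_embedding(G, dim=4):
    # Each node becomes a row of the top-`dim` eigenvectors of the adjacency
    # matrix: a single point estimate per node, with no uncertainty attached.
    A = nx.to_numpy_array(G, nodelist=sorted(G.nodes))
    _, vecs = np.linalg.eigh(A)
    return vecs[:, -dim:]

def pairwise_distances(X):
    # Node-to-node distances are unaffected by rotations or sign flips of the embedding.
    return np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

G = nx.karate_club_graph()
base = pairwise_distances(spectral_embedding(G))

# Pretend our observation of the graph is noisy: drop ~5% of edges and re-embed.
rng = np.random.default_rng(0)
shifts = []
for _ in range(20):
    H = G.copy()
    H.remove_edges_from([e for e in H.edges if rng.random() < 0.05])
    shifts.append(np.abs(base - pairwise_distances(spectral_embedding(H))).mean())

print(f"mean change in embedded node distances under 5% edge noise: {np.mean(shifts):.3f}")
```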
Well, this sounds just like full employment for enthusiastic graduate students, right? Because how hard could it be? I mean, it could be hard, but it's very well defined, the problem that you just set out.
I mean, allow for the existence of noise in these descriptions and see how your answers change.
Yeah, I think in part, one of the reasons that folks, at least in the CS side, the computer science and the machine learning side, aren't too bothered by it these days is because we are going through this era where prediction is everything. Prediction and accuracy is everything. And so, you know, there are these benchmarks and it's basically benchmark hacking or state of the art hacking, right?
And that's basically what is going on. You know, that's the reality of it. And so there's a lot of that kind of engineering going on, as opposed to really thinking about: what is the phenomenon that I'm interested in? How is the data coming to me? What are the sources of noise? How should I take them into account? Should I even take them into account?
And what are the uncertainties in terms of the predictions that I am outputting?
Let's help the audience understand the idea of benchmark hacking, because that's probably a cool but important one. I mean, what's a benchmark and how do you hack it?
Yeah, so basically you create a bunch of data and you get buy-in from the community that these are good data sets to test a machine learning or an AI model on. And then there's a leaderboard and you want to be number one, right? And so you hack the systems that exist, or you hack your own system, you create your own, to be number one, you know, as much as possible.
And that's basically what is going on. And I like this metaphor that my colleague Barabási used: it's like there are two camps. There's a toolbox, a finite toolbox, right? And the machine learning, the AI people, the engineers, put tools into that toolbox. And because it's finite, it's very competitive. That is, my tool beats your tool, even if it's by one percent, where it's not clear if it's statistically significant or not. And I may be king for only 30 seconds
because another tool comes in, right? And then there's like the scientists on the other end that just open the toolbox and say, okay, well, what is good for whatever, you know, whatever prediction task I want to do. And then they pick a tool out of that.
And so a lot of this like benchmark hacking or state of the art hacking happens on the engineering, on the AI machine learning side, the computer science side, because you want your tool in that finite toolbox.
But on the science side, the physicist or social science side, the people who are interested in these models that create the sets of data you have, there's also, as I understand it, a lot of worry about degeneracy or overdetermination or underdetermination where very different physical models could give you essentially the same kind of graph or network. How big of a problem is that?
It is a very big problem. I mean, there are multiple angles to this. So one is, for example, because of all the hype, oftentimes people on the engineering side don't talk about the assumptions that they have made or the technical limitations of their system. Because of that, we have this reproducibility problem.
So not even a replicability problem, but a reproducibility problem, which is just the code: can I just reproduce your results with your code as you have it, right? Even with your training data, even with how you broke it up into these different folds or whatever, you know. Which is a very, very, very low bar to pass.
But that doesn't happen because there are lots of assumptions that are being made, etc. Then there's this notion of we are living through this era of big models. I want a model that has many, many, many parameters, even if I don't need all those many parameters. Or for example, maybe I do care about interpretability. That is, I want to know what the model is actually doing.
But because, again, for that one or two percentage point on the prediction side, you let go of it and you go with the bigger models. But yes, it's a big, big problem. For me, the lowest bar would be that we require, at least with federal funding, and in some of the service that I do for the federal government, I've been pushing this.
I'm not going to be a very popular person, but if you get taxpayer dollars, then in your reports to the government, you have to have a section on assumptions and technical limitations. Because the problem is, the way the peer review culture goes, if I have a technical limitations section in my paper, the reviewer will just copy and paste it and say reject, right?

But the federal government isn't going to do that, right? NSF isn't going to do that. NSF has already given you the money and you're doing the annual report. And so it has to be, come on, just be honest, right? Like, I did not test this method on biological networks, and they're very different than social networks. So, caution.
Well, this is because what you do for a living matters a lot to the real world and to money and things like that, unlike the foundations of quantum mechanics that I do. I don't need to worry about people being overly concerned with the results. They're all willing to give me a hard time anyway. Okay, so I have this sort of philosophical, mathematical problem. I don't know.
I mean, if I have a graph, a big graph, so some nodes, some edges that are relationships, and I have a different graph, are there measures of similarity between them? Like if I add one node to the graph, is it a completely different graph? Or is there a metric I could put on there? How much is that even understandable?
Yeah, I love that problem. I've thought about that problem a lot. So the issue there is that similarity is in the eye of the beholder, right? And it depends on the task itself. So similarity is an ill-defined problem. And so you can say, okay, well, I can go with something like an edit distance. Like, okay, how many new nodes do I have to add to graph number two?
And how many new edges do I have to add or remove to make it look like the other graph? And then try to solve the computationally hard problem of isomorphism. In fact, alignment, right? And in many cases, you don't need alignment, right? So, for example, you can think about two networks and you have started a process of information diffusion on it, like you started a rumor, let's say, right?
And you would just measure, like, how similarly does this rumor, the same rumor, travel through network one versus network two? And if it travels similarly, let's say, and I'm going to throw in some jargon, the stationary distribution of a random walker that is spreading this rumor becomes the same at the end, you would say the networks are similar enough, right?

And so you don't need the sizes to be exactly the same. So it could be, for example, that you have a social network of France and a social network of Luxembourg, and you start a rumor in France and in Luxembourg. And the process behaves the same way. And you would say the networks are similar, even though one is much, much bigger than the other.
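A minimal sketch of that comparison: on a connected undirected graph, the stationary distribution of a simple random walker is just each node's degree divided by twice the number of edges, and one crude way to compare graphs of different sizes is to compare the shapes of those distributions rather than the distributions themselves. The generated graphs below are stand-ins for the two countries, and the histogram signature is my own simplification, not a method from the conversation.

```python
import numpy as np
import networkx as nx

def stationary_distribution(G):
    # For a simple random walk on a connected undirected graph, the stationary
    # probability of a node is its degree divided by twice the number of edges.
    degrees = np.array([d for _, d in G.degree()], dtype=float)
    return degrees / degrees.sum()

def signature(pi, bins=20):
    # Compare graphs of different sizes by the *shape* of the distribution:
    # a normalized histogram of stationary probabilities (one crude choice).
    hist, _ = np.histogram(pi / pi.max(), bins=bins, range=(0, 1), density=True)
    return hist / hist.sum()

# Two scale-free graphs of very different sizes (stand-ins for "France" and
# "Luxembourg"), plus a random graph with no hubs for contrast.
G_big = nx.barabasi_albert_graph(2000, 3, seed=1)
G_small = nx.barabasi_albert_graph(200, 3, seed=2)
G_other = nx.erdos_renyi_graph(200, 0.03, seed=3)

sig_big = signature(stationary_distribution(G_big))
sig_small = signature(stationary_distribution(G_small))
sig_other = signature(stationary_distribution(G_other))

# Smaller distance = more similar spreading behavior, regardless of graph size.
print("big BA vs small BA:", round(np.abs(sig_big - sig_small).sum(), 3))
print("big BA vs small ER:", round(np.abs(sig_big - sig_other).sum(), 3))
```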
That makes sense. In fact, I was going to ask about when you have a big graph and you somehow coarse-grain it, right? Or, you know, you group subgroups into single nodes. You want to somehow have the feeling that it's still representing the same thing, even though you've thrown away a lot of information.
Yeah, yeah, now the problem with grouping nodes, this is a very important problem and it's been studied by lots of people. Within graphs, it's called community detection. Basically you want to group similar nodes together. Now you can have different functions that you define about what similarity there means. It could mean that these people just talk to each other more, right?
So there are more connections between them than what you would expect in a random world, right? Or just more connections between them than to other folks. Now, for this kind of community detection, Aaron Clauset, who's a professor at Colorado, showed that there's a no-free-lunch theorem there. And actually, it was Aaron Clauset and others. And I think actually Aaron was the last author.
So I think the first author is Leto Peel. But you know how it is. You usually just name your friend.
Yeah, I do know. Yeah.
My apologies to the other authors. But they showed this no-free-lunch theorem, which basically means that it is not the case that there is one particular grouping, one particular collection of nodes, that would give you the best or the true communities. You see what I mean?

Because when you are doing these groupings of nodes, you have some objective function that you're trying to maximize. And basically the idea is that there is no one peak there. So there's not one particular community that you can put Tina in and say, okay, Tina belongs here. That's where she has to sit. And so some of that becomes an issue.
But this notion of what does it mean for one network to be similar to another network has its tentacles to community detection, to clustering of nodes, and all of those are ill-defined. So it really is driven by the task at hand.
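To see the "no single best grouping" point in practice, here is a short sketch that runs two standard community-detection methods on the same small graph, compares their modularity scores, and checks where one node lands under each. It assumes a reasonably recent networkx release that includes louvain_communities; the two methods optimize related objectives and can still disagree about the grouping.

```python
import networkx as nx
from networkx.algorithms import community

G = nx.karate_club_graph()

# Two off-the-shelf methods with related but not identical objectives.
greedy = community.greedy_modularity_communities(G)
louvain = community.louvain_communities(G, seed=42)

print("greedy modularity score :", round(community.modularity(G, greedy), 3))
print("louvain modularity score:", round(community.modularity(G, louvain), 3))

node = 8  # an arbitrary node; its group can differ between the two partitions
print("greedy group for node 8 :", sorted(next(c for c in greedy if node in c)))
print("louvain group for node 8:", sorted(next(c for c in louvain if node in c)))
```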
Okay. I mean, I guess I'm spoiled by caring about what probably in your world would be the simplest possible case, because I think about the emergence of space from some set of quantum entanglements or something like that. And it sounds all very fancy and highbrow, but
Basically, something is entangled with something else if it's next to it, and there's this very simple, or very simple-minded, spatial coherence. But, of course, in social networks, I can be connected to people anywhere, and that makes it a more complicated problem.
Yeah, and that becomes what we call the small-world problem, right? Or the Kevin Bacon or the Erdős number, right? You don't have to go that far out to be connected to famous people.
And so I mean, how good are we these days at detecting real clusters, communities, figuring out what's going on just from knowing about a graph and the connections between the nodes?
I mean, for downstream tasks where you can have some, let's say, confusion matrix, where you can tally true positives, false positives, true negatives, false negatives, we're actually very good at it. But if it's about, okay, I found these communities, and do these communities make sense?
It kind of breaks down into whether they're like hard clustering where you put Tina into just one community or you put Tina into multiple communities. And then there's a little bit of just like eyeballing it in a way. If you do not have this downstream task that you can say, okay, here are the true positives, here are the false positives, and so on and so forth.
But in many cases, it's difficult to place a person in a social network only in one community because people are multifaceted.
Right. But you started with an example of being given recommendations by Amazon or whatever, and sometimes the algorithm fails because it's not picking up our individual idiosyncrasies. It's just giving us the most popular thing. Does that tie in to the well-known problem of polarization or extremization of network recommendations?
Like everyone is pushed to some slightly more extreme set of YouTube videos or Reddit posts or whatever?
I think, in part, they just want your attention. And so the objective function is such that, you know, they just want to hold your attention. And so they will show you whatever is necessary to keep your attention.
And so if they believe that like my tie to Brandon is very strong, that we have a strong relationship and Brandon found these things interesting, then they will show it to me as well to just test it, to see whether, you know, they can capture my attention. And then through that, they can show me more ads.
I guess that makes perfect sense. So like the point is, if Amazon wants to recommend things to me, it's not maximizing the chance that I want this, it's maximizing its profit.
Exactly. Exactly. And so they kind of go hand in hand. And in fact, this touches on this issue that we have written about a couple of times. There was a Nature Perspective piece a while back, and more recently an AI journal piece on this, in a way, human-AI co-evolution.
So if you think about it, when you're using Amazon, when you're using YouTube, when you're using Google, you're providing data for them. We talked about this, right? And they take that data into account and they make recommendations. Those recommendations then affect what you do in the real life. And then you go back and you provide them more training data.
And so there's this kind of feedback loop that goes on and on. And it's oftentimes not captured in terms of who's influencing who most. And one example that I like here is like think about dating apps.
There was a story recently from Stanford that like most people are meeting on online dating apps these days instead of like through college or through their friends, family, et cetera, or at the local bar. Now, those dating apps have recommendation systems, right? And based on those recommendation systems, perhaps you meet somebody, you partner up, and you have babies.
And so over time, these recommendation systems actually have an impact on our gene pool going forward.
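The feedback loop itself can be caricatured in a few lines: a recommender that suggests whatever is already popular, plus users who usually accept the suggestion, pushes activity toward a winner-take-most state, and each accepted suggestion becomes the next round's training data. This is purely a toy with made-up parameters, not a model of any real platform or of the dating-app example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_items, n_steps = 50, 5000
counts = np.ones(n_items)          # one initial "purchase" per item
accept_recommendation = 0.8        # made-up probability that the user takes the suggestion

for _ in range(n_steps):
    if rng.random() < accept_recommendation:
        # Exploitation: recommend in proportion to current popularity, and the user accepts.
        item = rng.choice(n_items, p=counts / counts.sum())
    else:
        # Exploration: the user wanders off and finds something on their own.
        item = rng.integers(n_items)
    counts[item] += 1              # the choice feeds back in as new "training data"

top_share = counts.max() / counts.sum()
print(f"share of all activity captured by the single most popular item: {top_share:.1%}")
```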
Oh, wow, okay, yeah. I had not quite gotten that far, right. Right.
Yeah, it's like, I suppose, these recommendation systems are all about exploitation and not exploration, but maybe you would say my aunt or my grandmother or my college were also all based on exploitation and not exploration, right? But there is this notion that there are these algorithms where we can't understand what they're doing.
And perhaps 100 years from now, they may influence how our genome is evolving.
Well, we are part of the world and we create the world and it reflects back on us, right? I mean, it reminds me a little bit of discussions about extended cognition theories where you count your calculator and your pad of paper and whatever is part of your brain because you keep information there, you do calculations, et cetera.
And so our environment and who we are is being increasingly populated by these artificial algorithms that we put out there.
Yeah, I don't know, like, how far do we think certain things are going? And society has to decide. Like, for example, New York Times had this article a while back about how there's a person who's trying to set up a company, an online dating company, where, like, on the first or second dates, which are usually...
you know, not very good, my avatar and your avatar will go on the date and then they will report back. And only if, you know, both avatars are happy, then on the third date, we actually go out on the date. And so like how much of actually our human behavior are these things going to take over?
So I didn't see this article. What's your actual opinion? Is there any chance that that would help?
I think, like, I'm an introvert, so I'm like, oh, and also I'm a computer scientist, I'm like, oh, this is great. Let somebody else do the dirty work. And then maybe, you know, if it's a good day, I'll get out of my cave and I'll go and talk to them. But, you know, for extroverts, they don't like it at all. My husband is an extrovert. He's like, what are you talking about?
Am I just a brain in a vat now? Like what's happening now? So I think it depends on where you are in this extrovert, introvert scale.
We should also reveal to the audience that Tina has the good or bad fortune of being married to a philosopher.
Indeed, indeed. For 30 plus years, it's been fantastic.
So, yeah, the co-evolution, I mean, I was going to get to that later, but it's so good we have to talk about it now. Co-evolution of humans and AI. My guess, when I heard that phrase, was that we were thinking more about cultural evolution, right? Memes more than genes. But of course, they're interconnected with each other.

Now that you say it, it's obvious, because our culture affects our behavior, and our behavior affects how we pass genes on to the next generation. So AI is going to be affecting the population genome of human beings.
Yeah, and I think in particular with, for example, generative AI, as it's generating content, whether it's text or video or images, there's this notion from the late Dan Dennett, who you had on your podcast, a very famous cognitive scientist, who called these generative AI models counterfeit people. He had an Atlantic article a few years back about it.
And also, people treat these generative AI systems, these counterfeit people, as if they're more objective somehow, as if they know more than me. People tend to give their agency to them. And also, these AI systems evolve faster than us. And so, not that it's a race, but they're evolving a lot quicker.
Their objective functions are different, like attention, money, et cetera, than perhaps the objective function of people, like maybe the good of the society or public good or something else than just like money or some like GDP or some measure like that.
Are we good enough that we could at least imagine some kind of new equilibria that we get into when we're tightly coupled with our AIs that, you know, there is some happier state of being we could at least aim for if we're working together well? Or is it too much in flux these days to know much about that?
I think these days it's too much in flux. But I think, for example, there are certain things that can be done to improve it. Whenever you or another human being asks me a question, perhaps I would come back with another question. I'm like, did you mean this, Sean? Or did you mean that, right?
But for example, with ChatGPT or these large language models, they never come back and say, did you mean this? The reason is that it reduces their utility, right? Me as a human being, when I ask the question, I want an answer and I want it now. Right. Or it never comes back and says, I don't know, or I'm not sure of it.
And maybe you would accept that from a human being, but you don't accept it from a large language model. You're like, oh, you're a tool. You need to tell me. I asked you about this, and I want the answer now. And, you know, so there's some of that going on. But the big tech companies could add those features to make it more equal in terms of this conversation that is going on.
But at this point, utility is winning over all these other things.
But utility is tricky. I was talking with ChatGPT or whatever the other day, and I was trying to get it to imagine, and maybe I didn't try too hard, I didn't really put that much effort into it, but I was trying to get it to imagine a character in a fictional narrative who was very insulting and who would give out some good insults. And I said, what are some good insults that I could give out?
But it wouldn't tell me. It's like, oh, no, you shouldn't give out insults. You should talk to people politely. It's clearly programmed not to go down that road.
Yes, there are actually other generative AI systems, especially for programming, that I've heard of where it tells you, okay, if you want to code X, this is how you code it. And then you code it and you're like, oh, it didn't work, you're stupid, the human says to the generative AI. And then the generative AI says to the human, you're not a good programmer.

You know, so then they get at it, it gets in a loop. But that's only for, you know, specific ones. You're absolutely right. With ChatGPT, it's not going to be that kind of antagonistic.
And I know, I mean, this is probably related to the big worry that a lot of people have had about bias in AI algorithms. I mean, if you train AI on human discourse, and human beings are biased, then of course the algorithm is going to be biased. It's not because the computer is biased. It's because you've trained it on data that is.
And is that something that your tools can help us deal with?
I mean, you can try to find biases. I mean, there's a lot of work on that, like how these large language models are sexist, misogynist. We wrote a report for UNESCO for last year's International Women's Day about how sexist and misogynist these large language models are. The problem is, whenever somebody asks me that question, they say, oh, well look, humans are biased too.
The problem is that I can hold a human accountable. I can sue a human being. Who am I going to sue? You know what I mean? And especially in America, we're very litigious. And so then this gets into accountability. And in fact, there's a lot of work in the government.
For example, our government is putting a lot of our tax dollars into trustworthy machine learning, trustworthy AI, et cetera, et cetera. And to me, it rings a little hollow because there's no accountability. Like, how can I trust you if there's no accountability? I feel like they go hand in hand. And so there's some of that going on, which is, you know, who am I going to sue?
Am I going to sue OpenAI because it's sexist and misogynist? Like one of its products is sexist and misogynist. You know, that's not the case right now.
Well, and human beings, I mean, this is an ongoing cultural flashpoint. So, I mean, there's a lot of different opinions about it. But human beings might at some point think of something to say that we know is inappropriate. And then we're smart enough or we have enough controls that we don't say it.
Is that a kind of thing that it makes sense to try to implement in the context of a large language model?
Perhaps, right? The thing is, at this point, what it gives out is what's the most probable and what it believes you will like, right? So it's a two-place function, what's probable and what you will like. But yes, you could definitely do that.
And there's this comedian, unfortunately I forget his name now, but he was saying the secret to a long marriage is to never say what comes to your mind first or second. Always say the third thing that comes to your mind, right? And this goes back to what you were just saying. Maybe you should just say this third thing, the third most probable thing.
And in fact, along those lines, usually the students who use these generative AI tools for like math problems, math homeworks, the first answer is usually wrong because a lot of the answers that have been uploaded into like Course Hero, et cetera, et cetera, they're wrong. Usually it's the second answer that's the correct answer.
Oh, that's very interesting. Is that actually true or is that like a feeling that people have?
These are just anecdotal, right? I haven't had anybody do a systematic study of this, but usually the first answer is not quite there, right?
Well, it's interesting because one of the things we discover, you discover, we in the royal we, thinking about these very, very large data sets, is that sometimes you can predict even more than maybe you thought you'd be able to. I mean, I want to ask you about this paper that you wrote about using sequences of life events to predict human lives. That sounds interesting, but also maybe scary.
Yeah.
Yeah. So in the true computer science, AI, machine learning sense, we're very good at coming up with names for our system. So we called it Life2Vec. So we're just putting your life into a vector space, whether you like it or not.
Yeah, that's okay.
But you're just a vector in this vector space. Now, basically, the idea is that if you look at these large language models, right, so they're analyzing sequences. And so as human beings, we also have a life story. That's a sequence. Right. And so I was lucky enough to work with a group of scientists in Denmark.
So if America has a surveillance capitalism, in Denmark they have surveillance socialism. So there is a department there, Department of Statistics, they call it, like Ministry of Statistics that collects information about people. And so we had information for about 6 million people who have lived in Denmark from 2008 to 2020.
And we were like, well, can we write stories for these people, in a way, and then feed them to what is the heart of these large language models, a transformer model, which is basically just the architecture of a neural network that learns association weights within some context window.

And that's what we did. But instead of, for example, the way ChatGPT goes online and gobbles up all this bad data that people have put up, all the misogynistic, sexist data, we didn't do that. We had very good data from this department of statistics, and we created our own artificial symbolic language.

And then we fed that artificial symbolic language for these six million people into a transformer model. And then we were able to predict life events. And so one of them that caught the media's eye was, will somebody between the age of 35 and 65 pass away in the next four years? And we picked that age range because that's a harder age range to predict for.
Like if you're over 65, then it's easier to predict whether you're going to pass away in the next four years. And if you're younger than 35, it's also easy the other way, right? You're unlikely to pass away. And so that's one of the things. The other prediction task was, will you leave Denmark? You know, so then you can predict for that.

But it had the same core technology as these large language models, where you have this pretraining step, where you just learn, based on the data that you have, what's likely to happen next. And then you fine-tune it for whatever prediction task you have.
What does it mean, an artificial symbolic language? Like literally a human language, or is it some logical encoding?
It's a logical encoding because the data that the Department of Statistics has in Denmark is all tables. So it is not like this kind of sequence. So then you could say, like, Tina was born in Copenhagen in December, blah, blah, blah, right? And we could generate a natural language, but that's difficult. Why would we do that?
So then we generated a vocabulary for this artificial symbolic language. And that was actually a lot of the intellectual property of the work: okay, well, how do you take these tables and then create this artificial symbolic language that you can then give to a transformer model?
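The table-to-sequence step can be illustrated with a toy version. Everything below (the records, the field names, the token scheme) is made up for illustration and is not the actual Life2Vec vocabulary or data: each registry row becomes one symbolic token, and a person's rows, sorted by year, become the "sentence" that a transformer would be pretrained on and then fine-tuned for a prediction task.

```python
# Hypothetical registry rows: (person_id, year, table, value). Entirely made-up data.
rows = [
    ("p1", 2009, "residence", "copenhagen"),
    ("p1", 2009, "occupation", "electrician"),
    ("p1", 2012, "income_decile", "4"),
    ("p1", 2015, "diagnosis", "J45"),
    ("p2", 2010, "residence", "aarhus"),
    ("p2", 2011, "occupation", "office_worker"),
]

def life_sentence(rows, person):
    # One symbolic token per table row, ordered by year: the person's "life sentence".
    events = sorted((year, table, value) for pid, year, table, value in rows if pid == person)
    return [f"{table.upper()}_{value}" for _, table, value in events]

vocab = sorted({token for pid in {r[0] for r in rows} for token in life_sentence(rows, pid)})

print(life_sentence(rows, "p1"))   # tokens like 'OCCUPATION_electrician', 'DIAGNOSIS_J45', ...
print(f"vocabulary size: {len(vocab)}")
```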
And what's the answer? Are we likely to die if we're 38 years old?
Well, the thing that we found, which was very interesting, I think, so like the accuracy in terms of the model was about like 78%, et cetera. And I think that's why people were showing a lot of interest in it. But to me, that wasn't really the takeaway.
The takeaway actually was that labor data is a very good indication of whether somebody in that age range is going to pass away in the next four years or not, because health data is very noisy and inconsistent. So even in Denmark, where they have universal health care, it's not like everybody goes to the doctor all the time and you have good data for them.
Right.
And then the other stuff was basically just which sector you were working in, right? So if you're, like, an electrician, it's not a very good thing, right? As opposed to, like, an office worker. So the labor data was actually much more helpful than the health data.
How important is it to extract causality from these relationships? Like maybe riskier-minded people just become electricians.
Yeah, maybe. Yeah, we didn't do any kind of causal stuff, right? Like a lot of the work, a lot of the hype that's happening now in AI and machine learning, they're all on the correlation side, not on the causation side. So we didn't look at that at all about what causes what. That's very difficult. And I haven't touched the field of causation in part because I'm married to a philosopher.
And so it's like, no, I ain't going there.
Because every time I try to approach the topic, I just heard nightmares. And so I haven't gone that way yet.
There are some issues there. Yeah, no, absolutely. But I guess, I mean, it's interesting. Is it too much to draw a general lesson that... By looking at these large data sets, we might find simpler indications of what we're looking for than we expected. You might have said, okay, how many calories is somebody ingesting is the important thing to look at.
But then you look at the data and you learn, no, what is their job? That's what's the important thing to look at.
Yeah, I think there's some of that. I think the best way of using this is perhaps government policy, right? When government issues a policy, and then maybe 20 years from that, if you have good data, you could see, okay, what have been some of the correlations that have come about based on this policy?

And then maybe, you know, the actual social scientists and political scientists can draw some causal diagrams from what we find. Because the one thing is, usually from the computer science, AI, machine learning side, we treat causation and correlation as if it's binary, right? As if it's a coin, this way or that way. But that is really not the case, right? It's more of a spectrum.
And so if you have a model that is producing robust predictions, there is some underlying causal model. You just don't know it. And then maybe that could steer you in the right direction for that kind of work. But we didn't look at that for this particular work.
So human beings, of course, are examples of complex systems themselves. But this raises the larger question of human beings will eventually die for whatever reason. Complex systems have their lifespans, right? Or maybe they're infinite. I don't know. But they can also change dramatically and die. And that's something else you're interested in trying to tease out in a general way.
Yeah, I am very interested in the feedback that we were talking about, and how we capture that feedback. For example, when I'm using Amazon and Amazon is making me these recommendations, and then I buy things and I tell my friends, and then all of that data goes back into Amazon: how much are my contributions or my friends' contributions amplifying what Amazon is doing?
And so there's some of that going on. And then there's also the fact that society is a complex system, and the place of these tools in these systems. So the tools that help us spread misinformation and disinformation make our society unstable, in that you're not quite sure whether what you are reading is true or not, right?

So right now, with the fires in LA, there's a lot of misinformation and disinformation going on. And it's like, who do I believe? And maybe you believe the LA Times, and you believe, you know, what you read on CA.gov and so on and so forth, but not what you're seeing on Instagram.
And so there's this notion of the place of these AI tools within our society and whether they're making our society better or worse. And by better or worse here, I mean stable versus not stable, more chaotic. And I think we can all agree that we would like to live in societies that are more stable than not, right? So there's some of that that is going on.
And I have a new project along those lines, which actually touches on philosophy, which is called epistemic instability, which is what are some stability conditions of what you know? So if you genuinely know that whales are mammals, no matter what I show you, perhaps I won't be able to convince you that a whale laid an egg. You're like a whale is a mammal and mammals do not lay eggs. Right.
And you're very sure about it, right? But then you start talking to me and to ChatGPT, and maybe you don't know something as well as you thought, right? Then you're malleable, right? Then I can change your mind. And now you have groups of people who are talking among themselves and with these generative AI tools.

And then basically you go from individuals to groups to this hypergraph notion. And what I'm interested in is, when are there phase transitions in this hypergraph in terms of what the society believes? Like maybe the society believes that vaccines are good, right? And now all of a sudden the society doesn't believe the vaccines are good.
And what are the leading indicators of those kinds of phase transitions in our society as it's being modeled by conversations formally represented as these hypergraphs?
Yeah, I mean, I guess that's a good example. I hadn't quite thought of the vaccine thing yet. The traditional example that I hear for a social phase transition is opinions about gay marriage, right, where opinion was broadly against it and somewhat rapidly changed to generally for.
But the vaccine stuff is more subtle, right? Because it's not that the whole society has gone against them, but about half or whatever, right? There's this political polarization, and there's sort of more than one consensus being built up. Is that just my impression, or is there some idea that the modern informational ecosystem lets us have these larger sub-communities that have their own sets of beliefs, different from other communities?

Yeah, I think it's the second one, in that in the past, when you did have people who tended to be on the fringe, people wouldn't hear them.
But now, even if you're on the fringe, because of the information technology that we have, you can connect to other people who are on the fringe and then you believe, oh, no, we're bigger than the fringe. We're actually in the middle. Right. And then that kind of thing spreads. Right. So that is one of the things I'm interested in.
Regarding gay marriage, one of the things that was interesting is that I was talking to a philosopher who taught for a very long time at the Ohio State University, and he was teaching ethics and issues related to gay marriage and abortion, et cetera.
And he was saying that with gay marriage, similar to what you were saying, he saw a shift in terms of opinions for or against gay marriage, mostly for, but he didn't see any change when it came to abortion. And I think that had to do with the vagueness of when is, let's call the thing a baby, right? When is the actual fetus a baby or whatever, you know?
And so, and that vagueness, because like we could all agree that maybe like the day before you're about to give birth, obviously you're not going to do anything. We all believe it's a baby. But that vagueness is something that doesn't shift the opinion on abortion so much for or against. And I like that vagueness aspect of it.
So there are certain things that are vague and maybe you will never have that kind of phase transition. And then there are certain things like the vaccine where like there are people on the fringe that our information technology allows them to connect to each other. And so it feels like a bigger thing.
And then maybe there are other aspects of information that really do make people change their mind just based on talking to other people. And so they're not as sure or as stable in their knowledge.
So I like the hypothesis that the vagueness of the proposition makes it harder to have a phase transition. How would we test that hypothesis? Is that something that we can sort of sift through the data and figure out whether or not that's on the right track?
So it's a work in progress right now for us. I'm trying to stay away from making it a psychology or a social science problem, because then you get all these confounding factors. And that's why I said it has more tentacles into philosophy, in terms of what people ought to do with their knowledge and how sure they are of their knowledge.

And so right now, the way that we're representing the knowledge, or what you know, is as vectors. Because I'm a computer scientist.
Everything's a vector. It's okay. It's all linear algebra.
Basically, how much do these vectors move in one direction versus another as you talk with others? So you can build these kind of simulations, right? Not kind of, you can build these simulations in terms of conversations and see how much the vector space shifts.
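Here is a minimal sketch of that kind of simulation, my own toy construction rather than her group's actual model: each agent's belief is a vector, each conversation pulls the agent's vector toward whoever it talked to, and a single always-available generative-AI "agent" with a fixed belief vector can drag the population average, especially for agents whose beliefs are weakly held.

```python
import numpy as np

rng = np.random.default_rng(1)
n_agents, dim, steps = 100, 8, 20000
beliefs = rng.normal(size=(n_agents, dim))
confidence = rng.uniform(0.1, 0.9, size=n_agents)   # how firmly each belief is held (made up)

ai_belief = rng.normal(size=dim)   # a fixed "counterfeit person" every agent can talk to
p_talk_to_ai = 0.3                 # hypothetical chance a conversation partner is the AI

start_mean = beliefs.mean(axis=0).copy()
for _ in range(steps):
    i = rng.integers(n_agents)
    partner = ai_belief if rng.random() < p_talk_to_ai else beliefs[rng.integers(n_agents)]
    # Less confident agents move further toward whomever they talked to.
    step = 0.1 * (1 - confidence[i])
    beliefs[i] += step * (partner - beliefs[i])

drift = np.linalg.norm(beliefs.mean(axis=0) - start_mean)
to_ai = np.linalg.norm(beliefs.mean(axis=0) - ai_belief)
print(f"drift of average belief: {drift:.2f}; remaining distance to the AI's belief: {to_ai:.2f}")
```

Tracking when such drift becomes abrupt rather than gradual, as the conversation structure changes, is one way to think about the phase transitions she describes.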
So, I mean, one thing about complex systems is they can survive a long time. Like the human body, you know, fends off attacks pretty well because it's complex enough to catch things. The other thing is that they can sort of go into this wild negative, or positive, feedback loop, I guess, and crash, right? Like the economy or something like that.

So, and maybe this question is too vague, but are we learning general-purpose lessons about complex systems concerning what features they need to be stable versus what features make them delicate?
Yeah. So there's a book by Ladyman and Wiesner. And I know that you had James Ladyman on your podcast as well. He's a philosopher at Bristol. And Karoline Wiesner is a mathematician at Potsdam now.
It's about what is a complex system. And their book, which came out I think in 2020, talked about complex systems in terms of features: how there are certain necessary features, there are certain emergent features, and then there are some functional features. For example, our human brain is a complex system, and as you were saying, if it has a shock, it adapts, and it perhaps can still function, unless the shock is catastrophic.
And so what we are not seeing, if we tie this to, for example, the AI models and how they are operating within this system, is we don't even know the role of this AI system. Like, how much instability is it causing in the system, right? How much feedback is it causing in the system? How much memory does it have? Right. Because they're evolving so quickly that it's not quite clear.
So this is an open area of study, going through these different features of a complex system and trying to see, okay, well, how do I measure it for, let's say, ChatGPT, right?
Yeah.
In fact, a lot of people say, oh, well, you know, it doesn't have a good memory, based on what I told it yesterday, kind of a thing, right? So memory is one of those features that a complex system has.
Okay, so I guess, you know, one of the important applications here that you have talked about explicitly is democracy, right? Democracy is a complex system, and democracies do fail sometimes. And I guess one way of putting the worry, or at least the interest, is that the introduction of AI as a new feature, in some sense, opens the possibility of a new instability.
It could lead to sort of a runaway disaster that destroys democracy, not to put it in too alarmist terms.
Yeah, I think where it comes in, in fact, this is how it links to my new project on epistemic instability, is that it introduces epistemic instability, right? Like when my dad was getting his PhD in America back in the 60s, the most trusted man in America was Walter Cronkite, right? If he said something, you believed him. Now we don't have such a thing, right?
We don't have a person or an institution where you say, okay, I read it here and I believe it. And then, depending on where you are on the left or the right, maybe you believe the New York Times, or you believe Fox News. And so because of that, I feel like one of the things that we need to do, if we value our democracy, is teach our kids critical thinking, right?
It's just like: don't believe everything you read or hear. Question it, right? Does it make sense? Talk to different people and make your own decision, and don't give up your agency. But that's a hard task, right? Thinking is not easy, and people don't want to think in the age of TikTok.
Well, is that true? I mean, maybe it is true. I'm certainly willing to believe that's true. But again, I always worry about comparing eras, right? Because I was a different person in the 70s, and the 70s were also a different time. But I don't know which things are common between different eras and which things are not.
Like, did we really want to think more back in the 1970s than we did in the TikTok era? I don't know.
I think there was less distraction for sure than there is now. The dopamine hit that we get by just scrolling through Instagram, TikTok, et cetera, is something that has been studied. And, you know, I'm not a psychologist or a cognitive scientist, but it's just that you let your brain go to mush and you spend hours on it, instead of maybe actually sitting quietly and thinking about a problem, which, you know, is boring.
Yeah, okay, good. So this is another aspect. That's actually nice; despite not really trying to, I think I see a bunch of threads coming together here. Technology broadly, not just AI, is giving us new ways to fulfill our own objective functions. Maybe it's a dopamine hit or whatever. But its objective function might not ultimately be our flourishing.
So there's absolutely a danger mode there.
Yeah, in fact, that's such a perfect thing. I always say to my students, what is your objective function? Because we all have an objective function, and that objective function changes over time. And perhaps if all of us just think, okay, did my objective function change from yesterday or from last month or whatever? You know, it would be helpful for society.
So as a computer scientist, as a machine learning person, I always think about objective functions. And in fact, I cannot look at a mountain range now and not think, OK, if you drop me there, will I find the peak or not? The global peak? Probably not. But, you know, like, please drop me at a nice place.
You've co-evolved with your network. That makes perfect sense to me. Yeah.
So the gradient is with me.
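Since the mountain-range metaphor is carrying the idea here, a minimal sketch of what it refers to, with a made-up one-dimensional landscape: gradient ascent climbs an objective function toward whatever peak is nearest to where you are dropped, which may or may not be the global peak.

    import numpy as np

    # Toy sketch of the "drop me on a mountain range" metaphor: gradient ascent
    # finds a nearby local peak, not necessarily the global one.
    # The landscape and all parameters are made up for illustration.
    def height(x):
        # a bumpy objective function with several peaks
        return np.sin(3 * x) + 0.5 * np.sin(7 * x)

    def gradient(x, eps=1e-5):
        # numerical derivative of the landscape
        return (height(x + eps) - height(x - eps)) / (2 * eps)

    def climb(x, step=0.01, iters=2000):
        # repeatedly step uphill from the starting point
        for _ in range(iters):
            x += step * gradient(x)
        return x

    for start in [-2.0, 0.0, 2.0]:
        peak = climb(start)
        print(f"dropped at {start:+.1f} -> climbed to x={peak:+.2f}, height={height(peak):.2f}")

Different starting points end on different peaks; only a lucky drop reaches the global maximum.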
Exactly. Exactly right. So, okay. So you've said many things about this already, but I just want to get it as clear as possible. The trust, the community of trust idea that is so central to a democracy is one of the things that is in danger of being undermined by AI, right? Like you probably saw the story about Instagram having its AI accounts.
The sassy black lesbian lady who was programmed by a bunch of people who are neither black nor lesbian, and is just pure AI. And that one was admitted, right? They said that it was AI. Do you personally worry that people are just going to mostly become friends with non-existent human beings in the long term?
I mean, as an introvert, I'm fine with it. But no, I think we see this in society now, where people aren't as good at interacting with other people, or they're not as courteous to other people, perhaps, as before. I don't know, maybe I'm at an age now where I'm like, oh yeah, people are not as courteous as they were before. But you know, the more you interact with people, the better you get at it, and the less you interact with them, the worse you get at it. And so we should put a premium on, like, oh look, Tina can actually pick up the phone and talk to somebody, as opposed to just sending a zillion emails or text messages. I think there's a value to that. And I think there is this notion of trust. Even the most introverted among us, there are a few people that we do trust. And so if it comes to a point where you trust an AI system that we don't know how it works and that is vulnerable to attacks, then that is a problem, right?
And so, in fact, this gets us to this phrase, red teaming, that we hear all the time now: oh, well, don't worry about it, they will red team it. The phrase red teaming came from the Cold War era, right? The Soviet Union was the red team, America the blue team. And there was a lot of this red-teaming and blue-teaming, for example, in cybersecurity, right?
But this phrase, red teaming, is not well defined when it comes to these generative AI systems. My friend and colleague, Professor Hoda Heidari at Carnegie Mellon, has written extensively about this, because there's no guarantee, right? You cannot guarantee that somebody cannot jailbreak ChatGPT. And jailbreaking is basically this: ChatGPT has put in some informational guardrails, right? Like, it shouldn't tell you how to rob a bank. But you can jailbreak it and it will tell you how to rob a bank. There are no guarantees. It's not like, oh, here's a theorem, the proof, QED, go home, you cannot jailbreak this.
And so if you're getting all of your information from these AI systems that we know can be manipulated, and we don't know exactly how they work, then you may not have a shared reality with other citizens. And that's, I think, the worst thing for democracy. We really do need a shared reality to be able to sustain our democracy, to hold it and not lose it.
So how do we get that? What do we do? This sounds very scary, but I'm not quite sure what to do about it.
Well, I guess as a professor, to me, it's education. I spend a lot of my time educating the general public, not just the students at my university, about how these tools work, what they're good at, what they're not good at, about not giving up their agency to these tools, and about critical thinking skills. I think that that's the way forward.
But the problem with that, of course, is that the value of getting an education is also susceptible to this loss of trust. I don't know if you saw the recent – people were getting upset because there was a poll that showed that young men were becoming less and less interested in going to college.
But then someone else pointed out that if you go into the crosstabs, if you look at the other questions that were asked, there's actually no relationship between gender and interest in going to college. It's all about Republican versus Democrat.
It's a Simpson's paradox kind of thing, where most of the young Republicans are male, and those are the ones who have become very polarized against wanting to go to college. So that's part of the problem you've been talking about, right? There's a whole new epistemic community out there that is forming, and it seems to be solidifying over time.
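A toy illustration of this Simpson's-paradox-style point, with invented numbers rather than the actual poll data: within each party, men and women show the same interest in college, but because more of the young men are Republicans, the aggregate looks like a gender gap.

    # All numbers below are made up for illustration; they are not from the poll.
    # (gender, party, count, share interested in college)
    groups = [
        ("male",   "Republican", 600, 0.40),
        ("male",   "Democrat",   400, 0.80),
        ("female", "Republican", 300, 0.40),
        ("female", "Democrat",   700, 0.80),
    ]

    def rate(rows):
        # weighted average interest across the given subgroups
        total = sum(n for _, _, n, _ in rows)
        interested = sum(n * p for _, _, n, p in rows)
        return interested / total

    for sex in ("male", "female"):
        print(f"{sex}: overall interest = {rate([r for r in groups if r[0] == sex]):.0%}")

    for party in ("Republican", "Democrat"):
        for sex in ("male", "female"):
            rows = [r for r in groups if r[0] == sex and r[1] == party]
            print(f"{party} {sex}: interest = {rate(rows):.0%}")

With these invented numbers the aggregate shows 56% for men versus 68% for women, while within each party the rates are identical; the apparent gender effect is really a party-composition effect.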
Yeah. Perhaps we should think about how we educate people, and maybe they'll see the value of education, right? That education is about enlightenment. Education is about empowering yourself, right? Education isn't a teacher just pouring knowledge into your head like water. It's about you learning about the world so that you can do better in the world.
As a teacher, I'm already at 11 on my guitar, right? I just want you to do better. And if you do better, then I will also do better. Society will do better and we will all do better, right? And so I think part of that is maybe we should rethink how we sell education.
Do you think that AI and associated technologies can be a force for good in education?
Yeah, I think so. I mean, there are certain things that I have heard. Now, there are some privacy aspects to this.
But if you are a college and you are tracking how students are doing on their homework, et cetera, let's say Tina took calculus and she didn't do very well on differential equations, and now she's taking machine learning, where they're going to talk about differential equations. You could tell Tina, oh, you know, maybe you should go brush up on differential equations, because they're going to come up. Yeah, okay. So there's some of that kind of thing, to help you. And then there's also basically personalized tutoring, where I think AI can be helpful.
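A minimal sketch of the prerequisite-reminder idea just described, with made-up course names, topics, and thresholds: compare the topics of an upcoming course against a student's past performance and flag anything worth brushing up.

    # Toy sketch; all data and the threshold are invented for illustration.
    past_performance = {                 # topic -> score from earlier coursework
        "derivatives": 0.90,
        "differential equations": 0.55,
        "linear algebra": 0.85,
    }

    upcoming_course = {
        "name": "Machine Learning",
        "topics": ["linear algebra", "differential equations", "probability"],
    }

    THRESHOLD = 0.70  # below this, suggest a review

    for topic in upcoming_course["topics"]:
        score = past_performance.get(topic)
        if score is None:
            print(f"No record for '{topic}' - consider a quick self-check.")
        elif score < THRESHOLD:
            print(f"{upcoming_course['name']} will use '{topic}' - maybe brush up (past score {score:.0%}).")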
Do you yourself use ChatGPT or something equivalent to help figure things out, to learn about things?
I use it for fun: give me a bio of Sean Carroll in the King James style. But I don't use it for any real work or anything. I don't trust it. That's the problem.
You don't trust it. I certainly don't trust it. But I did realize that there was a good use case, because I was trying to understand... You know, in mathematical things, they will often tell you true things, but you don't understand what the point is. And I was trying to understand type III von Neumann algebras.
And so I got ChatGPT to explain to me not just what the definition was, but why it was important in this particular case. And that was actually very helpful. Yeah.
Oh, that's great. Yeah, I asked it some stuff about linear algebra and matrix norms, and it was really bad at it. And I was like, wait, what? There's so much about linear algebra in the world; you should know about matrix norms.
That's the problem. There's too much. Like you just said, there's too much junk out there. In some sense, the sweet spot is when you get technical enough that all the stuff that's been written about it is sensible, but not so technical that it doesn't know about it. No one's going to make up stuff about type III von Neumann algebras. What would be the point? Exactly.
Yeah, yeah. So I guess maybe the point is, let's not teach linear algebra to kids. No, no, no, because the whole of machine learning is basically all linear algebra. And quantum mechanics also. So yeah, linear algebra, kids. That's your lesson for today from Mindscape: learn more linear algebra. I think it's the key to everything. Yeah, exactly, exactly. But it's very good at...
Basically, admin stuff. So if you show it a picture of some Google Scholar results and say, put these references into BibTeX, it does it for you. So it's good at some of that kind of admin stuff.
Yeah, I think that the weird thing is we're trying to use it for creative work, whereas the most obvious use case is for the least creative things that we don't want to do.
Indeed. Indeed.
All right. It's all very complex, and it's evolving, and there are a lot of degrees of freedom. So Tina Eliassi-Rad, thanks very much for helping us all figure it out.
Thank you. Thank you for having me on, Sean.