Riskgaming

Gary Marcus on AI and ChatGPT

Artificial intelligence has become ambient in our daily lives, scooting us from place to place with turn-by-turn navigation, assisting us with reminders and alarms, and guiding professionals from lawyers to doctors toward the best possible decisions with the data they have on hand. Domain-specific AI has also mastered everything from games like chess and Go to the complicated science of protein folding.

Since the debut of OpenAI's ChatGPT in November, however, we have seen volcanic interest in what generative AI can do across text, audio and video. Within roughly two months, ChatGPT reached 100 million users, arguably the fastest adoption ever for a new product. What are its capabilities, and, perhaps most importantly given the feverish excitement around this new technology, what are its limitations? We turn to a stalwart of AI criticism, Gary Marcus, to explore further.

Marcus is professor emeritus of psychology and neural science at New York University and the founder of machine learning startup Geometric Intelligence, which sold to Uber in 2016. He has been a fervent contrarian on many aspects of our current AI craze, the topic at the heart of his most recent book, Rebooting AI. Unlike most modern AI specialists, he is less enthusiastic about the statistical methods that underlie approaches like deep learning and is instead a forceful advocate for returning — at least partially — to the symbolic methods that the AI field has traditionally explored.

In today’s episode of “Securities”, we’re going to talk about the challenges of truth and veracity in the context of fake content driven by tools like Galactica; pose the first ChatGPT written question to Marcus; talk about how much we can rely on AI generated answers; discuss the future of artificial general intelligence; and finally, understand why Marcus thinks AI is not going to be a universal solvent for all human problems.

Episode Produced by ⁠⁠⁠⁠⁠⁠Christopher Gates⁠⁠⁠⁠⁠⁠

Music by ⁠⁠⁠⁠⁠⁠George Ko⁠⁠⁠⁠⁠⁠


Transcript

This is a human-generated transcript; however, it has not been verified for accuracy.

Danny Crichton:
Gary's going to say something interesting at any point now. I don't want to lose it.

Gary Marcus:
Always leave the recorder going even after you say goodbye. Thank you very much. It was great talking to you. And by the way, you stabbed the guy. Let me get that. Oh, no. I turned off the recorder.

Danny Crichton:
Hello and welcome to Securities, a podcast and newsletter focused on science, technology, finance, and the human condition. I'm your host, Danny Crichton. And joining me is Lux's own Josh Wolfe, along with a very special guest today. Gary Marcus is professor emeritus of psychology and neuroscience at New York University and the founder of machine learning startup Geometric Intelligence, which sold to Uber in 2016. Marcus has been a fervent contrarian on many aspects of our current AI craze, the topic at the heart of his most recent book, Rebooting AI. Unlike most modern AI specialists, he is less enthusiastic about the statistical methods that underlie artificial intelligence approaches like deep learning, and is instead a forceful advocate for returning, at least partially, to the symbolic methods that the field has traditionally explored.

Today, Gary, Josh, and I are going to talk about the challenges of truth and veracity in the context of the explosion of fake content driven by tools like Galactica, pose the first ChatGPT-written question to Gary, talk about how much we can rely on AI-generated answers, discuss the future of artificial general intelligence, and finally, understand why Gary thinks AI is not going to be a universal solvent for all human problems. Now let's get started with Josh and make sure the tape recorder's on.

Josh Wolfe:
Wait. Before we even get into the cool, important topic, we were just talking before we hit record, and it was this idea that we're looking at each other on a video screen. People are listening to us right now. But it is only a matter of time until AI is sophisticated enough to generate a podcast, and maybe even the video content with the voices correlated to it, and nobody will know the veracity of it. You had this fake attempt between Steve Jobs and Joe Rogan, which people watched. It was sort of okay, not too bad.

Gary Marcus:
It was that, and Reid Hoffman is actually doing some ChatGPT podcasts. And there's a character in them called Gary Marcus, which apparently says things that Gary Marcus might say, but without the sophistication and nuance. You get the real thing, whereas Reid Hoffman had to settle for second-rate, banal, lying sack of statistics.

Josh Wolfe:
How will we know? Because this is one of my big, burning questions. And that question is: go back 20 years to the democratization of the ability to produce content. Danny could write a blog. He could launch a million articles on TechCrunch. I could do a blog. Everybody was publishing content. And the thing that became abundant was the text that was being produced, and the thing that was scarce was the ability to find that needle in the haystack; ergo, search. And whether it was going to be Yahoo or Lookout or AltaVista or ultimately Google, that was the value. Today, the ability to produce content of all kinds has been democratized. And not just true content or opinionated content, but fake content. And it feels to me like veracity is the scarce thing, the ability to detect truth.

Gary Marcus:
That is 100% of what keeps me up at night right now.

Danny Crichton:
How do I know you're telling the truth?

Gary Marcus:
That's right. You don't. You are guessing that I'm not a simulacrum. Maybe it doesn't matter whether I'm real or not. What I think you should be worried about if you care about democracy is that the world is about to be inundated in high volume bullshit, some of which will be made by individuals. But you should especially be worried about the wholesale thing. So we've always had retail bullshit. Somebody writes a blog post that's just not true. But now, these tools allow you to write as many copies and variations on themes as you want, as fast as you want. So let's say that you would like to persuade people that vaccines are bad and you shouldn't take them. Well, you can ask one of these things.

You might want to evade the guardrails, and we can talk about that in ChatGPT per se. But you use a large language model, maybe a different one like Galactica that you could get on the dark web now, and you can say, "Write me stories about the negative consequences of vaccines and include references," and it will go ahead, and it will make up a story about something that was published in the Journal of the American Medical Association that said that only 2% of people saw a benefit, which is not true. There's no article. There's no 2%. And then you hit the refresh button, and now you've got another article, and it says it was the Lancet, and it was 4%. And you can write a little macro to do this. In a few minutes, you can have 20 or 30 or 50 or 100 stories.

And yes, it's true that if you're not Joe Rogan and you disseminate this, not as many people are going to listen. But if you can disseminate millions of copies, some of them will stick. It's just going to change the dynamics of veracity. And I think this is going to play into the hands of fascism. It's also going to make search engines less useful, because the chances that when you look for something you actually get what's real and not garbage are going to change, and not in a good way.

Danny Crichton:
I asked ChatGPT, what is the most controversial question you would ask Gary Marcus on a podcast? And it said, Gary, in your work as a researcher and entrepreneur in the field of AI and machine learning, you've been known for being critical of deep learning and advocating for more symbolic and rule-based approaches. Given the recent advancements and success of deep learning in various applications, how do you reconcile your stance with the current trend in the industry?

Gary Marcus:
It's not a bad question. I would say that it misrepresents me in a way that I'm frequently misrepresented. It's not unique to Chat. It's, by the way, the first time Chat's ever proposed a question to me. So there's some landmark in all of that. What I have actually proposed are hybrid systems that are a mix of symbolic and pure deep learning. And it doesn't get that nuance. It thinks that I'm purely advocating for symbolic. Then again, Yann LeCun has made that mistake a few times even after I've corrected him. So it's not unique in the history of society that such a question should be asked. How would I reconcile it? I think it's actually an interesting question. What I would say is that we have made a lot of partial progress, but that a lot of partial progress is an illusion if the real goal is artificial general intelligence. What we have now is this interesting technology that is a jack of all trades but master of none. There's nothing, I think, that you can fully count on it for.

The best applications are actually things like coding, where you have a professional whose whole job is to debug garbage. This is what you do as a coder. You're like, "Ah, shit. I left out the parenthesis here and it's caused me all these problems an hour later. And now, I have to go back and figure out what that problem was." And so people who code for a living, if they're any good at it, are good at debugging. So having some code that is imperfect, they're already accustomed to that and they can work with it. But imagine, let's say you're a journalist and you say, "I'm going to get Chat to write an article." Or I'm not sure it was actually Chat that did this for CNET, but they'll have a large language model. You're not as accustomed to the kinds of errors that it makes and you wind up in trouble.

In general, these systems are not reliable. They're brilliant in one way, which is we've always had very narrow AI. So AlphaGo, you can't go to AlphaGo and say, "Okay. Now I want you to fold proteins." That's actually a different system even though they're kind of nomologically similar or something like that. There are some shared mechanisms. There's a lot of mechanisms that are not shared. Most AI we've ever seen before is like it does one thing. Sometimes it does that one thing very well. So a navigation system gets you very well from Point A to Point B. Chat will do an impression of anything. It will pretend to debug your code. It will pretend to write a letter. It will pretend to write a biography. But the problem is it does none of it in a trustworthy way.

So I don't know if you've experimented with having Chat write your biography, but most people's experience when they do is it makes up a bunch of plausible stuff. So it makes up a college, but you didn't go to that college. It makes up a research specialty if you're a researcher, but it's just in the general area. Lee Cronin posted about this. It made up different areas of chemistry than he actually worked in, some of which weren't really real areas of chemistry, or at least nobody would describe them that way. It made up references as if he had worked in these areas of chemistry that he hadn't. This is a pretty typical thing. So we have this, as I said, master-of-none AI. It's kind of general but it's not very good. The way I reconcile my love for partly symbolic systems with the apparent success of these is that these things approximate almost anything. But approximating something is not the same thing as solving it.

Danny Crichton:
One of the things that you mentioned before was that in some of these responses from ChatGPT, they're constructing a bio and they're effectively just bullshitting. And it got me thinking that there's this subtle difference, although a definitional one, between bullshitting and lying. One is attempting to just fake it and persuade somebody. The other overtly knows that it's not true and is purporting it to be fact when it's not.

Gary Marcus:
If I can interject for a second. As you construct it because there's no intent-

Danny Crichton:
I was bullshitting you about the difference between bullshitting and lying.

Gary Marcus:
That's right. So these systems are definitely not lying with intent. I tend to think of lying as requiring intent. You could even say that humans bullshit with intent. They're trying to fool other people. And there's no intent from these systems. It's just what they do. So they will lie about COVID and vaccines because they steal that from some human in a database. But also, they can't keep track of the connections between bits of information very well. You could think of them as a very lossy compression scheme. And they uncompress, and all kinds of crap happens. So it's not done with intent. It comes out as something you'd kind of call bullshit, but it's not trying to fool you. It's just making guesses.

Danny Crichton:
One of the questions I have: obviously one of the big goals here is to get to AGI, artificial general intelligence. And there have been these two directions. You have this very specific, focused, application-driven AI. So you talked about AlphaGo, AlphaFold, where you have these really constrained domains. You have symbolic systems. You can develop rules. And we actually have really high fidelity. AlphaFold has done really, really well based on experiments in the last couple of months. We've seen AlphaGo's performance. But then there was this block, because there are just endless numbers of domains and you have to develop AI systems for each of them. And so the whole industry moved to this statistical method, which was Markov chains, probabilities, language. None of the facts are actually built in. It has no foundation whatsoever. Everything is built on sand, essentially. But we seem to have made enormous amounts of progress. Your view, if I understand, is-

Gary Marcus:
Very big sandcastles. Very big sandcastles.

Danny Crichton:
Very big sandcastles. My question is if we try to approach AGI, it seems like we've made a lot more progress on the statistical side over the last couple of years versus the symbolic side, which was at the core of the AI revolution, I think from Dartmouth through the 1960s into the '80s, and then went through the winter. When I was doing linguistics in the 2000s, it was considered a little bit of a moribund field, like symbolic systems had reached their endpoint. We didn't know where to go next, and this was the big revolution. Do you think we go back in that direction? Is there a synthesis that combines the two together, or should we be throwing out the statistical side altogether? Isn't that, though, just because systems like Go or even protein folding are parametrically constrained? There is a correct answer, a right answer, versus the latent space of infinite possibilities that we see in art and language and creativity.

Gary Marcus:
So I mean, part of the thesis of my 2019 book with Ernie Davis, Rebooting AI, was that we only know how to build narrow AI right now. And we considered the statistical approaches. GPT wasn't popular yet. But we said these are not, in the end, going to work. And I stand by that. I think what we said then is true. There's no doubt that symbolic AI on its own failed. There are certain excuses that are worth considering. So there was a lot less data available then. There was a lot less compute then. So it's a little bit of an unfair fight to say that 2023 clusters that use up the energy budget of New Jersey for a week are crushing things that would fit on my Apple Watch with a lot of room to spare. That's not entirely a fair fight. But I still think in the end that pure symbolic AI is probably not going to work, and certainly not without better learning systems.

So another thing we know now that they didn't really know was how important learning is. There wasn't all the data to even think about it. So it could be that if we did symbolic AI with better learning algorithms on modern machines with modern amounts of data, you might actually get a lot further than people got in the 1960s. I don't think that's unreasonable to expect. And we do still use some symbolic AI. For example, turn-by-turn navigation is pure symbolic AI. It works great. We use it every day and we love it. So symbolic AI is not as dead as people think it is. And we also have systems at scale that use symbolic AI, and people forget this. Google search until recently was almost pure symbolic AI. And now it's actually hybrid AI. So Google search still has lots and lots of custom rules and symbolic algorithms and stuff like that. And it has things like... what is it called... DeepRank that are neural networks.

I don't think it's the principled synthesis that we need. I think Google search, from what I understand, has been a lot of throw things at the wall and see what sticks. And it's not a fundamental research insight, and maybe we need one, but it is actually a point in favor of the synthesis.

Josh Wolfe:
Gary and I have great riffs on these things when he's talking about startup ideas. And we had a great one years ago, where we were talking about the problem with robots, that robots, particularly the Roombas, would go into a room, and they would have the "intention" to go clean a room. But if there was dog shit there, they would go and spread that shit everywhere around like a pancake, like Nutella.

Gary Marcus:
There's actually a word for that, the poopocalypse.

Josh Wolfe:
So I proposed... Gary was thinking about how do we get robots to do all kinds of more sophisticated stuff and actually navigate the world not in this plain or two-dimensional or three-dimensional way? And so I said one key metric that would be cool would be the MTTFU, I think. Is that what I called it? Do you remember what that was?

Gary Marcus:
It was the mean time between fuck-ups. Exactly. Which still should be the number one measurement in this world.

Danny Crichton:
So much of what we've been talking about has been text-based: search retrieval, being able to generate, and the validity or veracity of what is generated. Let's talk for a moment about some of the other creative fields, particularly art and music. And in both cases, you could make the case that art feels totally unconstrained. Anything could be art. People can create surrealism, geometric designs. They can change color palettes. There are recognizable artists, but it feels like there are infinite variations. Music, obviously there are constraints, but it's the same sort of thing. You're starting to see debates, or you have been seeing debates, about whether computers are creative, their ability to actually generate stuff. Obviously it still requires human prompts, and the more sophisticated the prompts, the argument goes, the more constrained it is. But I've just been utterly in awe looking at some of the results, whether it's from Midjourney or DALL-E or Stability or whatever comes next.

And the same thing: I can see, in the same way early YouTube videos were 240p, and then they got to 480p, and then 4K, that Riffusion and some of the models for music generation start out really tinny and crappy. But it feels like it's a matter of time before I'm able to say, "Give me Life of Agony, a heavy metal band I like, playing a Bob Marley song," and it will generate that.

Gary Marcus:
I think we will get there. I mean, there was Microsoft Songsmith, I don't know if you remember it back in the day, and it tried to do this stuff. And it sounded pretty lousy in the end. There's definitely been progress in the music domain. And I think musicians in particular, music producers in particular, have a long history of saying whatever new technology is there, let's use it. I mean, the electric guitar, we've never used that before. Let's see what we can do with it. The amplifier, well, what if we blow it out, turn it up to 11? Synthesizers. In fact, harmony is a technology. People didn't always know all of what we know about Western music harmony. It was invented and people adopted it. And what I expect to happen with music tech is that people are going to use these tools.

I think in the beginning, they're not going to do a song end-to-end. When I was a kid, I had this dream of having a station that would just automatically generate music for me with no intervention. Now you don't even need that because you just automatically pick music that human musicians have made, and it's fine. But what I think we will see is people taking something like there's a new system from Google where you type in text and it produces something. I think it's just pure audio. A nicer version might give you MIDI files too. But people will take that and they'll work with it. They'll use it in the same way you use a sample. And why not? I think there's a whole question around appropriation anytime you use samples, and there's a question here where these systems are doing appropriations. There's a lot of questions about how the original artist should be reimbursed when the stuff is used. Complicated questions.

But from the perspective of, let's say, a producer who wants to make a song, why not try this out? It's sort of like a tool for brainstorming. You're probably going to end up wanting to add your own vocal track because you're not going to get the expressiveness and the emotion that you would get from a human. And so you're probably going to add to it. But why not? And I think we'll get interesting new music out of that. Although I will say, so far, most of what I've heard is accomplished but dull. It sounds like it's been professionally produced at one level, but isn't very interesting at another, because fundamentally, what these systems mostly do is predict what's plausible given something. And so that tends to keep them inside the box and not outside the box. They're not going to figure out that conceptually, you could think of a urinal as being art if you put it in a context. That takes a human, I think, to think about that. But within some parameters, as you say, they're good at exploring those spaces.

Danny Crichton:
What else have we not asked you? What are the big, controversial topics right now that you want to weigh in on, where people are buzzing about the wrong thing, convened around the hype around something, and you just want to scream at the masses being misled by the preacher? "Don't you see the emperor's got no clothes?" What are the two or three big sacred cows you want to slay?

Gary Marcus:
Well, I mean, if I could pick one, it's just that I would really like people to understand that artificial intelligence is not a universal solvent, that really, what we have are a bunch of different mechanisms. They each have their own strengths and their weaknesses. None of them are that great right now. They each work very well in certain contexts. But for example, I think everybody thinks we have chat search and we just need more data. They don't understand that the truth is just orthogonal to how it's built. People, if they're not professionals in the field, and some even if they are, don't understand that these things are not actually intelligent. They kind of assume that it would just consult the web. They don't understand that that's not built into the architecture, that architectures have properties, that when you design a bridge...

I mean, first of all, the average person doesn't presume that they know anything about building bridges. But the people who do know about the particular loads that they have to think about, and know that particular materials respond in particular ways, and that particular designs have different advantages and disadvantages. You can use this one where you can build a support, but you can't use it in this other place. And AI is like that too. There are assumptions in architectures about what they can do. And we have this problem where people are attributing intelligence to systems that are really quite dumb, and then imagining that they'll just get over their problems when there are inherent architectural limitations. And that's causing a lot of confusion.

Danny Crichton:
I will say, we didn't talk about the article, but you did make a prediction for 2023. So let's do a quick little rundown. You made a prediction for 2023. What was that prediction?

Gary Marcus:
That this would be the first year where there would be a death attributable to a large language model, which is a very dark prediction, but I think it could come to pass.

Danny Crichton:
How? Give me the scenario. What happens? And is it going to be me?

Gary Marcus:
So there are different possibilities. One is-

Danny Crichton:
It's not me, right?

Gary Marcus:
... a large language model gives-

Danny Crichton:
It's not going to be me, right?

Gary Marcus:
I can't make that prediction. I hope not.

Danny Crichton:
Oh, no. I'm terrified.

Gary Marcus:
You just steer clear of the following circumstances. If you do not wish to be the first large language model fatality, then do not take advice from a large language model. Do not eat what it tells you to eat. Do not drink what it tells you to drink. Do not commit suicide if it tells you that that's a good idea. Do not fall in love with a large language model, because it might abandon you, because it doesn't even know what love is. But somebody who doesn't heed any of that advice might take advice from a large language model, or develop a relationship and be rejected, feel rejected. And in that way, they might do themselves in, either inadvertently or deliberately. And so watch out.

Danny Crichton:
It's terrifying and statistically probable. I mean, if there's a 1 in 10 million chance of some idiot following an AI's advice and there's 10 million idiots, one of them is going to do it.

Gary Marcus:
That's right. I mean, I have no quest... Well, I'll say that I have very little question that what I said is true. My only question is whether we'll know it. Somebody might commit suicide and not leave a note, or they might drink Drano and we don't know what led them to drink the Drano. But we already have, for example, Alexa telling a child to stick a penny in an electrical socket, and it happened that that child's mother was there and said, "Don't do that." But we've already had what they call near misses, which is a stupid phrase because it's really a near hit. We've had near-death experiences from these things already. It is just a matter of time. So I thought it was actually a pretty safe prediction. I actually made the prediction before ChatGPT came out, and the guardrails maybe changed the dynamic a little bit. But also, a lot more people are using these things.

And part of my premise was that a lot more people are going to use these systems. They're probabilistic. They're not reliable. And so they are going to give some bad advice. Their cousins have killed people in driverless cars. And now the text versions probably will do the same, and it's a question of who and when and how many and so forth. So this was a more valuable podcast than I realized, because it's good advice: don't listen to your chatbots. They might kill you.

Josh Wolfe:
Chris and Danny, this was amazing, the ability to generate a fake Gary Marcus through this entire thing, a Gary Marcus bot. What do you think the real Gary Marcus is going to think of this? I have no idea. I do know-

Gary Marcus:
He's going to think you made him look pretty good. You should edit the coughs in quotes. But otherwise, you made him look pretty good.

Danny Crichton:
I thought that was an authentic touch.

Gary Marcus:
He's mostly satisfied-

Danny Crichton:
I thought that was an authentic touch.

Gary Marcus:
... or so I'm told.

Danny Crichton:
It's an algorithm. We added coughs so that people would think that that Gary Marcus bot was [inaudible 00:23:28] Marcus.

Gary Marcus:
The authenticity. The authenticity.

Danny Crichton:
Well, Gary, thanks so much for joining us.

Gary Marcus:
Gary or Simulated Gary thanks you very much. See you again soon.

Danny Crichton:
See you soon.