Another week, another media tempest in a shrinking teacup. This time, the internet’s ire centered on Perplexity AI, a startup that offers a layer on top of LLMs that can answer real-time questions about current events. The company got into hot water after it summarized a paywalled Forbes article on Eric Schmidt and his investments in drones with minimal citations. Was this simply fair use summarization of an enterprising investigative article, or something more nefarious and damaging?
We brought a troika of journalists (and former journalists) to talk about the controversy and its implications. First up, Reed Albergotti is technology editor at Semafor and a long-time journalist across The Washington Post, The Information, The Wall Street Journal and elsewhere. Second, Eric Newcomer departed Bloomberg after a distinguished reporting career to start Newcomer, a tech newsletter that’s now complemented by the prominent Cerebral Valley AI conference coming this week in NYC. Finally, host Danny Crichton was formerly managing editor at TechCrunch.
We talk about the norms of journalism and creative work, the economic disruption of creativity by AI, how journalists should adapt to the coming automated world, how legislation might protect these industries and whether the regulatory approach fits the world’s needs, and finally, the limits of knowledge and how much AI still doesn’t know.
Produced by Christopher Gates
Music by George Ko
Transcript
Danny Crichton:
We were talking about Perplexity and why we were so perplexed by their new articles product. Reed, you had a piece in Semafor. Eric and you got into a little bit of a kerfuffle. It felt great, and I know Chris reached out.
Eric Newcomer:
And I've already been vindicated.
Danny Crichton:
Since then, I think there's been at least 20 iterations of the story.
Eric Newcomer:
Certain [inaudible 00:00:20] have emerged in my favor.
Reed Albergotti:
The pitchforks were warranted, it turns out. Sometimes the pitchforks are... Sometimes the mob is right.
Danny Crichton:
So let's reset the scale. So when we first reached out a week ago, the story was that Forbes had broke a piece of news, and I'm going to pretend like I know exactly what news that was broken, but it was from another major story and they had the story exclusively. And then what people noticed is that-
Eric Newcomer:
It was about Eric Schmidt and drones, right?
Danny Crichton:
Eric Schmidt and drones, which was a follow-up to a previous story. And so it was part of a coverage of some of the work that he was doing. And Perplexity basically posted an article, on something they're calling Perplexity articles, which was basically the exact same story with a very light works cited at the bottom that no one could really see. It was a summarization of basically this exclusive article, and of course it's behind a paywall. I think that's an important piece of it, is that this story is not really visible unless you sort of pay for it or you get the one free article through the meter. And so there was a little bit of a scandal, a little bit of a shock, that suddenly this article was with Perplexity, which does not pay these authors who did all this enterprising journalism. And that's been kind of a theme for the last couple of years, and that sort of triggered an avalanche of criticism. Now, Reed, you sort of took that and wrote a big piece in Semafor, so maybe summarize what you were thinking about.
Reed Albergotti:
Yeah, I mean, I saw the mob and I thought, I just wasn't quite sure what the exact issue was. Because I think a lot of people are mixing a bunch of issues together. And so I wanted to take a step back and think about it. So what exactly is the issue? Is it that we think summarizing articles and writing a summary of an article and linking back to that article is wrong? Is that the issue? And if so, okay, that's a thing we should debate. We should talk about that because that's been happening on the internet for a long time. People have sort of embraced that as a way to generate traffic. Or is the issue that summarizing an article, someone else's article and linking back to it is okay, but we don't want an AI model to do it? I think that's an interesting debate that we should have. Do we not want AI to be doing this stuff that humans do?
And I think that was all just being... These are questions that I think need to be answered in this new age of foundation models and large language models, and we should have that discussion. But I think what was happening was kind of an emotional response by a lot of people in the media that were picking on one company, Perplexity, and specifically its founder, this young founder, and making a bunch of assumptions about his motives and all this stuff without really, I think, doing the necessary reporting there. And I just thought, this is the kind of thing... I think I've been looking for maybe an opportunity to weigh in on some of what I think is wrong with tech coverage today, and I thought this was kind of an opportunity to do that. And I knew it would make me unpopular with some of my-
Eric Newcomer:
Shame. Shame.
Reed Albergotti:
I knew I'd be unpopular with some of my good friends like Eric, who I love and respect.
Danny Crichton:
Yeah. So Reed, you posted your piece. And then Eric, you took to the airwaves, AKA Twitter or X, and you had a very different view from where Reed was sort of placing his stake in the ground.
Eric Newcomer:
Yeah, I mean, I called his story nihilism and said, "Media companies steal too much reporting, so let's shrug our shoulders with AI." I mean, basically I thought Reed was letting AI companies, I love you Reed, off the hook. To zoom out here for a second, I think what's happening is Perplexity rips off Forbes' story and a bunch of reporters go boo on Twitter about it, and Reed's like, "Oh, isn't that bias? Reporters shouldn't just be against tech." And my point fundamentally was reporters boo at each other all the time when they rip off each other's stories.
Danny Crichton:
True.
Eric Newcomer:
The shame, which I'm supporting, and the norm defending is the main way that we deal with plagiarism and ripoffs. So AI is just, welcome to the fold, welcome to the arena. This is normal behavior, just enforcing non-economic norms about how much of an article you can steal.
And I think what's really worrying to reporters is that these AI companies aren't subject to the same types of shame and same norms that we have been. And so the great threat is sort of this, we're only thinking economically and legally and not what norms have been established about sharing content online. And so yeah, I loudly and proudly am happy to boo Perplexity for ripping off Forbes' great work. And the last thing I'll say is just that Business Insider, a publication I love, is famous for ripping off stuff, and they get booed. But I do think a key... Business Insider has developed norms about it. The more egregiously a story is a total ripoff of somebody else, they're linking to it all over the place. Perplexity is not meeting even the worst of the worst, the Business Insider standard, for a story ripoff. So people say, and Reed says, "Oh, the media does this," but I'm saying Perplexity out of the gate is behaving worse than the most shameless of aggregators.
Reed Albergotti:
So are you saying, Eric, that you think no one should do a summary of an article and write a blog post about it and link back to that article, that that's a ripoff and it's stealing?
Eric Newcomer:
Inevitably, people are going to aggregate. I'm not saying it's stealing. I'm not making any sort of legal claim. I'm arguing about what norms we should have at Forbes.
Reed Albergotti:
Well, morally stealing. I mean, I think a lot of people have called it stealing, and I don't know if you did, but is it... You've used the term ripping off. Okay, so is that-
Eric Newcomer:
Right. Well, I do think in the view of media on media enforcement of these norms, you get more credit as an outlet if you break stories and are contributing to the discourse. Like The Information, where we both worked, they do an aggregation, they have their sort of round up, but it is a publication that's fundamentally breaking news. And they also make it clear where they got their aggregation from. They're giving credit to reporters who scoop stuff. They're engaged in the idea of it matters who got something first. And I think what was so egregious about the Perplexity situation is that Forbes had a unique story, a story that's actually hard for other publications to deal with. If you're the New York Times and not Business Insider, you need to actually match the facts for the most part if you're going to re-report it. If you're Business Insider, you're willing to run a version of it, but you're going to heavily attribute most paragraphs very high up. It's going to be clear that everything you're saying is based on the story.
Perplexity just ignores all those norms and says, "Oh, this is a novel story and we're going to run it like it's authoritative fact," when really it's one outlet's story that was basically backed up by the fact that other people re-aggregated it. And so the way that it treated the story was so out of the norm of how media would treat other types of content. And I think those norms are good, and I would support stronger norms about limits on aggregating and sharing, especially content that's behind a paywall.
Reed Albergotti:
I mean, I think there's definitely... And I said in my piece that I thought the way that Perplexity structured that article was lacking in a lot of ways. It didn't give enough credit. It didn't wink enough or what... It should have structured it differently. I wrote my piece a couple of days after the Forbes article, and already, Perplexity had completely changed the way that they structure those articles in response to that by putting the citations much higher and clearly labeling, like, this came from Forbes, and linking to some other websites that had followed the news. I think that actually shows that these norms are sort of being created as we go, and it's sort of-
Eric Newcomer:
Where they're learning them in public, this is why... I mean, tech companies get to sort of just fuck it up and say we're new... I mean, we both like startups, but it is sort of the classic like, oh, we just get to stumble into an industry we don't understand, break all the norms and then get sensitive when people are overly angry at us. It's like, yeah, maybe you should understand how people think about this if you don't want the negative media hits. I mean, it's the classic: get in fights with people who buy ink by the barrel at your own peril. So yeah, they're stumbling into an industry with a loud-
Reed Albergotti:
Yeah, sure. But I also think that there's such a thing as trying to be an impartial and fair journalist when it comes to writing about this stuff. I mean, if you want to go on Twitter and you want to talk a bunch of shit, that's fine.
Eric Newcomer:
That's all I did. I have not written about this. I think we aggregated. Yeah.
Reed Albergotti:
You can go do that. That's totally fine. I just think the stories that Forbes was writing were so calling into question this guy's character. They were saying that he lacks human values and stuff like that as a person. I'm like, geez, guys, you haven't even given him a chance to try to change things, and they actually did. They had their engineers work overnight and try to fix this problem immediately, which I think shows some goodwill, right?
Eric Newcomer:
I have a core problem here. I do think there's a core issue, and it's that AI, which I literally host an AI conference for context, AI is all about aggregation and pulling together lots of disparate threads. And I think a key insight that reporters have, and Reed, you must agree as someone who's written some of the best novel reporting, sometimes stories are singular. I feel like tech has this idea that, oh, we just pull together a bunch of stories and we'll have the best one. And it's no, often it's the case that there's one great story, and so there's no aggregation. The aggregation between different sources is totally a facade to cover up the fact that you wish you could just republish the actual story that has the goods.
Reed Albergotti:
Yeah, yeah, yeah. I know. I mean, totally. But again, then it gets back to... There's a lot of stuff that I think is so wrong about the way the media does this to each other, as you said earlier.
Eric Newcomer:
You're like, I've been burned so bad, I just gave up. They rip me off so much that whatever you guys rip off, now AI's ripping off. It's all a loss.
Reed Albergotti:
No, look, I think it's trying to really pinpoint the issue, Eric. Is the issue ripping off things on the internet? Because if that's the issue, then let's all get together and let's have a conversation about ripping things off on the internet. Or is the issue, hey, we're cool ripping things off, as long as it's a human ripping it off? And then it's an issue around, well, I don't know. Do we want the aggregators to have jobs, or something? Is that what we're arguing about? I just want to know what we're arguing about really at this point.
Danny Crichton:
I think this is what's interesting is the language we're using, right? We're talking about theft, stealing, but then the response to this is boos and norms, "Shame on you" to an AI bot that's just going around the internet hoovering up everything on the internet. I think the question is, what is the acceptable rule? And legislators have been trying to deal with this over the last couple of years as they've been doing link taxes. So Google has faced news laws in Canada, Australia, EU, California, with the idea that if you do try to aggregate news, you have to pay some sort of tax. That's not just referral traffic to those sites, but an actual currency transaction to those companies, which has obviously been fought massively back. And that's just for a link and a snippet of text. So that is Google News shows a headline and 12 words of a story, and legislators are trying to push that. This is way beyond the line.
Eric Newcomer:
Right? Because in classic legislation, they're fighting the last battle. The glory days were Google linked to you and you got traffic. That is now gone.
Reed Albergotti:
Wait, wait, wait. The glory days?
Eric Newcomer:
What? Yes. Because now what happens... I mean, we're arguing about Perplexity, but really the villain is-
Reed Albergotti:
You need to go read All the President's Men or something if you think those are the glory days.
Eric Newcomer:
No, dude. You know what I'm saying. That, relative to today, was much better for media. At least when they were linking versus trying to just deliver the full answer on page, not send you anywhere, that was... Now what we're facing where they're just totally rewriting it, that's a real danger because the media is sort of cut out altogether.
Danny Crichton:
Well, I take this in a different direction. I think one of the interesting things... We've been very focused on journalism, and obviously we've all been in there, me less so these days. Used to, still in the fray. Good luck to both of you. But you think about knowledge work, I think more broadly of, look, there's a lot of categories of knowledge work that went away. People used to compile phone books. It was really valuable. You couldn't connect to anyone, and then we had the internet and we didn't need this anymore. You look at travel fares. You used to have whole booklets on how to find the right flights and all this hacking. I can keep on going and going and going. And so my perspective, positive view would be there's a ton of work that, yes, will get automated that may require displacement. So this I think is in line with how Eric is sort of seeing this. You move on from a particular location.
So maybe in local journalism, for instance, if every city council meeting and every courtroom has a video camera, you actually do have the ability to annotate. You do have the ability to actually pull on all this stuff. And so instead of having no eyes, there is actually an eye. In the same way that we're seeing this in our own meetings in the corporate world, we could actually aggregate what took place. You get, hey, there were eight major court cases yesterday and two of them resulted in guilty verdicts, and that was unusual, and no one even had to be involved. The question is, are we creating a media economy that drives more and more towards opinion? As the unique differentiators, the unique... It has a voice, therefore, I can be a human being. I can have my place in the universe. Because otherwise, you're doing something that an AI can do, and therefore you're fundamentally competing against an LLM that fundamentally will always beat you because it does not sleep in and you do.
Eric Newcomer:
I just don't think we're that close to all the information being out there. I mean, one of my major worldviews is just that people should be shocked how much they don't know. I feel like there's so many well-kept secrets that I don't feel like we're close at all to just like, oh yeah, the machines have put everything online, anything you want to know. I mean, there's obviously big stuff, like what does Mark Zuckerberg's life look like, who's his day-to-day inner circle, tons of things that inform the decisions he makes and how his company is going to perform. But also just go to Mexico City and what are the top 20 restaurants that people in my demographic world are going to like, that's a very hard question to get a factual answer to. We're not close to the machines or anyone delivering all the information that people would want to know.
Danny Crichton:
Let me ask you, where's the... Reed, you used the term discretion. And then Eric, earlier in this conversation, you talked about these singular stories, that there's a definitive piece, the one that brought it all together, the holistic one, the Pulitzer Prize winning potential. And I think that this is one of the things that is so hard to grasp within the context of AI, particularly when we talk about creative work, which is that discernment, that eye for judgment, discretion is your word... When I think of what an editor's job is in the context of media, it's not just like, look, we're just going to cover all the stuff that's happening in the world. There's a million things going on in the world. Okay, there's eight million people in New York City. We could follow every single one of them, and people do. Humans of New York famously follow hundreds of people across the city to look at their stories, and you suddenly realize, wow, there's so much diversity, so many things going on.
But that's not what you do as an editor of a major paper. What you're trying to do is select for your audience's time. This audience has 20, 30 minutes, and what do I want to deliver them that's the most valuable thing that they could learn today, which may not even be the top story of this particular time. So The Atlantic just came in. I still get things on paper over here, and George Packer has a cover story that I think is 65 pages of the magazine. It has a read time of two and a half hours because the editor, Jeffrey Goldberg, was like, you are an amazing writer, have won a Pulitzer. You found this amazing subject about climate in Arizona. Let's just give you this massive canvas, and it's worth our audience half the magazine for this month. That is what we're going to deliver is just this one story.
And I feel like that's in tension with the idea of AI, which is, well, you can ask a question about everything. Everything sort of has the same voice, because we're not giving any prioritization. We're not giving any sense of drawing the reader's attention to something, which is part of the, I think, responsibility of both the journalists and editors.
Reed Albergotti:
I wonder if you've tried to summarize that, right? A, that's probably a good service because how many people are actually going to read the whole thing? I mean, it's probably great. I'm just saying, who has that time? But I mean, I think... Would it really take away from that piece? Are there people who are like, well, I would've read that whole piece, but now I can read a summary of it, so I'm not going to? I feel like those are just two different audiences.
Danny Crichton:
I mean, look, is it the best way to present a fact? I can learn about Phoenix water supply issues in three charts and a graph in an Axios story, Smart Brevity. I mean, I'm not reading George Packer for two and a half hours because-
Eric Newcomer:
There are some things you wouldn't believe unless you saw them show them-
Danny Crichton:
I mean, I guess that's a question of persuasion, right?
Reed Albergotti:
Totally.
Danny Crichton:
Where's the point of persuasion? Like-
Eric Newcomer:
It's like, yeah, you could give the main takeaway and the most important fact out of the article, but you're not going to give the emotional resonance of the article to a human being. And so that ties into the idea that stuff with humanity is going to be the most resilient stuff.
Reed Albergotti:
When I was at The Information, we were behind a paywall. This was like 2015 to 2019 time period. And at that time, everybody was like, "No way. Paywalls are terrible." And you'd write a story, and then some tech blog would aggregate it, and they would literally be like, "Warning, paywall" in the link, as in like, don't click on this.
Eric Newcomer:
I know. What bastards.
Reed Albergotti:
Don't click on this. It's a paywall. And so-
Eric Newcomer:
You might get a disease.
Reed Albergotti:
So I would write these pieces. I'd write some of these long articles, and they would have multiple scoops within the article. And we had this debate about, well, should we just publish the scoop, just get it out there? And I was like, no, we're behind a paywall. We have to force people to click through and read this article. So I would save scoops and try to... My goal would be to have five scoops in every article I wrote, and then we would aggregate our own article. So we'd write a little bullet point like, what would Business Insider... We literally had this conversation with Jessica. It's like, what would be the Business Insider aggregation of this article? And then we would just write that. And it was a lot of fun. You just figure out how to game the system, and it worked.
Eric Newcomer:
We should talk about just, maybe this is on the agenda, but the fact that it turns out that Perplexity seems to be going around paywalls, or at least Wired has reported that it's not just Perplexity seems to be some sort of naive, innocent. It's that-
Reed Albergotti:
Well, you're talking about the Robots.txt story that Wired did, right?
Eric Newcomer:
Yeah. Right.
Reed Albergotti:
So not that they went around paywalls, but that there's this file that websites have called Robots.txt, and it's kind of meant for indexing, and it's like, either don't index this page or do index this page, but now it's become this sort of like, well, Google's allowed to index this, but OpenAI is not, and Perplexity is not. What Wired said was, Perplexity, the main Perplexity bot is not indexing their site, but that they had a secret IP address that they believe is owned by Perplexity, and that that's indexing it in some way. I think there's a real question there about what really happened, and I think someone should get to the bottom of it, but I mean, did you see the story that just came out in Reuters? Reuters just did a story saying that, well, everyone else is doing this too. Literally, all the AI companies are allegedly getting around these things. So the Reuters piece made me think, well, so was the Wired piece sort of unfairly singling out Perplexity in this broader land?
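For readers unfamiliar with the file being discussed, here is a minimal sketch of how per-crawler robots.txt rules work, using the `urllib.robotparser` module from Python's standard library. The rules, the `example.com` URL, and the `allowed` helper are hypothetical and purely illustrative; they are not taken from any publisher's actual robots.txt, and the crawler names are used only as sample user-agent strings.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: one search crawler is allowed,
# two AI crawlers are asked to stay out of the whole site.
ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: PerplexityBot
Disallow: /

User-agent: GPTBot
Disallow: /
"""


def allowed(agent: str, url: str) -> bool:
    """Return True if the rules above permit `agent` to fetch `url`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"  # placeholder URL
    for agent in ("Googlebot", "PerplexityBot", "GPTBot", "SomeOtherBot"):
        print(agent, allowed(agent, url))
```

Two things stand out in this sketch. A crawler that is not named in the file, and that has no wildcard rule applied to it, is permitted by default. And the check only reports what the file requests: it runs inside a crawler that chooses to consult it, so nothing in the file itself prevents a bot from fetching the page anyway.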
Eric Newcomer:
You love Perplexity. You're like, boy, do I love Perplexity.
Reed Albergotti:
Yeah, I guess it sounds like it, but I do think it's an example of... It's like if you just wait and take a step back, it's like there's a bigger picture here. I really think that. I mean, I'm not trying to be like, let me defend this one company. I'm kind of using it as a... I guess I'm using it as a way to get into, I think, this larger issue.
Eric Newcomer:
Yeah.
Danny Crichton:
Well, I read the Wired article and I was like, there's a secret IP address that they didn't announce. And to me, it wasn't necessarily nefarious for such a young company. I was just like, some engineer turned on another thing and didn't tell anyone, and no one even knew what was happening. And so there's just that lack of policy infrastructure that you get when you have a hundred thousand employees at Google or Facebook or whatever the case may be. But I think the larger question is really around paywalls, which is there's a consent around Robots.txt. I say, Hey, if you're a search engine, do not scan my results. You cannot summarize, do not read, and they're ignoring that. And so there's an implied implicit contract, and going back to norms from the earlier part of our conversation, there's a norm on the internet that you have to follow Robots.txt. I've told you not to read my stuff, you read my stuff, and that is sort of violating a norm. And I think any company should be sort of targeted over that.
But I do think there's this larger question of paywalls. I used to run a paywall. I remember valiantly trying to get people to write behind it when we were working at TechCrunch on a site that is mostly not in front of the paywall. Extra Crunch led for two years only to see it murdered earlier this year. Now there is no paywall on TechCrunch. And so I think it's interesting because TechCrunch is the first media site that I have seen that has gone full paywall and is now going the complete opposite direction. In the context of the artificial intelligence world that we're seeing, do you think that that's the future? When you think about media economics... Eric, you are a paywall or half paywall, kind of metered-ish, but there's stuff behind it. Semafor, talking about paywalls seems to have always been part of the story, but I don't think there's anything we officially cut off.
Reed Albergotti:
We don't have any paywalls yet. It's all free, so you should subscribe to my newsletter because it's free.
Danny Crichton:
Yeah. Still, yeah, because it's been there. Yeah.
Reed Albergotti:
But yeah.
Danny Crichton:
Far cheaper than Eric's, yes.
Eric Newcomer:
There was a period where paywalls were going to save journalism, and I think people have certainly moved on from that. I mean, a paywall is, after events, my biggest revenue stream and is important to the business. I charge the richest people in the world to read. So I feel like I'm unique. And I mean, I do think customers just hate paywalls. I can tell, when I charge sponsors or ask people to pay for tickets, people know what they're getting. I feel like media really hurt itself by training everyone to get information for free. People resent them. They're still important. I think niche paywalls where you really know what you're getting make the most sense.
Reed Albergotti:
I'm with you on that. I think you do high quality stuff and you charge the right price for your audience, and you don't have programmatic advertising. You have some advertising, but it's not programmatic and it's not based on a per click payment system and you figure out how to make the economics work that way. That's the purest form of journalism. I love that method. I think what you're seeing out there is the soft paywall, which is really having your cake and eating it too. Well, we're going to be behind a paywall, but we're also going to make it free and we'll get a distribution, but also we'll collect some ad revenue that way. And I think that's sort of confusing to customers.
I think it also just creates situations like this Perplexity one, where it's like, yeah well, our Robots.txt thing, well we want to be scraped by Google and we want to be scraped by Bing, but we don't want to be scraped by this company. And everyone just has to follow this thing. Meanwhile, there's like 8,000 different bots that don't give a crap what your Robots.txt thing says, and they're scraping everything anyway and you know that. That's a messy economic system, I think.
Danny Crichton:
Where do the rules come from? Do they come from the industry? Do they come from government? Do they come from norms, courts? Ultimately, when we talk about this kind of mess, I mean, it's been a mess for decades. So at some point, you sort of feel like, God, there should be some rules of the road. Here's what you can do, here's what you can't do, and there's some blurriness maybe, but it feels like it only gets more complicated. I'm so perplexed. To use the name, I'm freaking perplexed. And it's 2024 and I still don't know what the rules of the road are compared to what they were before.
Reed Albergotti:
No, the rules are, it's not illegal to scrape anything on the web. I mean, this went all the way to the Supreme Court. It's like the LinkedIn case. So I mean, you can scrape whatever you want. So this is totally an industry led voluntary thing around... It's like being a good citizen on the internet, and we all know everybody's a good citizen on the internet, right?
Eric Newcomer:
Well, I think this is honestly where Reed and I most viscerally disagreed online, in that I'm very supportive of heavy norms. I feel like we need more norms in society, more booing. It's good to enforce things by moralizing. I feel like people are like, oh, you can do whatever you're allowed. It's like, absolutely not. That's insane. So I mean, there's a world in which I wish there was regulation, but then regulations always favor the incumbents. It's always a disaster. The government is so disconnected from what we actually do. I'd much prefer a few boos or at least try that strategy before regulation.
Reed Albergotti:
Yeah, no, that's totally true. I just think two things. One is if you have a truly bad actor that's not willing to change, then yeah, get out the pitchforks. That's fine. And the other point is, I made this earlier. If you're writing an article about somebody, just be fair. The Wired article we talked about, the headline was something like "Perplexity's Bullshit" or something like that. It's like, well, are you saying the product...
Danny Crichton:
It was specifically, "Perplexity Is a Bullshit Machine."
Reed Albergotti:
It's not bullshit. It's like-
Danny Crichton:
It's a machine.
Eric Newcomer:
But bullshit is a term of art in terms of information.
Reed Albergotti:
Sure. But dude, you pick a topic. Is it that they're being a bad citizen? Is it that the product sucks? I think they even said in there that they're not really an AI company. Is it like you don't respect their technology? There's three different articles there. Pick one or write three different articles. I'm totally confused what your point is here. I use it. I think it's a good product. It sometimes hallucinates, but whatever.
Danny Crichton:
Well, maybe on that note, Reed, Eric, thank you so much for joining us.
Reed Albergotti:
Oh, I got the last word. Thanks.
Eric Newcomer:
Thanks for having us.
Danny Crichton:
All right.
Reed Albergotti:
It was fun.