AI-generated text has the potential to sap the very lifeblood of the internet’s ad-dominated ecosystem. Just how much is the main question, but a new report says it’s a growing issue that major ad companies like Google have yet to fully grapple with.
NewsGuard, a company that sells online accuracy tools, released a new report on Sunday showing there are dozens of content farms automatically generating thousands of pieces of content every day, most of which is created by AI. The sites themselves require little or no human oversight. One identified site, World-Today-News.com, produced more than 1,200 articles a day and close to 8,600 articles in a single week in mid-June.
Other so-called made-for-advertising sites were trading in potentially harmful content. The junk site MedicalOutline.com, for example, published clickbait headlines like “Can lemon cure skin allergy?” That article starts off with the line “As an AI language model, I do not have the ability to provide medical advice.” Dozens of other junk news pages published under the name “Athul” promote dubious herbal remedies. A Google AdSense widget on the side of that page was still displaying ads for various health products as of Monday afternoon.
These sites seem built exclusively to abuse programmatic advertising, the automated systems that place ads on pages. This growing spam-a-thon is a major problem for Google, as the vast majority of these ads (90%) were being served through Google Ads.
In an email statement to Gizmodo, Google spokesperson Michael Aciman said: “We have strict policies that govern the type of content that can monetize on our platform. For example, we don’t allow ads to run alongside harmful content, spammy or low-value content, or content that’s been solely copied from other sites. When enforcing these policies, we focus on the quality of the content rather than how it was created, and we block or remove ads from serving if we detect violations.”
Aciman also said that Google removed ads from the websites found violating the company’s anti-spam policies. While he didn’t specifically name the sites, the spokesperson added that in some cases, Google removed ads from specific violating pages.
How Has Google Responded to Made-For-Advertising AI Spam?
Neither Google’s automated systems nor its policies have been able to keep up with the wave of new sites built specifically to suck in ad revenue with knockoff AI content. MIT Technology Review, which first covered the NewsGuard report, said Google claimed it had removed ads from sites like Medical Outline, but when Gizmodo checked Monday afternoon, the ad widget was still spilling out ads on the site, even on the page that was explicitly AI-generated.
NewsGuard said it had identified 217 of these junk news sites operating in 13 different languages. The report’s researchers noted they have identified more than two dozen such sites every week since the company started tracking the issue in May. The company then found close to 400 instances of ads from 141 major brands that appeared on 55 of those junk news sites. These ads appeared while browsing in the U.S., Germany, France, and Italy.
ChatGPT and other AI chatbots are already helping people produce thousands of fake articles online. That does nothing to stem the tide of online misinformation, and this latest research confirms that the other side of how the internet works, the people-as-product advertising model, is inevitably being affected by the deluge of AI-generated online content.
Not all the content on these sites is AI-generated, and some of the junk is copied wholesale from other parts of the web. While systems used to detect whether text is AI-generated are notoriously unreliable, NewsGuard flagged articles containing text like “As an AI language model, I don’t have the ability to taste food” or other artifacts often produced by popular chatbots like ChatGPT. These articles might contain more than a dozen Google AdSense widgets, some of which are broken on the page while others display ads for major banks, mattress sellers, and other e-commerce companies.
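A phrase check of this kind can be sketched in a few lines of Python. This is only an illustration of the general idea, not NewsGuard’s actual method, which the report coverage does not describe in detail. The function name and the pattern list are hypothetical; the patterns are drawn from the two example sentences quoted in this article.

```python
import re

# Hypothetical pattern list, based only on the example sentences quoted in
# the article. A real checker would need a much longer, curated list.
ARTIFACT_PATTERNS = [
    r"as an ai language model",
    r"i do not have the ability to",
    r"i don't have the ability to",
]

_COMPILED = [re.compile(p, re.IGNORECASE) for p in ARTIFACT_PATTERNS]


def find_artifacts(text):
    """Return the hypothetical artifact patterns that appear in the text."""
    # Normalize curly apostrophes so "don’t" and "don't" match the same pattern.
    normalized = text.replace("\u2019", "'")
    return [c.pattern for c in _COMPILED if c.search(normalized)]


if __name__ == "__main__":
    sample = "As an AI language model, I don’t have the ability to taste food"
    print(find_artifacts(sample))
```

A match only shows that a leftover chatbot phrase made it onto the page. It says nothing about pages where such phrases were edited out, so it cannot serve as a general AI-text detector.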
The brands likely have no idea that their ads are being showcased on spam sites. These companies bid on spots to run their ads, and then those ads get served to sites for the sake of generating clicks and eyeballs from consumers. The labyrinthine nature of the programmatic ad process means companies don’t really know what sites are being served their ads. For that, they’re supposed to rely on Google and other companies’ automatic curation systems.
Google’s own policies state that ads are supposed to be restricted from “Spammy automatically-generated content.” While the page does not explicitly reference AI, its definition of spam includes “text generated through automated processes without regard for quality or user experience,” as well as sites that don’t produce anything original or add value. Still, Aciman said that AI-generated content on a page isn’t an “inherent violation” of the company’s policies.
“These are pre-existing policies that apply to all content, regardless of whether it’s generated with AI,” the spokesperson said regarding its current anti-spam guidelines.
Is AI-Generated Content a Big Problem for Legit Websites?
It’s hard to gauge how much impact this has on the overall health of the web. Companies like Google and Meta have kept the systems for their massive advertising empires opaque. When working within Google’s closed-door ad system, companies and campaigns simply do not know exactly where their ads may appear.
Web ads are priced by cost per 1,000 impressions, or CPM. As MIT Technology Review noted, the average monthly CPM actually fell in early 2023 to about $1.21, according to a February report from Digiday based on industry benchmarks.
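To make the CPM arithmetic concrete, here is a minimal Python sketch. It assumes a flat $1.21 CPM (the only figure taken from this article) and ignores how that money is split between the ad platform and the site, which the article does not cover. The function names are hypothetical.

```python
# The $1.21 CPM comes from the Digiday figure cited above; everything else
# here (names, the flat-rate simplification) is an assumption for illustration.
CPM_USD = 1.21


def gross_ad_spend(impressions, cpm_usd=CPM_USD):
    """Advertiser spend in dollars for a given number of ad impressions."""
    return impressions / 1000 * cpm_usd


def impressions_needed(target_usd, cpm_usd=CPM_USD):
    """Ad impressions required to reach a given dollar amount at this CPM."""
    return target_usd / cpm_usd * 1000


if __name__ == "__main__":
    print(round(gross_ad_spend(1_000_000), 2))  # spend for one million impressions
    print(round(impressions_needed(100.0)))     # impressions behind $100 of spend
```

At this rate, one million impressions corresponds to roughly $1,210 in advertiser spend, which is why volume matters so much more than any single page to a site publishing thousands of articles a week.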
A junk news site might not attract many clicks, if any, for each article, but in today’s SEO-driven environment, it may take just one clickbait article to drive mountains of engagement. And even if every article generates only a few dollars, that’s more than the cost of generating the text in the first place. That’s potential advertising money not being sent to legitimate sites which depend on ad revenue to stay afloat. If Google and other ad providers don’t stay on top of these spam sites, the situation could get much worse.
Now, this isn’t a massive problem for any one party, but it’s a compounding issue for the general health of the internet. The more sites are financially incentivized to flood the web with fake, AI-generated content, the more likely users are to stumble onto false information. It remains a problem for both Google (plus competing ad providers) and the brands themselves, as neither side wants to promote spam, especially when these sites could be generating relatively few clicks for each piece of digital content.