Online life in the spam tsunami

Opinion, 12/04/2023

Sam Peña-Taylor, WARC

AI promises ever more content, much of which, on current evidence, will be unreliable. Sam Peña-Taylor considers what that might mean for advertising.

What makes the internet worth using? There’s the entertainment internet, the endless scroll that takes up probably too much of our time but mixes up the stuff that’s kind of interesting to us and kind of fun. But then there’s the useful internet: the internet of detailed questions with detailed, empathetic answers, and of technical discussions between practitioners or experts. What happens when this is overwhelmed? As AIs gain in popularity and power, there’s a wider threat of unlimited rubbish overrunning the internet.

[Image: Dall.e.]

The internet isn’t just made up of cat videos; a lot of users are still looking for information. Witness the sheer vastness of search advertising, which accounts for around 27% of all advertising spend worldwide, the largest share of any medium. It shows how far the internet and its economy rely on seekers of quality information, and ultimately on selling against that activity. This is an issue with deep implications for advertising.

Forums that open up access to well-intentioned and generous expert contributors, such as Reddit’s r/AskHistorians, realised they were at the sharp end of the spam tsunami when moderators found last December (as ChatGPT was exploding into the mainstream) that more and more AI-generated spam was appearing on the forum.

“They’re not in-depth, they’re not comprehensive, and they often contain false information,” Sarah Gilbert, a post-doc at Cornell University and a forum mod, told Vice. Soon, the tally rose to between five and ten AI-generated posts per day. Moderators suspected some of these users of wanting to “test” the gatekeepers’ rigour.

It’s not the only forum to experience the misuse of AI. Stack Overflow, a forum for developers to ask and answer technical questions, had to ban ChatGPT-generated submissions in December too.

“[B]ecause the average rate of getting correct answers from ChatGPT is too low,” moderators explained in a statement, “the posting of answers created by ChatGPT is substantially harmful to the site and to users who are asking and looking for correct answers.” (Emphasis is theirs.) Users suspected of infringing the policy were threatened with immediate suspensions of up to 30 days.

At their worst, the floods of content are part of a coordinated effort from one or many actors. The trouble here is not so much how difficult these answers are to spot – for the time being – but that their sheer quantity can overwhelm moderators.

Forums like Reddit and Stack Overflow are only useful insofar as a community of contributors and moderators can maintain a good level of quality, meaning the user eventually loses out if that quality diminishes. But the danger is more immediate where generative AI users might start engaging in reward-farming (Karma-farming on Reddit).

Services that have operated a kind of help economy now appear under threat. GiffGaff, the UK virtual mobile network operator, maintains competitively low prices by operating a minimal customer service staff, instead delegating helpdesk services to helpful individual contributors who can accrue rewards like credit or even actual cash.

The opportunities for malfeasance are significant if a user with a chatbot can suddenly crank out thousands of responses and happens on enough helpful answers to start collecting lots of money from the system they have subverted. This is to say nothing of the AI-generated scams aimed at individuals.

While idiots and crooks are unlikely to disappear anytime soon, the issue speaks to a wider concern about our burgeoning yet complex relationship with the technology. As these stories suggest, whether or not you as an individual are using AI tools, you are likely to be exposed to them and we will all require literacy in what they are and, ideally, how to spot them.

In education, some teachers and professors are starting to accept language model AIs as tools that can teach us a lot about generative AI’s limitations and about the subjects we’re asking them to help us out with. In short, “hallucinations”, or instances in which the language model just makes stuff up, are useful from a learning perspective. They can reveal what the models don’t know and show us what we know or need to know.

Governments around the world are beginning to make noises about regulating AIs’ output, according to their own political complexion. The Cyberspace Administration of China has proposed that chatbots should “embody core socialist values” and must not subvert state power. In the US, the Biden administration has begun a consultation on “accountability measures”. Sooner or later rules will come, but probably not before AIs are put to use making mischief.

For the large platforms that take up a lot, if not most, of people’s time on the internet, this is a problem that could start to really crunch in a couple of years’ time, when AI-generated content is so prevalent that models are increasingly training on the output of their previous iterations. How do we scale the moderation or gatekeeping function? How do the ‘I’m not a robot’ checks develop to ensure that human-generated content is marked and valued as such?

Ultimately, we will need to be more critical of any and all claims on the internet, which is difficult and time-intensive to do, but isn’t that far from the skill of today’s canny Google-user. While this seems like bad news for advertisers, it emphasises the fact that marketing isn’t the process of convincing someone to click but rather a more complex human process of building trust through familiarity, fame, a reputation for competence, and – vitally – not being the risky option. As the internet becomes an ever-murkier space, understanding what really matters will be ever more important.

About the author

Sam Peña-Taylor
Reporter, WARC

Sam Peña-Taylor is a reporter at WARC.
