Text generators must not become killer robots

Disclosure: This commentary was written by me. It is not the product of a generative artificial intelligence programme. Any intelligence you may find here is from my own, admittedly limited, resources.

There is, however, the worrying prospect that it could have been produced by ChatGPT, a programme with alarmingly human-like text generating capabilities. In fact, some commentators have used it to produce parts of their columns to show how good it is at creating content virtually indistinguishable from their own words of wisdom.

Generative AI is good, but it isn’t that good. Last month the U.S. tech website CNET admitted that it had used it to create at least 75 stories, many of which were attributed to “CNET Money Staff”. Retrospective fact-checking found the stories riddled with errors that human reporters were unlikely to make.

That revelation has not halted media use of AI in its tracks. Sports Illustrated last week told the Wall Street Journal it was publishing AI-generated stories on men’s fitness tips, drawing on 17 years of archived stories in its own library. The caveat is that all of the stories are reviewed and fact-checked by flesh-and-blood journalists.

This sort of AI may not be perfect, although it is good enough to create alarm among university staff over student essay assignments. However, it is about to get better.

OpenAI, the company that produced ChatGPT, is expected to launch a new generation of the large language model that powers the programme. Expectations for GPT-4 have been wildly exaggerated – it will not rival human intelligence – but it will have capabilities substantially greater than those of the current GPT-3. OpenAI has also released a programme called Whisper that will almost faultlessly transcribe interviews from recordings.

Therefore, it is hardly surprising that predictions on what is in store for media in 2023 invariably include the impact of generative AI on the industry.

The Reuters Institute’s Nic Newman says this will be the breakthrough year for artificial intelligence and its application to journalism. ChatGPT and a handful of AI programmes that address various aspects of the craft such as graphics have become the game-changers. He expects “an explosion of automated or semi-automated media in the next few years”. Already it is being widely used to individualise recommended news stories on websites and apps (NZME is among them). A beta start-up is using AI to automatically identify and summarise the top news stories of the day, but the output still requires a human review.

Politico journalist Peter Sterne, in his predictions for the Nieman Journalism Lab, foresees tools that “could free reporters to spend more time interviewing sources and digging up data, and less time transcribing interviews and writing daily stories on deadline”.

So far, not too much to be alarmed about.

However, some are characterising generative artificial intelligence as an existential threat to journalism. They raise the spectre of ‘robots’ writing the news and putting their human counterparts out of work. Imagine TVNZ and TV3 staff being replaced by AI versions of John Campbell and Paddy Gower, bots writing in the style of Stuff political editor Luke Malpass, an AI-generated Corin Dann asking the questions on Morning Report, and a robotic Mike Hosking putting a Labour prime minister under the pump.

It’s all a little too far-fetched, and cooler heads do not see generative AI replacing journalists but, rather, changing how they do their job.

American media entrepreneur Steven Brill was interviewed by Joe Pompeo for a Vanity Fair article last month. Brill was not worried about AI putting skilled journalists out of work and, in explanation, drew on an assignment – a magazine-length feature – in the journalism course he teaches at Yale University.

   “[They must list] at least 15 people they interviewed and four people who told them to go fuck themselves. There is no way they could do that assignment with ChatGPT or anything like it, because what journalists do is interview people, read documents, get documents leaked to them…[Another] assignment I give them in the second or third week is a short essay on how Watergate would have played out differently in the internet age, because Bob Woodward comes in as a guest for that session. I asked ChatGPT to answer that question, and the answer I got was this banal but perfectly coherent exposition. The difference is, you didn’t have to interview or talk to anyone.”

And therein lies one of the major limitations of generative AI. It works by absorbing monumental amounts of information that already exists. About a quarter of the publications printed since Gutenberg invented the movable-type printing press have been digitised and, by one estimate, generative AI programmes like ChatGPT will have absorbed most of those 30 million texts within the next five years. They have trawled the Internet (ChatGPT was tested using Wikipedia) and are also being taught to eat up social media.

Generative AI repurposes existing knowledge. Until it is able to fulfil the investigative and interrogative functions of a journalist, it is dependent on the intellectual endeavours of humans. It does not work proactively and is open to committing forms of plagiarism. Somewhat perversely, it seems to be aware of its own failings. When Peter Sterne asked ChatGPT to list dangers associated with the technology, it wrote that it raised ethical questions around authorship and plagiarism.

Nevertheless, it is highly likely the technology will be used extensively by platforms like Google to trawl news sites and compile seemingly original reports that will be curated for individual users. That is a threat to the news industry, which is already being starved of revenue by these cuckoos in its nest. However, the repurposing will still require original content on which their generative AI systems can feed.

The greater danger in generative AI lies in its ability to produce disinformation on a truly industrial scale. Princeton computer science professor Arvind Narayanan describes it as “a bullshit generator”. In a Nieman Lab interview he said the technology was trained to produce plausible, persuasive statements, but it is not trained to produce true statements. When it does produce them, that is a side effect.

Steven Brill is now CEO of NewsGuard, a company that scores the trustworthiness of websites. It recently produced a report on the technology, ominously titled The Next Great Misinformation Superspreader: How ChatGPT Could Spread Toxic Misinformation At Unprecedented Scale. NewsGuard analysts asked ChatGPT to respond to a series of leading prompts relating to a sampling of 100 false narratives.

“The results confirm fears, including concerns expressed by OpenAI itself, about how the tool can be weaponized in the wrong hands. ChatGPT generated false narratives—including detailed news articles, essays, and TV scripts—for 80 of the 100 previously identified false narratives. For anyone unfamiliar with the issues or topics covered by this content, the results could easily come across as legitimate, and even authoritative.”

OpenAI is working on ways of identifying misinformation and says “upcoming versions of the software will be more knowledgeable.”

ChatGPT is not, however, the only kid on the block, and Russian sources of disinformation are certain to use generative AI to improve the quality and quantity of their messages. In the United States there are fears that the Kremlin, no longer limited by the need to hire staff to produce disinformation, will bombard American voters in the lead-up to the 2024 presidential election.

Back in September 2020 Renée DiResta, writing in The Atlantic about the beta version of GPT-3, observed that in future the public would have to contend with the reality that it will be increasingly difficult to differentiate between machine-generated content and human-generated content.

“In the meantime, we’ll need to keep our guard up as we take in information, and learn to evaluate the trustworthiness of the sources we’re using. We will continue to have to figure out how to believe, and what to believe. In a future where machines are increasingly creating our content, we’ll have to figure out how to trust.”

She is right: we will need to rethink the processes by which we decide whether we can place trust in what we see and hear, and in the sources of news.

Nic Newman, in his 2023 predictions, mentions a South Korean company called DeepBrain AI that creates digital copies of television news anchors, and which is now being used to extend the most popular presenters’ airtime. He says this can be combined with ChatGPT functionality to create a virtual chat bot – say, a political correspondent’s ‘twin’ answering questions about an election.

How can the public trust media when they cannot be sure that what they read, see, and hear is real? It is vital that media use of the technology is completely transparent.

Generative AI will not replace journalists but, like the plot of a science fiction horror movie, news media organisations might allow their robots to destroy them. Isaac Asimov devised the Three Laws of Robotics in his short-story collection I, Robot. I might suggest a fourth: A robot must apply generative artificial intelligence only in pursuit of truth, and its use must at all times be disclosed.
