Advice from the WSJ’s R&D Chief on defining and implementing an AI strategy

From creating news stories automatically to optimising content delivery, an increasing number of newsrooms are making use of AI to automate and augment their reporting and other newsroom processes, making workflows more efficient, speeding up time-consuming tasks, and increasing the breadth of their coverage. Here are some questions to consider before introducing AI in your newsroom.

by Simone Flueckiger | November 28, 2019

This article is an extract from the second report in WAN-IFRA’s Trends in Newsrooms series, “AI in the Newsroom”. The report is free to download for WAN-IFRA Members, and can be purchased by non-members.

“If you think about the news process, you have essentially three main steps of value creation: news gathering, production, and distribution,” says Francesco Marconi, The Wall Street Journal’s R&D Chief and author of the upcoming book Newsmakers: Artificial Intelligence and the Future of Journalism. “AI is already impacting and transforming all of these different areas.”

As the head of R&D, Marconi leads an interdisciplinary team of machine learning engineers, data scientists, and specialised editors, focused on making AI capabilities accessible to everyone in the newsroom.

More specifically, his team conducts research on technological trends, creates data-driven stories and automation projects, and builds news gathering tools for journalists. One example is a content analytics platform powered by machine learning that scours The Wall Street Journal’s archive for editorial insights.

How to define your AI implementation strategy

While the adoption of AI is still largely confined to bigger news organisations with more resources at their disposal, it is, in fact, an approachable technology that newsrooms of all sizes can use, and one that is cheaper than many think, says Marconi.

“It’s important to really understand that AI is not about technology, but it’s about cultural change and it’s about being responsive in understanding that these tools can augment the capability of the journalist.”

Before introducing AI into their workflows, newsrooms should thus understand the problem they’re trying to address, and in what way AI might be able to help solve it, taking into consideration that there could be easier solutions available.

The second step is to define whether AI will be applied to content or to processes, and whether its purpose is to automate or to augment these internal workflows. As an example of automation, AI could be deployed at scale to lower the cost of production, whereas augmentation can refer to creating new forms of storytelling or content.

Lastly, newsrooms need to decide whether to build technology in-house or to partner with external providers. Naturally, developing tools internally is a costlier approach that requires machine learning experts, data scientists or specialised editors, but it offers greater customisation, as well as more control and stability.

As a way to deploy AI more cost-effectively, Marconi suggests seeking out opportunities to collaborate with journalism schools or computer science departments that are looking for real-world experiments.

Uses of AI in newsrooms

There are plenty of examples of how newsrooms are using AI to optimise workflows. In news gathering, it can cut down on time spent on routine tasks, help journalists gather insights more quickly, and offer a potential advantage over competitors.

With this in mind, Reuters launched a tool called News Tracer in 2017, designed to sift through millions of tweets a day to flag potential breaking news events, often identifying them more quickly than other news organisations, according to the company. In a similar fashion, Reach (previously Trinity Mirror) announced late last year that some of its regional newsrooms would deploy an off-the-shelf AI-powered tool to monitor some 60,000 online sources and alert journalists to “pre-trending” stories.
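The mechanics behind such “pre-trending” alerts can be illustrated with a toy sketch. This is a hypothetical heuristic, not how News Tracer or Reach’s tool actually works: flag a keyword when its share of recent mentions far exceeds its long-run baseline. Real systems also weigh source credibility and many other signals.

```python
from collections import Counter, deque

def spike_detector(window=100, factor=2.0):
    # Flag a keyword once its share of the last `window` mentions
    # exceeds `factor` times its long-run share of all mentions.
    recent = deque(maxlen=window)
    baseline = Counter()
    seen = 0

    def feed(keyword):
        nonlocal seen
        recent.append(keyword)
        baseline[keyword] += 1
        seen += 1
        recent_rate = recent.count(keyword) / len(recent)
        base_rate = baseline[keyword] / seen
        # Only flag once we have enough history to trust the baseline.
        return seen > window and recent_rate > factor * base_rate
    return feed

detect = spike_detector(window=100, factor=2.0)
stream = ["weather"] * 190 + ["earthquake"] * 30  # a sudden burst at the end
flags = [detect(keyword) for keyword in stream]
print(flags[-1])  # → True: the late burst of "earthquake" mentions is flagged
```

The steady “weather” chatter never triggers an alert, because its recent rate matches its baseline; only the sudden burst does.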

AI systems also show potential for investigative journalism by helping reporters analyse huge amounts of data, and enabling them to quickly find relationships between different entities. The International Consortium of Investigative Journalists (ICIJ) uses an AI-powered tool to automatically recognise and index text documents. This smart software was used by ICIJ reporters to make sense of 13.4 million confidential documents relating to offshore investments, an effort that eventually became an impactful journalistic series – “Paradise Papers: Secrets of the Global Elite”.
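At its core, this kind of document mining links entities that appear together. A minimal sketch, assuming a naive capitalised-word extractor rather than the NLP pipeline a real investigation would use (the document snippets and names below are invented):

```python
from collections import Counter
from itertools import combinations
import re

def entities(text):
    # Naive extractor: treat runs of two or more capitalised words as
    # entity names. Real pipelines use trained named-entity recognisers.
    return set(re.findall(r"(?:[A-Z][a-z]+ )+[A-Z][a-z]+", text))

def co_occurrences(documents):
    # Count how often each pair of entities appears in the same document;
    # frequent pairs suggest relationships worth investigating.
    pairs = Counter()
    for doc in documents:
        for a, b in combinations(sorted(entities(doc)), 2):
            pairs[(a, b)] += 1
    return pairs

docs = [  # invented snippets standing in for leaked documents
    "Acme Holdings wired funds to Offshore Trust via Jane Doe.",
    "Jane Doe is listed as a director of Offshore Trust.",
]
print(co_occurrences(docs).most_common(1))  # → [(('Jane Doe', 'Offshore Trust'), 2)]
```

Scaled up to millions of documents, the same idea (entity extraction plus relationship counting) is what lets reporters surface connections they would never find by reading linearly.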

At the production level, the use of algorithms promises efficiency and scale, be it through automatically generating news texts based on structured data, switching between media types or repurposing content for different platforms.

Automated content generation is already relatively prevalent, with a number of larger newsrooms making use of natural language generation to produce “commodity news” as a way to expand their coverage and free up journalists’ time so that they can focus on more in-depth reporting tasks.
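The underlying idea of template-driven natural language generation can be sketched in a few lines. The company and figures below are invented, and production systems are far more sophisticated, with many templates and variations:

```python
def earnings_story(company, eps, eps_expected, revenue_m):
    # Template-driven generation: structured figures in, prose out.
    if eps > eps_expected:
        verdict = "beat"
    elif eps < eps_expected:
        verdict = "missed"
    else:
        verdict = "met"
    return (
        f"{company} reported earnings of ${eps:.2f} per share, which "
        f"{verdict} analyst expectations of ${eps_expected:.2f}. Revenue "
        f"came in at ${revenue_m:,.0f} million."
    )

print(earnings_story("Example Corp", 1.32, 1.25, 4200))
```

Because the input is structured data, the same template can be run for thousands of companies at negligible marginal cost, which is exactly what makes “commodity news” economical to automate.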

The Associated Press (AP), where Marconi previously co-led content automation and artificial intelligence efforts, started doing this with its corporate earnings reports a few years ago, which enabled the news agency to increase the number of companies it reported on from 300 to 4,000. Similarly, MittMedia, one of Sweden’s largest local media groups, launched a sports bot to cover lower-level leagues and a wider range of sports; it generated 41,000 articles from September 2017 to September 2018, and even helped drive paid subscriptions. Other examples of how AI can enhance production include the automated creation of videos from text, and data sonification, an emerging field The Wall Street Journal is currently experimenting with, which algorithmically transforms numerical data into musical chords as a way to make charts and data visualisations accessible to visually impaired people.

“It’s an emerging example but it’s fascinating how AI allows us to create news experiences that we never thought we would,” Marconi says.
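A minimal illustration of the sonification idea: map each data point onto a musical scale so that rising values sound as rising notes. This is a generic sketch, not the Journal’s actual method; producing audio would require a synthesis library, so this only computes the pitch sequence as MIDI note numbers.

```python
C_MAJOR = [60, 62, 64, 65, 67, 69, 71, 72]  # MIDI note numbers, C4 to C5

def sonify(values):
    # Scale each data point into the eight-note range, so that higher
    # values become higher-pitched notes.
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid dividing by zero for a flat series
    return [C_MAJOR[round((v - lo) / span * (len(C_MAJOR) - 1))] for v in values]

stock_prices = [101.0, 103.5, 102.0, 108.0]  # invented example series
print(sonify(stock_prices))  # higher prices map to higher notes
```

Fed to a synthesiser, such a sequence lets a listener follow the shape of a chart by ear rather than by eye.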

As for distribution, the third step in the news value chain, AI can help determine the best time to publish a story or personalise the delivery of content to the audience. Media organisations including Reuters, the Chicago Tribune, Hearst, and CBS Interactive deploy the AI-powered content distribution platform TrueAnthem to determine which stories should be recirculated and when they should be posted across social media platforms. To make these decisions, the system tracks signals that predict performance, including the level of audience engagement, publishing frequency, and time of day.
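A hypothetical heuristic in the spirit of such distribution tools might combine those signals into a single repost score. The weights and functional form here are invented for illustration; real platforms learn them from historical performance data.

```python
def repost_score(engagement_rate, hours_since_post, posts_today, hour_now):
    # Combine engagement, recency, frequency and time-of-day signals
    # into one number: higher means a better moment to recirculate.
    peak_hours = range(7, 22)            # assume a daytime audience
    time_bonus = 1.0 if hour_now in peak_hours else 0.3
    fatigue = 1 / (1 + posts_today)      # avoid flooding the feed
    freshness = 1 / (1 + hours_since_post / 24)
    return engagement_rate * time_bonus * fatigue * freshness

# A well-performing story, one earlier post today, mid-morning:
print(round(repost_score(0.08, 12, 1, 10), 4))  # → 0.0267
```

Ranking candidate stories by such a score, and posting the top one when it crosses a threshold, is the basic shape of automated recirculation.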

AI is not without risks

AI may involve sophisticated algorithms, but the conclusions drawn by machines are not always correct. Journalists have to constantly question outcomes, validate methodologies and ensure explainability. This is no easy task: algorithms are difficult to audit and, as such, to hold accountable.

“The insights generated through AI should be used as a compass that guides reporting, not as a clock that provides infallible information,” says Marconi.

“AI is created by humans, and it can make mistakes, errors which often result from the biases in how AI is designed as well as the data used to train it. The output is only as good as its input.”

Increasingly, it is being acknowledged that AI software is prone to the same errors and biases that humans exhibit, and can even exacerbate these inequities, since such systems are often deployed at massive scale with little oversight. In an investigation by ProPublica into machine-generated criminal risk scores, reporters found the software to be biased against black defendants. Reporting of this nature will be increasingly necessary in an algorithmically driven society to hold software accountable.

Deepfakes are another emerging risk of AI-driven content generation: videos, images or audio files generated or altered with the help of artificial intelligence to dupe an audience into thinking they are real. The prospect of this next generation of misinformation is troublesome. For example, fake videos could make politicians appear to say controversial things or falsely implicate people in crimes.

The deepfake also poses a threat to journalistic trust and integrity. Now, in addition to traditional fact-checking processes, journalists must also be vigilant about the possibility that video or image evidence could have been falsified. Some newsrooms are taking proactive measures. For example, The Wall Street Journal launched a media forensics committee to train journalists; the New York Times is exploring how to leverage blockchain to map the provenance of information; and Reuters created its own deepfake video in collaboration with a specialist production company to test whether its user-generated content team would notice that it had been altered.

But Claire Wardle, Executive Director of First Draft, an organisation that supports journalists in the fight against misinformation and disinformation, cautions that the alarm over deepfakes may be overblown. “‘Deepfakes’ are no more scary than their predecessors, ‘shallowfakes’, which use far more accessible editing tools to slow down, speed up, omit or otherwise manipulate context,”  Wardle told the New York Times in a video interview that demonstrated the transformative power of deepfakes.

“The real danger of fakes – deep or shallow – is that their very existence creates a world in which almost everything can be dismissed as false.”
