16 April 2025
virtual

Synthetic Audiences and Personas for product development and testing

DATA SCIENCE MONTHLY MEETUP

In a fragmented media landscape, newsrooms must keep their audience at the centre of editorial and revenue decisions. Synthetic audiences offer newsrooms a powerful tool to maintain this audience-centred approach without the delays and costs of traditional research methods. A synthetic audience is a digital twin of your actual audience: an AI-generated representation that allows news organisations to test editorial products, marketing strategies, and advertising in a risk-free yet realistic environment. Because synthetic audiences can be created and deployed quickly, they enable rapid testing and iteration without directly involving real consumers. This meetup explored various applications for news organisations:

  • Content optimisation: Fine-tune article pitches, headlines, and formats for maximum reach and impact
  • Audience segmentation: Target narrow segments with hyper-personalised content and outreach
  • Predictive analytics: Anticipate market shifts and reader preferences by analysing synthetic audience behaviours
  • Advertiser services: Offer synthetic research capabilities to advertisers for precision-targeted campaigns
  • Data monetisation: License rich audience datasets to AI platforms and others, creating fresh revenue streams

Patrick Swanson and Kaveh Waddell, journalists and co-founders of Verso, introduced the concept of synthetic audiences: AI-powered conversational agents that simulate specific audience segments. Patrick, with a background at the Austrian public broadcaster ORF in Vienna, and Kaveh, with experience in data and investigations, aim to reimagine AI’s role in newsrooms beyond efficiency, focusing on simulating people through AI.

The core idea is that news organisations struggle to centre the audience because traditional audience research is expensive and resource-intensive. AI offers a more dynamic alternative: digital twins that simulate audience responses and can be reused for many different questions without repeatedly going back to real people. However, they acknowledged that ethical and bias concerns must be addressed.

Kaveh explained that agent-based modelling has been around for a long time, but language models have dramatically opened up the decision space. Researchers from Stanford, the University of Washington, Northwestern University, and Google DeepMind created over a thousand individual agents, each based on a two-hour interview with a real person. These digital twins matched the actual participants’ answers to questions about 85% of the time. This approach is not a replacement for engaging with actual people but a way to continue engaging when resources are limited, or to bring people’s voices into smaller decisions.
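
The mechanics behind such a digital twin were not shown in code, but the pattern described is straightforward to sketch. The example below is a hypothetical illustration, not the researchers’ or Verso’s implementation: it condenses an interview transcript into a persona prompt, asks a chat model to answer survey questions as that person, and measures how often the twin’s answers match the real participant’s. It assumes the official openai Python package; the model name, prompts, and helper names are illustrative.

```python
# Hypothetical sketch of a "digital twin" persona built from an interview.
# Assumes the official openai package (>= 1.0) and an OPENAI_API_KEY env var.
from openai import OpenAI

client = OpenAI()


def make_persona_prompt(interview_transcript: str) -> str:
    """Condense an interview into a system prompt that answers *as* the interviewee."""
    return (
        "You are a digital twin of the person interviewed below. Answer every "
        "question the way they would, based only on what the interview reveals "
        "about their views, habits, and background.\n\n"
        f"INTERVIEW TRANSCRIPT:\n{interview_transcript}"
    )


def ask_twin(persona_prompt: str, question: str, options: list[str]) -> str:
    """Ask the twin a multiple-choice question and return its chosen option."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # any capable chat model would do
        messages=[
            {"role": "system", "content": persona_prompt},
            {
                "role": "user",
                "content": (
                    f"{question}\nChoose exactly one of: {', '.join(options)}. "
                    "Reply with the option text only."
                ),
            },
        ],
        temperature=0,
    )
    return response.choices[0].message.content.strip()


def twin_accuracy(persona_prompt: str, survey: list[dict]) -> float:
    """Share of survey items where the twin matches the real participant's answer.

    Each survey item looks like:
    {"question": "...", "options": ["...", "..."], "real_answer": "..."}
    """
    hits = sum(
        ask_twin(persona_prompt, item["question"], item["options"]) == item["real_answer"]
        for item in survey
    )
    return hits / len(survey)
```

Comparing twin_accuracy across a set of held-out survey questions is one simple way to put a figure like the 85% result in context, though the published study used a more careful evaluation protocol.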

They demonstrated VibeCheck, a tool that sends an article link to 15 personas modelled on a general news audience and aggregates their pass/fail votes on criteria like attention, relevance, clarity, engagement, and trust (a pattern sketched below). This provides feedback to improve the article. They discussed applications such as testing SEO and headlines, story framing, newsletter personalisation, and social messaging. Strategically, it can test product ideas, formats, and audience expansion, such as engaging young men. They also considered editorial focus, data sharing, and monetisation.
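
VibeCheck’s internals were not shown, but the aggregation pattern described is easy to illustrate. The sketch below is a hypothetical stand-in, not Verso’s code: each persona returns pass/fail votes on the five criteria, and the tool reports the share of personas that passed the article on each one. The persona_review callable is where an LLM conditioned on a persona prompt would go; here a dummy reviewer stands in for the 15 simulated readers.

```python
# Hypothetical sketch of VibeCheck-style vote aggregation (not Verso's code).
from collections import defaultdict
from typing import Callable

CRITERIA = ["attention", "relevance", "clarity", "engagement", "trust"]


def aggregate_votes(
    article_text: str,
    personas: list[str],
    persona_review: Callable[[str, str], dict[str, bool]],
) -> dict[str, float]:
    """Return the share of personas that passed the article on each criterion."""
    passes: dict[str, int] = defaultdict(int)
    for persona in personas:
        votes = persona_review(persona, article_text)  # e.g. {"attention": True, ...}
        for criterion in CRITERIA:
            if votes.get(criterion, False):
                passes[criterion] += 1
    return {criterion: passes[criterion] / len(personas) for criterion in CRITERIA}


if __name__ == "__main__":
    # Dummy reviewer: passes every criterion if the article is reasonably long.
    dummy_review = lambda persona, text: {c: len(text) > 500 for c in CRITERIA}
    personas = [f"persona_{i}" for i in range(15)]
    print(aggregate_votes("word " * 200, personas, dummy_review))
```

Replacing the dummy reviewer with per-persona LLM calls, and lowering the pass threshold per criterion, would reproduce the kind of aggregate feedback described in the demo.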

They emphasised that this approach should complement existing audience work, not replace it. Regular resurveys and re-interviews are required to update assumptions. Transparency and informed consent are crucial when creating digital twins. A diverse panel is also essential. They addressed questions about measuring accuracy, comparing different LLMs, and the potential for topic-specific personas. They also discussed the balance between synthetic and first-person data, noting that the choice depends on the use case.

Speakers