Amazon has built a thriving cloud enterprise business around its handling of data. By turning an internal system – the distributed servers that power its e-commerce operation – into a public-facing storage service, and by applying what it understands about how data travels (and is stored) on the Internet, Amazon has made its cloud one of the most cost-effective and reliable hosting solutions around.
Big data can be very boring – the value that it adds is often locked up in CRM systems, enterprise architectures and IT structures that are utterly opaque to those who don’t understand them.
Because of this, a culture of snake-oil salesmen has developed, every pitch promising to unlock some as-yet-untapped value from the masses of data languishing in archives and databases.
But for news media companies there should be a different concern. Sure, exploiting the data you have about your audience is fine, but that ignores the real value that big data brings to the newsroom: the power to reinvigorate investigative journalism, change the way that stories are spotted and evidence is gathered, and use your interpretation of datasets to reach new, untapped audiences.
The key is to find journalists who are also data scientists or software engineers, that rare breed to whom finding a needle in a haystack is a favourite hobby. These tenacious newshounds can take something as mundane as a crime statistics database and coax from it the most startling results: exposing government spin-doctoring, highlighting real areas for concern, or simply spotting a neighbourhood or community that has seen a sudden spike in criminal activity.
The bigger picture
It is also important to think of data journalism as more than data visualisation. Sure, there are those out there like Hans Rosling or David McCandless who take issues hidden in datasets and visually interpret them so that some vast and abstract concept becomes simple to understand, but experts like them are few and far between.
Data visualisation is a key part of the growing world of data journalism, but by no means the only part. In looking for someone who can make pretty pictures you first need to ensure that the person can draw the correct conclusions from the data at hand – and knows how to wrangle the information that you have collected so that it is clean, well formatted, and uniform.
That is by no means a quick task. I sometimes think that Hollywood has glorified the software developer/hacker as some superhuman, able in seconds to take some massive, unwieldy block of source material and magically distill it down to the salient points or unlock some explosive revelation from it.
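To give a flavour of what that wrangling involves, here is a minimal sketch in Python using only the standard library. The sample data, column names, and date formats are all hypothetical – invented for illustration – but the chores are the ones any data journalist will recognise: inconsistent headers, stray whitespace, and mixed date formats in a raw export.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw export from a crime-statistics system:
# padded headers, inconsistent capitalisation, two date formats.
raw = """Borough , Offence,Date Recorded
 Hackney ,Burglary,2013-04-02
Hackney,burglary ,02/04/2013
"""

def parse_date(value):
    """Try the date formats seen in the dump; return ISO format."""
    for fmt in ("%Y-%m-%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except ValueError:
            pass
    return value  # leave unparseable values for manual review

def clean_rows(text):
    reader = csv.DictReader(io.StringIO(text))
    # Normalise headers: lower-case, underscores, no padding.
    fields = [f.strip().lower().replace(" ", "_") for f in reader.fieldnames]
    for row in reader:
        record = dict(zip(fields, (v.strip() for v in row.values())))
        record["offence"] = record["offence"].lower()
        record["date_recorded"] = parse_date(record["date_recorded"])
        yield record

rows = list(clean_rows(raw))
# Both rows now describe the same burglary in the same vocabulary.
```

Trivial as each step looks, a real dataset multiplies these decisions by dozens of columns and millions of rows – which is exactly why the work takes patience rather than magic.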
Failure is an option
The truth is that great data journalism is all about patience, about trying and failing repeatedly until you finally coax the information into the shape you need. Once in good shape, however, a clean dataset can be used in a million ways: as context for related stories long into the future, or as a stand-alone media product – a data tool serving the needs of your audiences that can be sold independently of your publication to unlock new revenue.
In the age of Wikileaks, Snowden and Manning, when source material is measured in gigabytes, not pages, it is vitally important for any news organisation to seek out and employ those who can work with data well.
These skills are alien to the traditional media organisation, but once acquired they offer unique new opportunities for investigative journalism and for the exploration of big issues, allowing media organisations to step up from being merely a conveyor of news to becoming, effectively, a public-facing intelligence agency.