Predictive models in a VUCA world
The likelihood of a catastrophe and the outcome of an election are typical examples of why managing volatility, uncertainty, complexity and ambiguity (VUCA) needs data-based decision-making today more than ever, and of why predictive models are so significant.
Hurricane Sandy made landfall in New Jersey, to the south of New York, on 29 October 2012. It left 50 dead and around 19 billion dollars of insured damage in its wake. The challenge for predictive models was, and still is, gigantic. That is not surprising, given that they have to work with immense databases within geographic information systems. These databases include atmospheric conditions, sea temperatures, land elevation, and vegetation and building cover.
Besides managing all this information, the whole process has to take place almost in real time. The tracking of Hurricane Sandy reflects every stage of a data science project as such projects are run today: defining the questions to answer (Will it impact populated areas? When will it do so? How heavily? What are the potential risks?); exploratory data analysis (wind speed, atmospheric pressure, tracked trajectory, atmospheric and sea temperatures); defining the predictive models (considering not only past data but also thousands of trajectory simulations, possible changes in the variables, both individually and in combination, and the natural and architectural obstacles along the hurricane’s path that could affect its intensity and trajectory); interpreting the results (on probability scales, from highly unlikely to almost certain scenarios); and communicating the results, something essential to any data science project and even more so when people’s lives are at stake.
Communication is so important that NOAA and the NWS (respectively, the agency responsible for tracking hurricanes in the Atlantic and the US national weather service) modified their internal and external warning and contact protocols as the hurricane advanced. PowerPoint and PDF presentations were abandoned in favor of Internet-based reports in HTML format. The NWS also focused its communication on the impacts: the foreseeable locations and dates, and the risks involved (winds, rising water levels, precipitation). NOAA, for its part, modified all its Internet-based communication protocols to create specific websites, which were visited almost 1.3 billion times as the hurricane developed, and used Facebook and Twitter on a huge scale to report on events.
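As a rough illustration of the "thousands of trajectory simulations" and the probability scale mentioned above, the sketch below runs a toy Monte Carlo simulation in Python. Everything in it is an assumption made for illustration: the random-walk track model, the rectangular "populated region", the function names and the thresholds behind each verbal label. It is not the method used by NOAA or the NWS.

```python
import random


def likelihood_label(p: float) -> str:
    """Map a probability to a verbal scale (thresholds are illustrative assumptions)."""
    if p < 0.05:
        return "highly unlikely"
    if p < 0.33:
        return "unlikely"
    if p < 0.66:
        return "about as likely as not"
    if p < 0.95:
        return "likely"
    return "almost certain"


def simulate_track(rng, start, step, steps, jitter):
    """Toy track: constant drift per step plus Gaussian noise on each coordinate."""
    x, y = start
    dx, dy = step
    track = [(x, y)]
    for _ in range(steps):
        x += dx + rng.gauss(0.0, jitter)
        y += dy + rng.gauss(0.0, jitter)
        track.append((x, y))
    return track


def hits_region(track, region):
    """True if any point of the track falls inside the rectangle (xmin, xmax, ymin, ymax)."""
    xmin, xmax, ymin, ymax = region
    return any(xmin <= x <= xmax and ymin <= y <= ymax for x, y in track)


def landfall_probability(n_runs, region, start=(0.0, 0.0), step=(1.0, 1.0),
                         steps=10, jitter=0.5, seed=0):
    """Fraction of simulated tracks that cross the region."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same estimate
    hits = sum(
        hits_region(simulate_track(rng, start, step, steps, jitter), region)
        for _ in range(n_runs)
    )
    return hits / n_runs


if __name__ == "__main__":
    populated_area = (4.0, 6.0, 4.0, 6.0)  # hypothetical populated area
    p = landfall_probability(5000, populated_area)
    print(f"Estimated probability of impact: {p:.2f} ({likelihood_label(p)})")
```

The point of the sketch is the last step: the model produces a number, but what reaches the public is a label on an agreed scale, which is where interpretation and communication come in.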
Just one week after Sandy, another hurricane hit the headlines. On 6 November, Barack Obama was re-elected as President of the United States. A born communicator, he had the professionalism and humility to entrust his re-election campaign to a team that combined traditional communication techniques with new ones emerging from data science and behavioral psychology.
For his 2012 re-election campaign, the Obama team invested more than one billion dollars in data science. His four years in the White House had seen an unprecedented technological and cultural transformation in North American society. Whereas Facebook had 33 million users in the US and some 135 million worldwide in 2008, by 2012 it had 166 million users in the US and more than 900 million worldwide.
While Obama was being sworn in for his first term, the Democratic National Committee (DNC) hired Dan Wagner as Director of Analytics for one basic purpose: to develop metrics that would provide in-depth knowledge of voter behavior. Wagner predicted the defeat in the November 2010 mid-terms five months in advance. Once his predictions were confirmed, his predictive models became the “gold standard” for the party. For TV alone, the budget amounted to 300 million dollars. The team developed software, the Optimizer, that divided the day into 96 segments of 15 minutes each in order to air the best message not only for the demographic associated with the program being broadcast but also for the specific audience believed to be watching at that time, obtaining a far higher return on investment than traditional methods.
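To make the idea of 96 segments of 15 minutes concrete, here is a minimal Python sketch. It is not the Optimizer itself, whose internals are not described here: the candidate programs, audience figures, costs and the greedy "target viewers per dollar" rule are all hypothetical and chosen only to show how buying by time slot can favor cheap, unexpected programs over prime time.

```python
from datetime import time

SLOT_MINUTES = 15
SLOTS_PER_DAY = 24 * 60 // SLOT_MINUTES  # 96 quarter-hour segments


def slot_index(t: time) -> int:
    """Return the 0-based index of the 15-minute slot that contains t."""
    return (t.hour * 60 + t.minute) // SLOT_MINUTES


def slot_start(index: int) -> time:
    """Return the start time of a slot given its index."""
    minutes = index * SLOT_MINUTES
    return time(minutes // 60, minutes % 60)


def pick_slots(candidates, budget):
    """Greedy heuristic (an assumption): buy the slots with the most
    target viewers per dollar until the budget runs out."""
    ranked = sorted(
        candidates,
        key=lambda c: c["target_viewers"] / c["cost"],
        reverse=True,
    )
    chosen, spent = [], 0.0
    for c in ranked:
        if spent + c["cost"] <= budget:
            chosen.append(c)
            spent += c["cost"]
    return chosen


# Hypothetical candidates: program, estimated target viewers and ad cost.
CANDIDATES = [
    {"slot": slot_index(time(20, 0)), "program": "prime-time drama",
     "target_viewers": 50_000, "cost": 100_000},
    {"slot": slot_index(time(1, 30)), "program": "late-night rerun",
     "target_viewers": 12_000, "cost": 8_000},
    {"slot": slot_index(time(9, 30)), "program": "daytime talk show",
     "target_viewers": 9_000, "cost": 9_000},
]

if __name__ == "__main__":
    for c in pick_slots(CANDIDATES, budget=20_000):
        print(slot_start(c["slot"]).strftime("%H:%M"), c["program"])
```

With these made-up figures, the prime-time slot reaches the most target viewers in absolute terms but costs far more per viewer, so the heuristic spends the budget on the two off-peak slots instead.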
Something was happening and, worst of all for Alexander Lundry, the director of Romney’s data science team, he could not explain it. The difference in the two parties’ ability to reach the advertising market was clear: while the Republican Party was spending its budget on national channels, the Democrats were spending their money on unexpected programs in unusual time slots, breaking the established mold and thinking outside the box. Eric Schmidt, executive chairman of Google at the time, said it was “the best election campaign ever conducted in history”.
These two events, taken from the book Alquimia (Deusto, 2019), illustrate two fundamental factors for success in a VUCA scenario like the one we are in at the moment. Firstly, a predictive model is only one part of a data science project and, as such, will always depend on the questions being defined correctly and on the quality of the data. Secondly, it is essential to combine those models with a human team experienced in interpreting and communicating the knowledge gained: the data translators. Together with the ethical guidelines that must accompany the whole process, these factors are what allow the investment to achieve the desired result.