For brands attempting to capitalise on a short-term fad or redirect investment toward a long-term trend, a startup has developed a technique for finding the signals in vast quantities of unstructured social data, and it claims 90% accuracy at a six-month projection. Here’s what we learned.

Black Swan is a data-mining startup that works with big brands to predict product-based trends by applying natural language processing techniques to massive amounts of unstructured data from social media, publishers and blogs, as well as historical records, which help it to model performance.

It effectively gives brands a category-level view of the themes, ingredients, and ideas that are growing in popularity and the relative rate at which they’re growing. (For WARC’s in-depth report, read Black Swan: How one startup finds signals of the future in the digital noise)

While it sounds like advanced social listening, the techniques that it has developed since its founding in 2011 are built on early bespoke efforts for brands like Disney, for whom Black Swan was able to help predict the popularity of the animated film Frozen and help the company plan its marketing and further product-line investments.

Since then, the company has doubled down on its core product of understanding and modelling unstructured social data to help companies with their positioning and innovation strategies.

“When we originally started, we were using the same datasets – things like Twitter, but we just used it in a very rudimentary way,” King told WARC. “In the old days we used to just look at the amount of times something was mentioned” – for instance, the word ‘orange’ – “and we would count it and then say, look, people are saying it quite a lot!”

Black Swan’s method is bottom-up, taking in data from across a topic or category and then placing it in a visual context. The data, once purged of irrelevance and noise, are submitted to clustering analysis and natural language processing (NLP) tools, which impose a structure on the information based on the words that real people are using to talk about the trend and map the total conversation. Relevant terms, and the trends they indicate, are then assigned a value that helps users to understand each trend’s maturity and future growth potential.
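To make the idea of assigning each term a value more concrete, here is a minimal sketch in Python. It is not Black Swan’s actual method, which the company has not published in this article: it swaps the clustering and NLP stages for a simple stopword filter and per-period counting, and it treats "maturity" as a term’s share of posts in the later period and "growth" as the change in that share. The function names, the stopword list and the sample posts are all invented for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical stop list standing in for the "irrelevance and noise" filter.
STOPWORDS = {"the", "a", "and", "is", "i", "my", "in", "this", "of", "to", "with"}


def tokenize(text):
    """Lowercase a post and keep alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def term_shares(posts):
    """Map each period to {term: share of that period's posts mentioning it}."""
    by_period = defaultdict(list)
    for period, text in posts:
        # A set per post, so a term counts once per post however often it repeats.
        by_period[period].append(set(tokenize(text)))
    shares = {}
    for period, docs in by_period.items():
        counts = Counter(term for doc in docs for term in doc)
        shares[period] = {term: n / len(docs) for term, n in counts.items()}
    return shares


def score_terms(posts, earlier, later):
    """Return {term: (maturity, growth)} comparing two periods.

    maturity = share of posts mentioning the term in the later period
    growth   = change in that share since the earlier period
    (Both definitions are assumptions made for this sketch.)
    """
    shares = term_shares(posts)
    old, new = shares.get(earlier, {}), shares.get(later, {})
    return {
        term: (new.get(term, 0.0), new.get(term, 0.0) - old.get(term, 0.0))
        for term in set(old) | set(new)
    }


# Invented sample data: (period label, post text).
posts = [
    ("month_1", "Trying oat milk in my coffee"),
    ("month_1", "Orange juice with breakfast"),
    ("month_1", "Orange is the best juice"),
    ("month_1", "Love orange soda"),
    ("month_2", "Oat milk latte again"),
    ("month_2", "Switched to oat milk"),
    ("month_2", "Oat flat white this morning"),
    ("month_2", "Orange juice as usual"),
]

scores = score_terms(posts, "month_1", "month_2")
```

In this toy data, "orange" appears in most posts in the first month but in fewer in the second, while "oat" moves the other way. A raw mention count over both months would treat the two terms alike; comparing shares across periods is one simple way to separate a term that is fading from one that is growing.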

For brands, the use of this is clear:

  • first, it can identify market opportunities associated with the brand and help to fine-tune messaging toward emerging trends;
  • second, it can shine a light on the innovation funnel and help to steer its direction.

“You can predict a fad up and you can also predict the fad down,” explained King. Knowing that something is going to be big is useful, not least for deciding when to build up a supply chain – “a million-dollar plus decision”. But if that investment leads to a medium-term failure when the fad fizzles out, knowing when to turn off investment is just as crucial.

Sourced from WARC