Big Data: Cut through the noise

Nick Barthram

Humans are programmed to see patterns even where they don't exist, so the combination of Big Data and human inference can lead to spurious correlation rather than genuine insight. Nick Barthram of Indicia offers seven steps to help find that insight.

With so many combinations inherent in Big Data, it's inevitable that many patterns showing strong correlations will emerge. In reality, however, the vast majority of these patterns will be just noise: what statisticians call 'spurious correlations'. They are worse than redundant when looking for insight, as they actively lead you to false conclusions. Statistically speaking, your chances of finding randomness masquerading as correlation increase faster than the number of valid correlations, creating a significant gap between what we actually know and what we think we know.
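To make the point concrete, here is a minimal Python sketch of how noise alone produces "strong" correlations once enough variables are compared. It is an illustration under stated assumptions, not a method taken from the article: the variables are independent random numbers, so every correlation found is spurious by construction, and the sample size (20 observations), the cut-off for "strong" (an absolute correlation above 0.5) and the function names are all arbitrary choices made for the example.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def count_spurious(n_vars, n_obs=20, threshold=0.5, seed=0):
    """Count pairs of independent noise variables with |r| above threshold.

    n_obs and threshold are illustrative assumptions, not recommended values.
    """
    rng = random.Random(seed)
    # Every column is pure noise: no variable truly depends on any other.
    columns = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]
    pairs = itertools.combinations(columns, 2)
    return sum(1 for x, y in pairs if abs(pearson(x, y)) > threshold)


if __name__ == "__main__":
    for n_vars in (10, 50, 200):
        n_pairs = n_vars * (n_vars - 1) // 2
        print(f"{n_vars} variables, {n_pairs} pairs, "
              f"{count_spurious(n_vars)} 'strong' correlations from noise")
```

The number of pairs to compare grows roughly with the square of the number of variables, so the count of chance correlations climbs quickly as more data sources are added, even though nothing real connects them. The sketch does not model genuine relationships, so it shows only the noise side of the gap described above.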