BOSTON: Many businesses are failing to extract actionable insights from the huge volumes of data they track due to the systemic failure of their approach to machine learning, a leading academic has argued.

"If companies want to get value from their data, they need to focus on accelerating human understanding of data, scaling the number of modelling questions they can ask of that data in a short amount of time, and assessing their implications," according to Kalyan Veeramachaneni, Principal Research Scientist in the Laboratory for Information and Decision Systems at MIT.

He made this assertion in the Harvard Business Review, where he observed that machine-learning experts and their business counterparts were frequently speaking different languages and had different expectations.

So, for example, when the former complain that "the data is a mess", this often turns out to refer less to its quality than to its granularity.

"Machine learning experts are used to working with data that's already been aggregated into useful variables," he noted, "such as the number of website visits by a user, rather than a table of every action the user has ever taken on the site."

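The gap Veeramachaneni describes can be illustrated with a minimal sketch. It assumes a hypothetical action-level log (the `events` records, their field names and the `visits_per_user` helper are invented for illustration, not taken from his project) and collapses it into the kind of per-user variable that modellers typically expect:

```python
from collections import Counter

# Hypothetical raw log: one record per user action (illustrative values only).
events = [
    {"user": "alice", "action": "visit"},
    {"user": "alice", "action": "click"},
    {"user": "bob", "action": "visit"},
    {"user": "alice", "action": "visit"},
    {"user": "bob", "action": "purchase"},
]


def visits_per_user(rows):
    """Collapse an action-level log into a count of visits per user."""
    return Counter(row["user"] for row in rows if row["action"] == "visit")


# One aggregated variable per user, ready to feed into a model.
visit_counts = visits_per_user(events)
print(dict(visit_counts))
```

In practice the same step would probably be a group-by in a database or dataframe library. The point is only that the raw table has to be summarised before modelling can begin.
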
And the common business complaint that "we have a lot of data and we are not doing anything with it" masks the fact that the organisational approach to data needs to be rethought.

"Machine learning experts often focus on the later parts of the pipeline – trying different models, or tuning the hyperparameters of the model once a problem is formulated – rather than formulating newer predictive questions for different business problems," said Veeramachaneni.

At the same time, he added, people actually working on the models rarely ask "what value does this predictive model provide, and how can we measure it?"

Veeramachaneni identified four principles which he is applying in The Human-Data Interaction Project at MIT and which he believes will help deliver a better impact from machine learning: stick with simple models; explore more problems; learn from a sample of data, not all the data; and focus on automation.

"Our target is rapid exploration of predictive models, and to actually put them to use by solving real problems in real organisations," he said. "These models will be simple, and automation will enable even naive users to develop hundreds if not thousands of predictive models within hours – something that, today, takes experts entire months."

Data sourced from Harvard Business Review; additional content by Warc staff