Introduction: Text analysis in the past

The drive for efficiency and automation in market research, coupled with the costly and time-consuming nature of manual coding, makes automated analysis of open-ended text response data an obvious candidate for attention. Indeed, this has long been the case: as Macer and Wilson (2017) and Raud and Fallig (1993) describe, papers on automatic text coding were published as early as the 1990s, and one of the authors recalls unpublished experiments carried out in the early 1980s.

Automated text analysis offers benefits beyond cost savings and reduced lead times. Done successfully, it allows direct and in-depth access to participants' views, expressed in their own words and without the intervention of an interviewer or a coder. Anderson (2014) argues that it eliminates human error and human variability and can be used to create models that are easier to update over time than those a manual approach to coding generally yields.

The problems for the early models were the limited computing power available (they often crashed), the lack of large data sources, and the sheer scale of the challenge in interpreting human language. Clearly, the first two are no longer barriers. The extent to which automated approaches can cope with the nuances of language remains a controversial topic, however, and one that we will focus on later in this article.

Text analysis today