Sampling v. scale in the internet age: An investigation of the tension between convenience sampling, response rates, probability, and coverage
Since the inception of rigorous survey work, time and budget constraints have forced researchers to rely on a variety of sampling methods to estimate population parameters. Whatever its pitfalls, sampling is faster and cheaper than interviewing an entire population. For the laws of sampling to work, however, the sampling frame must (a) cover nearly 100% of the population of interest and (b) give each member an equal, or at least known and nonzero, chance of selection, and (c) the survey must achieve a reasonable response rate. Convenience samples, most famously the Literary Digest poll that produced a dramatically erroneous prediction for the 1936 election, have long been dismissed as inadequate for serious research. These days, though, collecting opinions from a probability sample takes longer than interviewing a convenience sample of a couple million people. Whether such interviews can indicate the opinions of a population, as opposed to representing them in a probabilistic sense, depends on whether the source of the interviews is sufficiently similar to the population on a number of key demographics.
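One simple way to make "sufficiently similar on key demographics" concrete is to compare the sample's share in each demographic category with the population's share. The sketch below uses the index of dissimilarity (half the sum of absolute differences in shares) as one possible summary. The function name `dissimilarity_index` and the age-group figures are hypothetical, invented purely for illustration. They are not data from any survey discussed here, and a small gap on one demographic does not by itself establish that a sample is representative.

```python
def dissimilarity_index(sample_shares, population_shares):
    """Half the sum of absolute differences between two share distributions.

    Returns 0.0 when the distributions are identical and 1.0 when they
    have no overlap. Categories missing from one side count as share 0.
    """
    categories = set(sample_shares) | set(population_shares)
    return 0.5 * sum(
        abs(sample_shares.get(c, 0.0) - population_shares.get(c, 0.0))
        for c in categories
    )


# Hypothetical age-group shares (illustrative assumption, not real data).
population = {"18-34": 0.30, "35-54": 0.35, "55+": 0.35}
sample = {"18-34": 0.45, "35-54": 0.35, "55+": 0.20}

gap = dissimilarity_index(sample, population)
print(f"dissimilarity on age: {gap:.2f}")
```

The same comparison could be repeated for each demographic of interest (for example sex, region, or education), with large gaps flagging dimensions on which the convenience sample departs from the population.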