Reducing Undecided Voters and Other Sources of Error in Election Surveys
Kevin J. Flannelly
Marketing Research Institute, Honolulu, Hawaii
Laura T. Flannelly
University of Hawaii, Honolulu, Hawaii
Malcolm S. McLeod, Jr.
Center for Psychosocial Research, Honolulu, Hawaii
The present study found the number of undecided voters on forced-choice questions about candidate preferences was roughly three times higher than that on subjective probability questions, and that election predictions based on traditional forced-choice scales had a higher degree of error than predictions based on subjective probability scales. The findings show that subjective probability scales can introduce error when there are more than two candidates or parties in an election, but this can be easily corrected by the procedure for adjusting subjective probability scores that was used by Hoek & Gendall (1993). While the use of adjusted probability scores improved the accuracy of predictions in multi-candidate races, no difference was found in the accuracy of predictions based on adjusted and unadjusted probabilities in elections with only two candidates.
INTRODUCTION
Recent research has examined the use of subjective probability scales in political polling (Flannelly et al. 1998, 1999, 2000; Hoek & Gendall 1993). Instead of asking people for whom they would vote if the election was held today (Crespi 1988; Gelman & King 1993; Erikson & Sigelman 1995), subjective probability scales can be used to ask the question, 'Who are you likely to vote for on election day?' Casting the question in terms of future intentions allows pollsters, politicians and political scientists to come much closer to asking the real question they want answered: 'Who will you vote for on election day?' It also overcomes the objection that pre-election polls are merely snapshots in time that should not be expected to predict the future (Worcester 1992).
The reason most political polls ask 'Who would you vote for if the election was held today?' instead of 'Who will you vote for on election day?' probably stems from the concern that many voters would answer the latter question by saying they were undecided, unless the poll was conducted very close to the day of the election. A glimpse of the magnitude of the problem is provided by Gelman & King's (1993) analysis of a poll that was conducted 15 weeks before the 1988 US Presidential election. When asked the forced-choice question, 'Which presidential candidate will you definitely vote for in this year's election?' roughly 50-70% of the people surveyed said they were undecided, depending on their demographic characteristics. In some of our own research, we have observed an average undecided rate of 23.6% on the traditional forced-choice question, 'If the election was held today, who would you vote for?' When voters in the same surveys were asked 'On a scale of 1 to 10, how likely are you to vote for each candidate on election day?' we observed an average undecided rate of only 8.1% (Flannelly et al. 2000).
Taken together, these results suggest that subjective probability scales can reduce the undecided rate in election surveys. Furthermore, since polling organisations usually make an effort to estimate the likely voting behaviour of people who do not say for whom they would vote (Hoek & Gendall 1993; Voss et al. 1995; Curtice 1997b), we share Hoek & Gendall's belief that subjective probabilities provide both a simpler and a more direct approach to measuring voters' likely behaviour on election day.
Recent research also indicates that people use subjective probabilities more accurately when rating the probability of voting for two candidates than they do when rating three or more candidates (Flannelly et al. 1999). Generally, when people are asked to assign subjective probabilities to the likely occurrence of two alternatives, the assigned probabilities tend to complement each other, with the two probabilities summing close to 100% (Tversky & Koehler 1994; Flannelly et al. 1999, 2000). When people assign subjective probabilities to three or more alternatives (or candidates), however, they tend to overstate the probability of each alternative, and the sum of the probabilities exceeds 100% (Flannelly et al. 1999). Hoek & Gendall (1993) encountered this problem when asking voters to rate their probability of voting for each of several political parties. To correct the problem, they adjusted voters' probabilities of voting for each party by dividing each probability by the sum of the probabilities of voting for all the parties.
The present study compared:

the accuracy of predicting the winner in 2-candidate and multi-candidate races using adjusted and unadjusted subjective probabilities;

the accuracy of election predictions based on voters' answers to subjective probability and forced-choice questions about voter intentions; and

the difference in undecided responses to subjective probability and forced-choice questions.
METHOD
Data were analysed from 23 pre-election surveys conducted by telephone of random samples of registered voters. The sample size of the surveys ranged from 150 to 575 voters. The time interval between the surveys and their respective elections ranged from less than one week to almost six months, with roughly half of the surveys conducted within three weeks of the respective election. Thirteen of the elections were 2-candidate races while ten had 3 or 4 candidates.
The questionnaires used in all 23 surveys included an item that asked voters who they were likely to vote for on election day. The typical wording was: 'On a scale of zero to ten, how likely are you to vote for each of the following candidates on election day?' Voters were instructed to: 'Say 10 if you are certain you will vote for them. Say zero if you are certain you will not vote for them. Or choose a number in between.' Nine of the 23 surveys also included the traditional voting intention question: 'If the election was held today, who would you vote for?' Details about the question procedures are provided by Flannelly et al. (1998).
Three dependent variables were directly measured:

frequency counts for each candidate on forced-choice questions;

ratings for each candidate on the likelihood (subjective probability) questions; and

number of voters who failed to answer the forced-choice or probability questions (i.e. undecided voters).
The number of undecided voters was converted to a percentage by dividing it by the total number of people in the survey. Frequency counts for each candidate on the forced-choice question were also converted to percentages, omitting undecided voters from the percentage calculations.
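The two conversions described above can be illustrated with a minimal Python sketch. The function name, the use of `None` to mark an undecided voter, and the sample responses are all hypothetical and are not taken from the surveys.

```python
from collections import Counter


def forced_choice_percentages(responses):
    """Convert forced-choice answers to percentages.

    `responses` holds one entry per respondent: a candidate name, or
    None for an undecided voter (a hypothetical encoding).
    """
    total = len(responses)
    decided = [r for r in responses if r is not None]

    # Undecided rate uses the whole sample as its base.
    undecided_pct = 100.0 * (total - len(decided)) / total

    # Candidate shares omit undecided voters from the base.
    counts = Counter(decided)
    candidate_pct = {}
    if decided:
        candidate_pct = {
            name: 100.0 * n / len(decided) for name, n in counts.items()
        }
    return undecided_pct, candidate_pct


# Invented responses for ten respondents, two of them undecided.
responses = ["A", "B", "A", None, "A", "B", None, "A", "B", "A"]
undecided_pct, candidate_pct = forced_choice_percentages(responses)
print(undecided_pct, candidate_pct)
```

Note the two different bases: the undecided rate is a share of everyone surveyed, while the candidate shares are taken over decided voters only.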
Subjects' likelihood ratings were handled in two ways to convert them to adjusted and unadjusted probabilities of voting for each candidate. To obtain an adjusted probability of voting for a candidate, the likelihood rating for that candidate was divided by the sum of the likelihood ratings for all the candidates. This value was then multiplied by 100 to yield a percentage, the adjusted probability of voting for the candidate. The unadjusted probability of voting for each candidate was obtained simply by multiplying the likelihood rating by 10 to convert it to a 0-100% scale, comparable to the percentages in the adjusted probability and the traditional forced-choice measures.
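For a single respondent, the two conversions might be sketched as follows in Python. The function names and the example ratings are hypothetical. How a respondent who rated every candidate zero was treated is not stated in the text, so raising an error in that case is an assumption.

```python
def unadjusted_probabilities(ratings):
    """Rescale 0-10 likelihood ratings to a 0-100% scale."""
    return {cand: 10.0 * r for cand, r in ratings.items()}


def adjusted_probabilities(ratings):
    """Divide each rating by the sum of all ratings, times 100."""
    total = sum(ratings.values())
    if total == 0:
        # Assumption: the source does not say how all-zero ratings
        # were handled, so this sketch refuses to adjust them.
        raise ValueError("ratings sum to zero; cannot adjust")
    return {cand: 100.0 * r / total for cand, r in ratings.items()}


# One invented respondent rating three candidates.
ratings = {"A": 8, "B": 5, "C": 3}
unadj = unadjusted_probabilities(ratings)
adj = adjusted_probabilities(ratings)
print(unadj, sum(unadj.values()))
print(adj, sum(adj.values()))
```

In this invented case the unadjusted percentages sum to well over 100%, whereas the adjusted percentages sum to 100% by construction.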
Predictions of the election results were based on:

the mean, or average, subjective probability (adjusted and unadjusted) of voting for each candidate, omitting undecided voters; and

the percentage of voters who chose each candidate on the forced-choice questions, omitting undecided voters.
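As an illustration of the first kind of prediction, the sketch below averages per-voter probabilities across decided respondents. It is only one possible reading of the procedure: the function name, the `None` encoding for undecided voters, the decision to skip voters whose ratings sum to zero, and the sample data are all assumptions.

```python
from statistics import mean


def predicted_shares(voters, adjust=True):
    """Mean subjective probability per candidate, as a percentage.

    `voters` holds one dict of 0-10 ratings per respondent, or None
    for an undecided respondent (hypothetical encoding).
    """
    decided = [v for v in voters if v is not None]
    candidates = list(decided[0].keys())
    per_voter = []
    for ratings in decided:
        if adjust:
            total = sum(ratings.values())
            if total == 0:
                continue  # assumption: all-zero raters are skipped
            per_voter.append(
                {c: 100.0 * ratings[c] / total for c in candidates}
            )
        else:
            per_voter.append({c: 10.0 * ratings[c] for c in candidates})
    return {c: mean(p[c] for p in per_voter) for c in candidates}


# Four invented respondents in a 3-candidate race, one undecided.
voters = [
    {"A": 8, "B": 4, "C": 4},
    {"A": 5, "B": 5, "C": 0},
    None,
    {"A": 2, "B": 6, "C": 2},
]
adjusted = predicted_shares(voters, adjust=True)
unadjusted = predicted_shares(voters, adjust=False)
print(adjusted)
print(unadjusted)
```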
For simplicity, we examined the accuracy of election predictions only for the candidate who actually won the election. Two measures of the accuracy of the predictions were calculated: mean error and mean absolute error. Mean error was calculated by subtracting the percentage of votes received in each election from the percentage of votes predicted by its respective survey, summing these differences, and dividing by the number of surveys included in the analysis, so that a positive value indicates an overestimate. The calculation of mean absolute error was identical, except it ignored the sign of the difference between the actual and the predicted results (Hoek & Gendall 1993; Erikson & Sigelman 1995). The mean absolute error is the more meaningful of the two measures, since it shows the magnitude of prediction errors regardless of their sign (Hoek & Gendall 1993; Erikson & Sigelman 1995). All data were analysed by analysis of variance and/or t-tests.
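The two accuracy measures can be sketched in a few lines of Python. The sign convention used here (predicted minus actual, so that a positive value is an overestimate) is an assumption, chosen to match the way overestimates are reported in the results. The function name and the figures are invented for illustration.

```python
def prediction_errors(predicted, actual):
    """Return (mean error, mean absolute error) in percentage points.

    `predicted` and `actual` are parallel lists holding, for each
    survey, the winner's predicted and actual vote percentages.
    """
    diffs = [p - a for p, a in zip(predicted, actual)]
    n = len(diffs)
    mean_error = sum(diffs) / n
    mean_abs_error = sum(abs(d) for d in diffs) / n
    return mean_error, mean_abs_error


# Invented figures for four surveys (not data from the study).
predicted = [52.0, 47.0, 60.0, 55.0]
actual = [50.0, 51.0, 57.0, 56.0]
mean_error, mean_abs_error = prediction_errors(predicted, actual)
print(mean_error, mean_abs_error)
```

In this invented case the over- and underestimates cancel, so the mean error is zero while the mean absolute error is 2.5 points, which shows why the absolute measure is the more informative of the two.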
RESULTS
Adjusted versus unadjusted probabilities
Table 1 presents the mean error and the mean absolute error of election predictions based on adjusted and unadjusted subjective probabilities of voting for candidates. Although no significant differences were found in the accuracy of predictions based on adjusted and unadjusted probabilities in 2-candidate races, adjusted probability scores produced significantly superior predictive accuracy for both measures in elections with 3 or 4 candidates, compared to unadjusted scores. In elections with 3 or 4 candidates, the unadjusted probabilities overestimated the votes for the winner by 3.0% (P < 0.001) and yielded a mean absolute error of 6.8% (P < 0.002), both significantly worse than the corresponding values for the adjusted probabilities.
TABLE 1: DIFFERENCE BETWEEN ACTUAL ELECTION RESULTS AND PREDICTIONS BASED ON ADJUSTED AND UNADJUSTED SUBJECTIVE PROBABILITIES OF VOTING FOR THE CANDIDATE WHO WON THE ELECTION
Measure              Number of candidates  Adjusted probabilities  Unadjusted probabilities
Mean error           2                     1.9                     2.6
Mean error           3 or 4                4.0                     +3.0
Mean absolute error  2                     4.9                     4.9
Mean absolute error  3 or 4                5.0                     6.8
Forced-choice versus probable choice
Predictions based on forced-choice questions yielded a mean absolute error of 7.4% in the nine surveys that asked both forced-choice and subjective probability questions. This percentage was significantly higher than the mean absolute errors produced by adjusted (4.1%) or unadjusted (2.4%) subjective probabilities in the same nine surveys (P < 0.05). No significant differences were found between the mean absolute errors produced by adjusted or unadjusted probabilities, and no significant effects were found with respect to the mean prediction errors resulting from the three types of scores.
Reducing undecided voters
The percentage of undecided voters on the forced-choice question in the nine surveys ranged from 17% to 32%, for a mean of 29%. In the same nine surveys, the percentage of undecided voters on the subjective probability question ranged from 4% to 21%, for a mean of 11%. Overall, then, voters were nearly three times as likely to say they were undecided on forced-choice questions as on subjective probability questions (P < 0.001).
DISCUSSION
The present results confirm those of earlier studies, indicating that:

voters' subjective probabilities of voting for different candidates produce accurate election predictions (Hoek & Gendall 1993; Flannelly et al. 1998, 2000); and

predictions based on subjective probabilities are more accurate than those based on responses to forced-choice questions about candidate preference (Flannelly et al. 1998, 2000).
The current findings also confirm results from previous studies indicating that voters are more likely to answer a probability question than they are to answer the traditional forced-choice question about their intention to vote for different candidates or parties (Hoek & Gendall 1993; Flannelly et al. 1998, 2000). In the present study, the number of undecided voters on the probability questions was roughly a third of that on traditional forced-choice questions.
Studies have reported that a poll's predictive accuracy is inversely related to the proportion of undecided voters (Crespi 1988; Lau 1994). Undecided voters can contribute to two sources of error in election predictions. First, they reduce the sample size on which sampling error is calculated, and second, they can bias predictions when undecided voters are not equally distributed among the supporters of all candidates or parties (Crewe 1993; Lau 1994; Curtice 1997a). Therefore, reducing the number of undecided voters decreases two potential sources of prediction error.
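The first of these effects can be illustrated with the standard margin-of-error formula for a proportion under simple random sampling. The sketch below is not part of the study's analysis: the survey size of 400 is hypothetical, the 95% confidence level and the worst-case proportion of 0.5 are conventional assumptions, and only the two undecided rates (29% and 11%) are the means reported above.

```python
import math


def decided_sample(n_total, undecided_pct):
    """Number of respondents left after dropping undecided voters."""
    return round(n_total * (1.0 - undecided_pct / 100.0))


def margin_of_error(n, p=0.5, z=1.96):
    """Margin of error, in percentage points, for a proportion.

    Assumes simple random sampling; p=0.5 is the worst case and
    z=1.96 corresponds to a 95% confidence level.
    """
    return 100.0 * z * math.sqrt(p * (1.0 - p) / n)


n_total = 400  # hypothetical survey size
n_forced = decided_sample(n_total, 29)  # mean forced-choice rate
n_prob = decided_sample(n_total, 11)    # mean probability-scale rate

moe_forced = margin_of_error(n_forced)
moe_prob = margin_of_error(n_prob)
print(n_forced, round(moe_forced, 2))
print(n_prob, round(moe_prob, 2))
```

Under these assumptions the lower undecided rate leaves a larger usable sample and therefore a somewhat smaller margin of error. The sketch says nothing about the second source of error, bias from unevenly distributed undecided voters.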
Like Hoek & Gendall (1993), we found that voters tended to assign probabilities to each candidate that summed to more than 100% when there were three or more candidates in an election. This poses another source of error when making election predictions unless something is done to adjust voters' subjective probability scores. Our analysis of 23 election surveys confirms the value of Hoek & Gendall's (1993) adjustment method for this purpose. Though we found no difference between predictions based on adjusted and unadjusted probability scores in 2-candidate elections, unadjusted scores produced significantly higher prediction errors in elections with 3 or 4 candidates. Since parliamentary races often involve several political parties, predictions of such elections would typically require the kind of adjustment that Hoek & Gendall (1993) made to their subjective probability scores.
We have found that subjective probability scales are particularly useful and convenient for conducting telephone polls. Given the move towards using telephone polls in Britain (Curtice 1997b), we expect that British survey organisations would find subjective probability scales to be equally valuable and convenient for conducting political polls. Based upon the findings of past studies that have used probability scales in market research (Pickering & Isherwood 1974; Gendall et al. 1991), we believe such scales can be adapted widely for political and opinion polling.
REFERENCES
Crespi, I. (1988) Pre-election Polling: Sources of Accuracy and Error. New York: Russell Sage.
Crewe, I. (1993) A nation of liars? Opinion polls and the 1992 election. Journal of the Market Research Society, 35, 4, pp. 341-359.
Curtice, J. (1997a) Are the opinion polls ready for 1997? Journal of the Market Research Society, 39, 2, pp. 317-330.
Curtice, J. (1997b) So how well did they do? The polls in the 1997 election. Journal of the Market Research Society, 39, 3, pp. 449-461.
Erikson, R.S. & Sigelman, L. (1995) Poll-based forecasts of midterm congressional election outcomes: Do the pollsters get it right? Public Opinion Quarterly, 59, pp. 589-605.
Flannelly, K.J., Flannelly, L.T. & McLeod, M.S., Jr. (1998) Comparison of election predictions, voter preference and candidate choice on political polls. Journal of the Market Research Society, 40, 4, pp. 337-346.
Flannelly, L.T., Flannelly, K.J. & McLeod, M.S., Jr. (1999) Judgment certainty using forced-choice and subjective probability scales. Paper presented at the annual meeting of The Psychonomic Society, Los Angeles, CA, 20 November 1999.
Flannelly, L.T., Flannelly, K.J. & McLeod, M.S., Jr. (2000) A comparison of forced-choice and subjective probability scales measuring behavioral intentions. Psychological Reports, 86, pp. 321-332.
Gelman, A. & King, G. (1993) Why are American presidential election campaign polls so variable when votes are so predictable? British Journal of Political Science, 24, pp. 409-451.
Gendall, P.J., Esslemont, D. & Day, D. (1991) A comparison of two versions of the Juster Scale using self-completion questionnaires. Journal of the Market Research Society, 33, 3, pp. 257-263.
Hoek, J.A. & Gendall, P.J. (1993) A new method of predicting voting behaviour. Journal of the Market Research Society, 35, 4, pp. 361-373.
Lau, R.R. (1994) An analysis of the accuracy of trial heat polls during the 1992 presidential elections. Public Opinion Quarterly, 59, pp. 589-605.
Pickering, J.F. & Isherwood, B.C. (1974) Purchase probabilities and consumer durable buying behaviour. Journal of the Market Research Society, 16, 3, pp. 203-226.
Tversky, A. & Koehler, D.J. (1994) Support theory: a nonextensional representation of subjective probability. Psychological Review, 101, 4, pp. 547-567.
Voss, D.S., Gelman, A. & King, G. (1995) Pre-election survey methodology: details from eight polling organizations, 1988 and 1992. Public Opinion Quarterly, 59, pp. 98-132.
Worcester, R.M. (1992) The performance of the political opinion polls in the 1992 British general election. Marketing and Research Today, November, pp. 256-262.