Polling, weighting and other voodoo

Last week, Mainstreet did something it had never done before.

It released to the public and the media the results of a survey we conducted. There was nothing unique about how the poll was released: media posted the story at 11pm, and our report and details were added to the story and posted to our website at 11am the next day.

What was unique about this poll was the sample size for a publicly released poll: an average of over 600 responses per riding, across 18 ridings on the island of Montreal. Large samples are nothing new for Mainstreet – our approach is to collect larger samples for our clients so we can be more confident in our numbers. We are constantly surprised by polls quoted in the media with minuscule sample sizes, or polls that rely on online samples. This post will attempt to explain why we believe our approach yields better results and why we are more confident in our numbers than in those of other firms.

Online gaming and online polling similarities

My 13-year-old niece likes to play online games – she is very good at them. A few weeks ago at her house, I watched her take out an entire platoon of American soldiers in some battle game whose title I don't know. Two things struck me as I watched with some horror: the first was that her online gaming profile said she was a 22-year-old man named Claude; the second was that, in this game, the Russian army was invading the mainland USA during the WWII era. Technology and simulation amaze me daily, and I have to remind myself that it's not reality.

Most people I know maintain multiple email addresses for various reasons, and many have multiple profiles on Facebook and other social media sites for reasons I don't care to speculate on. The online world is full of fake profiles, scams, viruses, spyware and other hazards that make it more dangerous in many ways than the real world.

In this world of fake and multiple profiles, a 13-year-old girl posing as a 22-year-old Russian soldier can invade the mainland USA during WWII. In this same world, Adrian Dix is the Premier of British Columbia and Danielle Smith is the Premier of Alberta. We should remind ourselves that this is not reality. Online surveys are susceptible to the dangers of online spoofing and can never capture truly random samples; it's one of the reasons the New York Times won't publish internet polls.

Size Matters

In the days since we released our poll, a number of people have criticized our methods. Not surprisingly, most of these critics also happen to own and operate polling and market research companies.

Our approach is different from most, and I am happy to share and defend it here as clearly and simply as possible.

The trend we've noticed over the last few years has been toward ever-smaller sample sizes, often 200-300 responses per riding, and in many cases these have yielded catastrophically poor projections. Even in city-wide or province-wide polling, the response counts appear low. Our approach is always toward a larger sample size, which makes us more confident in our numbers; I will try to explain why below with an example.
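To make the sample-size point concrete, here is a minimal sketch of the standard margin-of-error calculation for a simple random sample. The sample sizes are illustrative, not drawn from any particular Mainstreet poll, and the formula assumes the worst case of a 50/50 split:

```python
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """95% margin of error for a proportion p with n respondents."""
    return z * math.sqrt(p * (1 - p) / n)

# Doubling a per-riding sample from 300 to 600 tightens the margin
# of error from roughly +/-5.7 points to +/-4.0 points.
for n in (300, 600, 1000):
    print(f"n={n}: +/-{100 * margin_of_error(n):.1f} points")
```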

First, it's important to explain how weighting works. Say you have a poll with one thousand respondents. For the purpose of this example, let's assume we ask the people responding to the poll their ethnicity, and that you have 700 Caucasians responding, with 150 Filipinos, 100 South Asians, and 50 Caribbeans. The latest census information in this example shows that the racial breakdown is 71.6% White, 12.3% Caribbean, 12.5% South Asian, and 3.6% Filipino. To match these demographics, the data would be weighted in the following manner (a short worked example follows the list):

The results from Caucasian respondents would be divided by 70.0 then multiplied by 71.6;

The results from Caribbean respondents would be divided by 5.0 then multiplied by 12.3;

The results from South Asian respondents would be divided by 10.0 then multiplied by 12.5; and

The results from Filipino respondents would be divided by 15.0 then multiplied by 3.6.
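Here is a minimal sketch of that calculation in Python, using the hypothetical figures from the example above (they are not real census or poll data):

```python
# Post-stratification weighting by one demographic (ethnicity),
# using the hypothetical figures from the example above.

sample_counts = {        # raw responses per group (n = 1,000 total)
    "Caucasian": 700,
    "Filipino": 150,
    "South Asian": 100,
    "Caribbean": 50,
}
census_pct = {           # census population shares, in percent
    "Caucasian": 71.6,
    "Caribbean": 12.3,
    "South Asian": 12.5,
    "Filipino": 3.6,
}

total = sum(sample_counts.values())

weights = {}
for group, count in sample_counts.items():
    sample_pct = 100.0 * count / total              # divide by the sample share...
    weights[group] = census_pct[group] / sample_pct  # ...multiply by the census share

for group, w in sorted(weights.items()):
    print(f"{group}: weight = {w:.2f}")
# Caribbean: 2.46, Caucasian: 1.02, Filipino: 0.24, South Asian: 1.25
# Underrepresented groups are scaled up; overrepresented groups down.
```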

In the case of many market research and polling companies, the smallest possible samples are surveyed to achieve what they feel is a reasonable margin of error. We don't dispute the statistical accuracy of these polls or how sound the math is. The problem is that the numbers fail to match election results in nearly every case, so there is obviously a disconnect between the math and how it translates to real-world outcomes. In our opinion, a number of factors lead to this:

1) Census data is often incorrect and/or outdated

2) Voter turnout doesn't correlate neatly with demographics

3) Shadow populations are increasing across most urban centres

4) Transitional populations are increasing as economies and jobs shift

In response to these challenges and others, many polling firms have tweaked, and continue to tweak, their weighting methods; Mainstreet is no different. However, we also believe in increasing our sample size. There is a lot of anecdotal evidence from recent elections that all the math, weighting and tweaking amounts to pretty bad guesses. Put simply, the guessers are really, really smart, and putting money into hiring smarter and better guessers is a great idea. Mainstreet also prefers to put money into surveying more real people; despite the increased cost and time, this leads to results that are more reflective of ballot results. I won't win any arguments with my competitors over whether this approach is "better" – mathematics tells us those polls and their methodology are perfect. The results of the last few elections, however, tell us that as perfect as these polls are mathematically, they do not translate well to ballot results, which in our opinion is the point.
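One standard way to quantify the cost of heavy weighting on a small sample is Kish's approximation for effective sample size. The weights below are invented for illustration, not figures from any real poll; the sketch only shows why a heavily reweighted small sample carries less information than its nominal size suggests:

```python
def effective_sample_size(weights: list[float]) -> float:
    """Kish approximation: n_eff = (sum of weights)^2 / sum of squared weights."""
    return sum(weights) ** 2 / sum(w * w for w in weights)

# A lightly weighted sample of 600 keeps nearly all of its power...
light = [1.0] * 500 + [1.2] * 100
# ...while a heavily weighted sample of 300 loses roughly a third of it.
heavy = [0.5] * 150 + [2.5] * 150

print(round(effective_sample_size(light)))  # ~597
print(round(effective_sample_size(heavy)))  # ~208
```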

To be perfectly clear, we don't blame the statisticians or the math; it's the data and assumptions that are flawed. We just don't believe in burying our heads in the sand, ignoring these realities and sticking to old methods. We understand that change is scary, and we don't expect everyone to understand or like our approach.

Why I blame Obama and the media

The media have brought a lot of attention to the election wins of President Barack Obama in 2008 and 2012. Social media, digital strategies and big data have dominated the headlines. What the media haven't reported as much is the huge organizational effort behind those wins: the teams of volunteers on the ground and the staff who directed them. Big data, social media, micro-targeting and the like are all very useful tools, but they are not the only tools. Without a ground game and organization, there is no human context to the data.

This singular focus on data and social media analysis has led to the recent trend in polling: online surveys and smaller sample sizes, weighted for ever more demographic factors, tweaked and re-tweaked, but no closer to real-world outcomes.

Why clients choose Mainstreet

Mainstreet prefers to do the work and spend the time and money to reach more real people. That is why we insist on bigger sample sizes and why we don't use online sources. We use IVR, live-agent calling, and combinations of automated and live contacts to reach people. We won't criticize competitors who choose other methods – we know their methods are equally sound mathematically – but we are proud to stand behind the results of our polls and our accuracy.

Quito Maggi is the president of Mainstreet Research.