Tuesday, October 23, 2012

Is my Methodology Valid?

Last night I had a brief Twitter discussion with Ace regarding whether what I am doing has any validity.  To be fair, he is being reasonably skeptical of whether poll rebalancing is viable, or can even be done.  Do we have enough information to reach valuable conclusions?  Here is his final tweet from last night:

I believe most polls are tilted too Democrat.  What I doubt is that it is possible to post-facto rebalance missing data.  - Ace

Let me make my argument for why I believe this is a valid methodology.  To be sure, we won't know if I am right until the election; that is when we will know the actual vote totals and the partisan split of the electorate.  However, I think we have some data indicating that this methodology might be valid.

First of all, my basic belief is that elections are based on two factors only.
  1. How many partisans of each side can the parties get to the polls?
  2. What is the opinion of the non-partisans regarding the two candidates?
Democrats vote for Democrats or don't vote.  Republicans vote for Republicans or don't vote.  Independents will vote for one or the other candidate, based on the persuasiveness of the arguments and the facts on the ground (e.g. I hate the war, the economy sucks, I want my free stuff).

I fundamentally do not believe that crosstabs like Democrats voting for Romney, or whites supporting Obama, are of any practical value in predicting voter behavior.  People self-identify either as partisan (which does not mean they are necessarily registered in their preferred party), or as non-partisan and willing to change their vote election by election.

Partisans move from one party to another slowly, and once they move it is a personal decision based on core philosophy.  They will not change their minds based on the candidate and vote for the other team.  But they might choose not to vote, which is reflected by enthusiasm.

Enthusiasm ultimately reveals itself through turnout.  If Republicans are more enthusiastic than Democrats, they will show up at the polls in higher numbers.  This was demonstrated in 2008.  As I've mentioned before, most people misunderstand the 2008 election.  They think 7% more Democrats showed up to vote for Obama.  That isn't true.  Democrats got their base to enthusiastically show up at a 2% higher rate than in 2004 (which was also a good turnout year for them).  The real difference is that 5% of Republicans didn't bother to vote, because McCain didn't enthuse them and did not run an effective GOTV operation.  The final factor is that non-partisans supported Obama overwhelmingly.

The other day, Rush discussed his belief that people don't change their minds as quickly as the polls seem to indicate.  He believes that core voting decisions are established early, and that voters do not swing wildly between the candidates.  I think he is right on this.  The poll fluctuations we are seeing are based on sampling variations, and on the fact that RCP does not correct for these fluctuations and skews.  They are comparing apples to oranges, and trying to tell us that we really were expecting a fruit salad.

I also have a problem with RCP in that it can be easily "gamed".  A few polls that are purposely biased can skew the average and show momentum for one of the candidates.  We have proof that the Obama campaign is purposely disabling credit card validation in their online fundraising operations.  Why is it so hard to believe that some polls are being purposely manipulated to affect the RCP average?

My argument to Ace is that the poll internals do provide all of the information needed to get a good view of what the election results will be, given a specific turnout model.  They provide the partisan split in the votes, and they (usually) provide the non-partisan preference between the candidates.  Using this information, it is possible to adjust the poll results so that every poll in the average is compared against the same turnout baseline.  We are comparing apples to apples, and getting applesauce.
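To make the mechanics concrete, here is a minimal sketch in Python of the kind of reweighting described above. Every number is invented for illustration; none of them come from a real poll or from the models on this site.

```python
# Minimal sketch of rebalancing a poll topline to a chosen turnout
# model.  Assumes the poll reports candidate support broken out by
# party ID.  All numbers below are invented for illustration.

def rebalance(support_by_party, turnout_model):
    """Recompute a poll's topline under a different partisan split.

    support_by_party: {party: (share_for_A, share_for_B)}
    turnout_model:    {party: share_of_electorate}, summing to 1.0
    """
    a = sum(turnout_model[p] * support_by_party[p][0] for p in turnout_model)
    b = sum(turnout_model[p] * support_by_party[p][1] for p in turnout_model)
    return a, b

# Illustrative internals: partisans vote their party 95/5,
# independents split 55/45 for candidate A.
internals = {"R": (0.95, 0.05), "D": (0.05, 0.95), "I": (0.55, 0.45)}

d_plus_9   = {"R": 0.29, "D": 0.38, "I": 0.33}  # a D+9 sample
even_split = {"R": 0.35, "D": 0.35, "I": 0.30}  # a 2004-style even split

print(rebalance(internals, d_plus_9))    # A trails under D+9
print(rebalance(internals, even_split))  # A leads under an even split
```

The same internals produce opposite leaders under the two turnout assumptions, which is the whole point: the toplines differ only because the assumed electorate differs.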

Consider the following chart:

This chart tracks the RCP average, the Rasmussen daily tracking poll, and my 2010-turnout average from September 26th to October 23rd.  Notice a couple of trends.  From the 26th to the 1st, Romney was gradually sliding, until the first debate.  After the debate, his support began to move up, and then on October 13th we hit a very stable spot in the race.  In the 2010 turnout model, the Romney lead stayed essentially constant for 9 days.  During that period, there were no real changes in the state of the race, no gaffes, nothing that was really changing the momentum.  Yet over that same period, Rasmussen stayed pretty even, bouncing between R+1 and R+2, while RCP began to drop toward an Obama lead.

My point is that there was no reason for this drop, other than the introduction of new polls with widely divergent samples.  The reweighted polls showed a stability in the race that reflected our general sense of where the race stood during that time.  The RCP average did not.

Now I am not saying that the election will resemble 2004 in turnout.  I am only saying that rebalancing to potential turnout scenarios gives a better picture of the state of the race over time.  This is why I'm offering 5 different turnout models.  Voter enthusiasm and GOTV operations will determine which of the models ends up being the valid view of the election.

What my models do provide is a good view of what results we can expect if the electorate resembles a specific model.  Since we know that 2008-level turnout is highly unlikely, we can then say "If we see a D+3, then the polls say this" and "If the GOP repeats the 2004 turnout, then the polls say that".  I find this more useful than looking at a D+9 poll that says Obama is ahead by 5, calling it BS, and ignoring it.  It also helps in places like Ohio, where we get a series of 7 out of 10 polls that sample Democrats between D+6 and D+11.  We are able to get value out of those polls anyway, rather than ignoring them and getting a badly skewed view of what is really happening in Ohio.


  1. Dave,

    I think your method is very logical, and the proof is that when you analyze the different polls from different firms, they all seem to "say" the same things; the variation comes almost purely from which turnout model you're using.

    I don't think it's an "exact" formula (i.e., more Democrat voters turning out doesn't necessarily mean more Obama votes, though it more than likely does), but I think it gives a MUCH more accurate picture of the race than ignoring this factor and just accepting whatever absurd model the pollster is pushing. If we accept that Democrat turnout is greater than in 2008, at something like +9 over Republicans, the campaign and candidates essentially become irrelevant, because it would be nearly impossible to overcome that makeup of the electorate.

    The real polling "bombshell" that no news outlets seem to be talking about is the fact that Romney is absolutely dominating when it comes to voters that identify themselves as independents. To me, that's the biggest sign that Obama is likely finished.

    1. "The real polling "bombshell" that no news outlets seem to be talking about is the fact that Romney is absolutely dominating when it comes to voters that identify themselves as independents. To me, that's the biggest sign that Obama is likely finished."

      Yeah. +9 in PPP, +19 in Monmouth, +11 in IBD/TIPP. Strangely -4 in CBS but that's another reason for thinking there's something very wrong with that poll.

  2. Excellent analysis, Dave. I think you are on to something; Jay Cost parlayed election analysis into a writing gig in 2004, so maybe you will be the next election day star.

  3. Dave:
    Your analysis, and especially the way you lay out all the turnout scenarios, is very logical.

    Any poll that says Romney is winning independents and then says that Obama is leading does not pass the common sense test.

    RCP and Nate, IMO, are analyzing the wrong inputs. You can do all kinds of analysis on data, but as they say: Garbage In, Garbage Out. No matter how sophisticated one's analytical methods and historical performance, it has to pass the common sense test.

    Thanks for diligently doing this work. Very useful.

  4. I agree -- seems to me the race was never really shifting as much as poll top-lines might have suggested, and is hugely determined by the deeper, slower-moving factors of turnout+indies you point to.

    It seems to me that what media cycle shifts *may* produce pretty quickly are shifts in the *poll* turnout, which could be why (e.g.) we suddenly started to see Rs represented in Gallup. With response rates in the single digits, just a fraction of GOPers deciding that it's safe to register one's preference to the hostile media can probably turn these numbers from nonsense into sense.

    Another point in your favor is that Ras uses this exact sort of turnout normalization (though to a *single* fudged number)... and has good predictive success with it. You're just putting all the other pollsters on the same field with Ras.

  5. I want to believe...

    But I just don't understand how we can get from "poll toplines show X" to "poll toplines are actually Y" without any historical evidence that RCP's poll-of-polls methodology is flawed. It was within a point of the actual result in 2004 and 2008 (nat'l popular vote).

    However, state polls were more variable in 2008. The 2008 final result's variance from the last RCP poll average in battleground states: IN D+2.5, MO D+0.6, OH D+2.1, FL D+1.0, VA D+2.1, for an average variance of about 1.7 points. That's more significant, but not by much.

    Also, a lot of the late swing in 2008 came from undecideds breaking heavily for Obama; but on this day in '08, there were 7.3% undecided, compared to 4.9% now.

    I guess what I'm saying is I haven't seen any compelling historical evidence to suggest that poll results will not reflect eventual turnout.

    1. The flaw in the RCP average is that you could submit the same data, reworked under different assumptions about turnout, to RCP, and their average would move all over the map.

      Take any two of Dave's 'corrections' and average them:

      2010 + 2004 = big Romney win
      2008 + D+5 = Obama win.

      But have we added any information in the averaging? No, and worse than none.
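The commenter's point can be checked directly. Because this kind of rebalancing is linear in the turnout weights, averaging two rebalanced toplines is exactly the same as rebalancing once to the midpoint turnout model, so the average carries no information beyond a single implied blended model. A toy check in Python, with invented numbers:

```python
# Toy demonstration (invented numbers): averaging two turnout-model
# "corrections" equals applying one averaged model, so the averaging
# step adds no new information.

def topline(internals, model):
    """Candidate A's share under a given partisan turnout model."""
    return sum(model[p] * internals[p] for p in model)

# Candidate A's support within each group (illustrative).
internals = {"R": 0.95, "D": 0.05, "I": 0.55}

m2010 = {"R": 0.36, "D": 0.34, "I": 0.30}  # GOP-friendly model
m2008 = {"R": 0.32, "D": 0.39, "I": 0.29}  # Dem-friendly model

avg_of_results = (topline(internals, m2010) + topline(internals, m2008)) / 2
midpoint_model = {p: (m2010[p] + m2008[p]) / 2 for p in m2010}

# The two computations agree to floating-point precision.
print(abs(avg_of_results - topline(internals, midpoint_model)) < 1e-12)  # True
```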

    2. I don't think you got my point. I understand that you can project a Romney victory instead of an Obama victory if you assume a different turnout model, and I tend to agree on a gut level that Romney's turnout will be significantly better.
      What actual evidence do we have that turnout will be different, however? Why, exactly, do all the D+X samples keep popping up? Do R's just not talk to pollsters as often as D's?

    3. I can think of four reasons.

      1) The response rate is now so low that it is inducing errors in the sampling. Democrats are much more motivated to tell pollsters how much they want to vote for Obama. Republicans want to be left alone.

      2) Pollsters have a built in bias toward a specific turnout that they expect, and are adjusting their sample accordingly. They keep sampling until they have a sample split that matches what they expect the electorate to look like.

      3) Pollsters are actively working with the Obama campaign to boost his numbers to make him look inevitable by skewing their samples.

      4) Respondents are lying to the pollsters.

      Pick your desired level of cynicism and paranoia.

    4. Thanks for this.

      Number 1 feels most likely, though I suspect it's not party ID but another correlated factor (age?). Numbers 2 and 3 require conspiratorial intent, which is always possible but IMO doubtful. Number 4 is also likely, though of course impossible to quantify.

      Either way, I find it hard to believe any of these, because you would have seen a consistent pattern of polls/poll averages being off in favor of the Democratic candidate in '08/'10, and that doesn't appear to be the case.

  6. I think I'll be able to take a close look at this later, but I'm already impressed that Ace has been so cautious about this, and that you have chosen to defend your claims with arguments, rather than (as others might) getting defensive, or worse. It's all a very good sign. That, and some things about these polls just "smell bad".

    1. Well, I respect Ace's instincts. He's not a math guy, I am. But his "feelings" about things are usually pretty astute. So when he was critical, I decided to look deeper into my results to see if I could defend my methodology. Again, proof will come after election day. But that 9 day trend line is a strong indicator of statistical stability.

      I started doing this to dispense with BS, not to add to it :)

    2. OK, I finally had time to go through this, including the comments. You've outlined some of the basic assumptions, and they seem sound, as far as they go. You also show a chart where your output makes more sense than what we see elsewhere (and the answer it comes up with is especially satisfactory!)

      But the validity of this depends on the details, and getting the details wrong on something like this - where the spread is such a small fraction - could lead to erroneous results.

      So the validity depends on what data is available for you to use (what are the inputs?), and what operations you perform on the data to combine them into a new result. Without a decent description of what you're actually doing, at a greater level of detail, I can't tell whether your results give the best possible answer or not. There are right ways and wrong ways to do statistics, and I've seen people do it wrong lots of times. It's not for the faint of heart! Realistically, it should only take a few equations that aren't all THAT complex.

      -Optimizer (A number-crunching fiend)

  7. Your method has some validity if the only, or major skewing they are doing is a final adjustment of the partisan split.

    If they are doing something trickier, then your method might not help at all.

    1. That actually isn't quite true.

      The explanation needs some set-up, so hang with me!

      Imagine you have a square subdivided into red and blue pixels in unknown relative amounts, and we want to know the average color over the square. We can't poll every pixel, so pollsters must make educated guesses about which regions of the square are most representative, and what sampling patterns to use to decide which pixels to ask: are you red or blue?

      This is the sampling stage. We really need to look at polling analysis as a pipeline of filtering and weighting. At every step, the goal is to narrow the possibility space that a potential outcome could exist in.

      But the sampling stage basically just acts as an input, taking in raw data sampled from an unknown underlying population: in our example, asking a few pixels spread around the square what color they are.

      Pollsters then take these color values, correct them a bit, average them to find the square's inferred color, and go to the bank.

      Dave is addressing a secondary filtering stage. Basically, he's looking at the correlations between different elections: the patterns of red and blue pixels in squares from different years. This is what I mean when I say he's transformed the data into partisan-affiliation space: you take the 2D square of pixels from this year, stack and align the last four or five years' squares on top of it to form a volume (a cube), and then start looking for correlations across them that can serve as an additional source of information to further constrain the space this year's average can be in. We can do this because partisan/party ID is very stable, for a number of reasons; basically, people's ideology and psychology are set early and evolve slowly.

      So, in summary, doing something tricky isn't really a problem insofar as it is filtered out by the simple filtering in the first stage; Dave's work is indifferent to it, since he basically assumes that each polling firm is relatively 'fair' and that its data has usable information in it.

      NB: That last point is actually interesting; as long as they are actually reliably polling people, we don't want all the pollsters to be the same. We want them each to poll a little differently. This is actually optimal. Nate Silver messes up what is effectively a stochastic sampling algorithm by correcting for a 'House Effect,' but I've already gone on too long.

    2. This is a good explanation, I wish I had thought of it :)

    3. If all they are doing is a scaling manipulation of the raw data, then you can undo it with no loss.

      But if they actually manipulate the raw data instead, like oversampling certain kinds of Republicans or independents, you might be able to see in other crosstabs that the data is odd, but you wouldn't be able to fix it this way.

      I think this is what Ace was implying about 'lost' information.

      For example, in the recall election here in CA years ago, the LA Times had a +20 poll for the Democrat, Cruz Bustamante. Arnold wiped them out, and I doubt you could find the 'real' result by shifting D/R/I. Everyone wanted Davis gone, and Arnold crushed Bustamante.
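For what it's worth, the first case above, where the distortion is purely a reweighting of the partisan split, really is lossless to undo, as long as the support-by-party crosstabs survive. A hypothetical Python sketch (all numbers invented): skew a "true" electorate to a D+9 sample, then recover the original topline exactly by reweighting back.

```python
# Hypothetical sketch: a topline distorted only by reweighting the
# partisan split can be undone exactly, provided the poll still
# reports support broken out by party.  All numbers are invented.

def topline(by_party, weights):
    """Candidate share under a given set of partisan weights."""
    return sum(weights[p] * by_party[p] for p in weights)

by_party     = {"R": 0.93, "D": 0.07, "I": 0.52}  # candidate support by group
true_split   = {"R": 0.35, "D": 0.35, "I": 0.30}  # "real" electorate
skewed_split = {"R": 0.29, "D": 0.38, "I": 0.33}  # a D+9 sample

published = topline(by_party, skewed_split)  # what the poll reports
recovered = topline(by_party, true_split)    # reweighted back, no loss

print(round(published, 3), round(recovered, 3))
```

If the raw responses themselves were manipulated, the by-party numbers would already be corrupted, and this undo would recover nothing: that is the commenter's point about the second case.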