Now that the election is over, and I have had an opportunity to decompress a little bit from my disgust and dismay at seeing yet another person named Trudeau elected to the PMO to ruin our country, it’s time to put on my analytical hat and see how well my Patented Projection Prognosticator performed this time around.

First, the qualitative:

Projected outcome: Liberal Victory
Actual outcome: Liberal Victory

Okay, good so far. Canada is worse off for it, but my projection was accurate.

Projected Parliament: Liberal minority
Actual Parliament: Liberal majority

Okay, not so good there.  Majority vs Minority is a huge miss, but, to be fair, the pollsters were projecting the same kind of outcome as me, and since this is an algorithm, I guess it comes down to “garbage in = garbage out.”

Digging a bit deeper, recall that I classified each projected result as one of: Assured, Probable, Possible, or Long Shot. Comparing those numbers, we get this:

Comparison: Projected results vs Actual Results
(Expand the Items to view the tables)

[simnor_tabs]
[simnor_tab label="National"][table id=E42-Seats-Nat /]
[/simnor_tab][simnor_tab label="Atlantic"][accordion openfirst="true"]
[accordion-item title="Newfoundland and Labrador"][table id=E42-Seats-10 /]
[/accordion-item][accordion-item title="Prince Edward Island"][table id=E42-Seats-11 /]
[/accordion-item][accordion-item title="Nova Scotia"][table id=E42-Seats-12 /]
[/accordion-item][accordion-item title="New Brunswick"][table id=E42-Seats-13 /]
[/accordion-item][/accordion]
[/simnor_tab][simnor_tab label="Quebec"]
[table id=E42-Seats-24 /][/simnor_tab][simnor_tab label="Ontario"]
[table id=E42-Seats-35 /][/simnor_tab][simnor_tab label="Prairies"]
[accordion openfirst="true"]
[accordion-item title="Manitoba"]
[table id=E42-Seats-46 /][/accordion-item]
[accordion-item title="Saskatchewan"][table id=E42-Seats-47 /][/accordion-item]
[accordion-item title="Alberta"][table id=E42-Seats-48 /][/accordion-item]
[/accordion][/simnor_tab][simnor_tab label="British Columbia"]
[table id=E42-Seats-59 /][/simnor_tab][simnor_tab label="North"]
[accordion openfirst="true"]
[accordion-item title="Yukon"]
[table id=E42-Seats-60 /][/accordion-item]
[accordion-item title="Northwest Territories"]
[table id=E42-Seats-61 /][/accordion-item]
[accordion-item title="Nunavut"]
[table id=E42-Seats-62 /][/accordion-item]
[/accordion][/simnor_tab][/simnor_tabs]

Assured: The leading candidate in the projected result is further ahead than any other candidate beyond the poll’s margin of error.
Probable: The leading candidate in the projected result is further ahead of at least one other candidate by less than the poll’s margin of error.
Possible: The candidate for a given party has a range of possible support where at least some of the range shows a possible win.
Long Shot: Same as “Possible” however the candidate is ranked behind at least one other candidate with a possible win.
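For readers who like code, the first two categories boil down to comparing the leader’s gap over the runner-up against the margin of error. Here is a minimal Python sketch; the function name and data layout are my own, not the actual Prediction Machine code:

```python
def classify_leader(support, margin):
    """Classify a riding's projected leader as 'Assured' or 'Probable'.

    support: dict mapping party -> projected percent support in the riding
    margin:  the poll's (or aggregate's) margin of error, in points
    (Illustrative sketch only, not the Prediction Machine's real code.)
    """
    ranked = sorted(support.values(), reverse=True)
    lead = ranked[0] - ranked[1]  # gap between leader and runner-up
    return "Assured" if lead > margin else "Probable"

# A 10-point lead against a 3.1-point margin of error is Assured
print(classify_leader({"LPC": 42, "CPC": 32, "NDP": 20}, 3.1))  # Assured
```

The Possible and Long Shot categories would extend this by checking whether the overlapping ranges (support plus or minus the margin) still admit a win.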
Relative Error measures how far from the actual result the prediction was. A negative number means the projection was low, and a positive number means it was high. It is calculated using this formula, which eliminates division-by-zero errors: $latex Error_{relative}=\frac{projected - actual}{338 + actual}\times 100\%$
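That formula translates directly into Python (the example numbers are made up for illustration; 338 is the total number of seats):

```python
def relative_error(projected, actual):
    """Relative error per the formula above: negative means the projection
    was low, positive means it was high. Adding 338 (the total seat count)
    to the denominator avoids dividing by zero when a party won no seats."""
    return (projected - actual) / (338 + actual) * 100

# A projection of 130 seats against an actual 184 comes out low (negative)
print(round(relative_error(130, 184), 2))  # -10.34
```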

As you can see, the results were all within the range that I projected for each party.

This means that, while the Uniform Distribution Method of predicting seats (look it up) is a good estimation tool, it can still only be as good as the polling data available.
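For the curious, the core idea behind a uniform-swing-style seat projection can be sketched in a few lines of Python. This is my reading of the general technique, with assumed names and data shapes, not the Prediction Machine’s actual implementation:

```python
def project_riding(prev_riding_pct, prev_region_pct, current_region_pct):
    """Uniform-swing sketch: each party's riding-level support moves by
    the same amount its regional support moved since the last election.
    (Assumed structure for illustration, not the author's actual code.)"""
    return {party: prev_riding_pct[party]
                   + (current_region_pct[party] - prev_region_pct[party])
            for party in prev_riding_pct}

# If a party is up 8 points regionally, it gains 8 points in every riding
print(project_riding({"LPC": 30.0, "CPC": 45.0},   # riding result last time
                     {"LPC": 25.0, "CPC": 40.0},   # region result last time
                     {"LPC": 33.0, "CPC": 38.0}))  # region polling now
```

Garbage in, garbage out applies exactly here: the riding projections inherit whatever error the regional polling carries.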

To get a better look at the accuracy of the predictor, let’s compare the popular vote in each region against the projected vote. To do this, we will have to compare percentages rather than raw vote counts, because the number of votes cast in 2011 and 2015 differs.

Remember as well that the projection is based on aggregating four polls, all released on October 17: Ekos, Forum, Mainstreet, and Ipsos. The projected result for each riding below is therefore based on that aggregation, with the aggregated margin of error calculated this way:

$latex \sigma(region) _{total}= \sqrt{\sigma(region)_{Ekos}^2 + \sigma(region)_{Forum}^2 + \sigma(region)_{Mainstreet}^2 + \sigma(region)_{Ipsos}^2} &s=2$
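In Python, that root-sum-of-squares aggregation looks like this (the margins in the example are made up):

```python
import math

def aggregate_margin(margins):
    """Combine per-poll regional margins of error by root-sum-of-squares,
    following the formula above."""
    return math.sqrt(sum(m ** 2 for m in margins))

# Four polls with regional margins of 3.1, 2.9, 3.5 and 3.2 points
print(round(aggregate_margin([3.1, 2.9, 3.5, 3.2]), 2))  # 6.36
```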

The aggregate support for each region is calculated like this:

Assume the polls are identified like this:

1. Ekos
2. Forum
3. Mainstreet
4. Ipsos

In reality it doesn’t really matter what order the polls are in, just that they all get included:

$latex Percent Support_{party,region,aggregate} = \frac{\sum\limits_{Poll = 1}^4(Percent Support_{party,region,Poll} \times Sample Size_{region,Poll})}{\sum\limits_{Poll = 1}^4 (Sample Size_{region,Poll})} &s=2$

This gives us the aggregated percent support for each party in each region, from which we can then figure out the results for each individual constituency.
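That sample-size-weighted average can be sketched like this (the poll numbers here are hypothetical, not the actual October 17 figures):

```python
def aggregate_support(polls):
    """Sample-size-weighted average of one party's regional support,
    per the formula above. polls: list of (percent_support, sample_size)
    tuples, one per polling firm."""
    total_n = sum(n for _, n in polls)
    return sum(pct * n for pct, n in polls) / total_n

# Hypothetical regional numbers from four polls
print(round(aggregate_support([(44, 600), (46, 900), (43, 500), (45, 1000)]), 2))  # 44.77
```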

For the purposes of the Prediction Machine, when aggregating the results, I had to make sure that the results for each polling firm were consolidated into like regions, which was a quick and simple calculation.

So we end up with aggregated data for:

Atlantic
Quebec
Ontario
Manitoba/Saskatchewan
Alberta
British Columbia/Northwest Territories/Nunavut/Yukon Territory.

The territories really have no polling data to speak of, since their population is so sparse, so I just grouped them into the BC results. There wasn’t much impact either way.

So, with that in mind, here are the regional results compared to the aggregated projection I posted on October 17, for each party, along with the error in the seat projections.

[simnor_tabs]
[simnor_tab label="National"]
[table id=E42-Vote-National /][/simnor_tab][simnor_tab label="Atlantic"]
[table id=E42-Vote-Atlantic /][/simnor_tab][simnor_tab label="Quebec"]
[table id=E42-Vote-Quebec /][/simnor_tab][simnor_tab label="Ontario"]
[table id=E42-Vote-Mnsk /][/simnor_tab][simnor_tab label="Alberta"]
[table id=E42-Vote-Alberta /][/simnor_tab][simnor_tab label="British Columbia/Territories"]
[table id=E42-Vote-bc /][/simnor_tab]
[/simnor_tabs]

Each polling firm has weighted the data to ensure that the demographics are represented appropriately. I have used the weighted data for this calculation, and I have also included “leaners” in the data when the polling firm has done so.

What we see in this table is that the relative error in the projected support is less than 4.05% in all cases. That’s pretty good, and it means the projected support is fairly accurate compared to the aggregated polls. So, what we need to do next is look at the actual riding-by-riding results and determine where the prediction machine went wrong.

Overall, the Prediction Machine was accurate in 265 of 338 constituencies.  That’s 78.4%.  It’s pretty good; but let’s look a bit closer, and examine where things went wrong.

Because we’re dealing with 338 constituencies, or Electoral Districts, as Elections Canada likes to call them, it’s probably easier to break things down into groups. Above, I talked about Assured, Probable, Possible, and Long Shot. Starting with Assured, which are seats the Prediction Machine suggested were virtually guaranteed to be won by the projected party, we see this:

Out of 109 possible assured predictions, the Prediction Machine was accurate 106 times.  That’s 97.25% accuracy, which is an extremely good result.  But it’s also not that exciting – because an assured result means the predicted winner is further ahead than anyone else by more than the poll’s (or aggregated polls’) margin of error for the region.

So, where did the Prediction Machine get it wrong?

[simnor_tabs][simnor_tab label="Markham-Unionville"][table id=E42-Assured-35056 /][/simnor_tab][simnor_tab label="Nickel Belt"][table id=E42-Assured-35069 /][/simnor_tab][simnor_tab label="Toronto-Danforth"][table id=E42-Assured-35109 /][/simnor_tab]

[/simnor_tabs]

So, in the three ridings where the Prediction Machine missed an assured result, the miss came down to one of two possible reasons: the popularity of a specific candidate, or shifts in support larger than the polling firms picked up. That isn’t necessarily the fault of the polling firms, because things were moving very quickly; but what it says to me is that momentum is very important in elections. In this particular election, the Liberal party, unfortunately, caught some momentum, and it snowballed, picking up NDP supporters along the way.

Now, we can look at the probable results. Probable, remember, is defined as the case where the second-place candidate’s support is projected within the projected winner’s margin of error. Out of 229 probable predictions, the Prediction Machine got it wrong 70 times, for an accuracy of 69.43%. That’s not too bad, but not great either; certainly, it’s lower than I’d like it to be.

[table id=ELX42-Probable-All /]

Looking at the results where the Prediction Machine missed on the Probable winner, we see that, just as with the Assured results, turnout was generally higher than in 2011, and Liberal support surged at the last minute at the cost of both the Conservatives and the NDP.

For completeness, I’ll include two more tables: one for ridings where the Possible candidate didn’t win, and another for ridings where not even the Long Shot contender won. That last group represents where the Prediction Machine was completely wrong.

[table id=ELX42-Possible-All /]

[table id=ELX42-LongShot-All /]

So, what does it all mean?

Well, the bottom line is this: first, the Prediction Machine, I think, is probably about as accurate as it can be. There may be a few tweaks to the prediction formula, and I’ll experiment with them in time. But overall, we are dealing with a program that tries to model an election from a less-than-perfectly-accurate survey, or aggregate of surveys, of the population.

It’s a prediction based on a prediction.

80% accuracy, I have to say, is pretty good, considering the variables at play here, and in the majority of election campaigns I think I would have called the result pretty accurately. Most predictions were for a Liberal minority as well, so, while I was wrong in calling the size of the Parliament, I think the actual results I put forth were quite accurate.

So, I think I can safely say that my Prediction Machine will call the election with about 80% accuracy, meaning that 270 out of the 338 seats will be predicted correctly. That means that, out of my projected results, there are 2,678,521,876,251,498,576,491,365,815,949,827,268,919,378,444,242,535,468,031,460,203,168,021,780 possible ways 270 seats could be accurately projected. That may seem like a ridiculously big number, but consider that we’re dealing with combinatorial mathematics here.
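That figure is just the binomial coefficient C(338, 270), i.e. the number of ways to choose which 270 of the 338 seats are the correctly predicted ones, and Python can check it directly:

```python
import math

# Number of ways to choose which 270 of 338 seats are correctly
# predicted: the binomial coefficient C(338, 270) = C(338, 68)
ways = math.comb(338, 270)
print(ways)  # a 73-digit number, roughly 2.68 x 10^72
```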

It doesn’t really matter, however, how many possible outcomes there are with 270 accurately-predicted seats; only the one the Prediction Machine kicks out matters, compared against the actual result from election day. In that context (and we’ll have to wait up to four years to see), I expect that I’ll be calling the election result within the margin of error, and with about 80% accuracy, again.

Steven Britton