Which voting system is best?

Tony Crilly

With the day of the referendum on the UK voting system drawing nearer, Tony Crilly uses a toy example to compare the first past the post, AV and Condorcet voting systems, and revisits a famous mathematical theorem which shows that there is nothing obvious about voting.

Choosing the winner

How to choose a winner?

The good folk of Chuddlehampton have to elect their village councillor, the person to represent them at the Regional Council. "Go and elect someone" was the only instruction they were given. So how were they to go about it?

The village elders supervising the vote soon found there were many ways this could be done. It was bewildering, but they had to find a suitable way to do this. There were three candidates, A, B, C, shorthand for Squire Allsworthy, Madame Bosanquet, and Farmer Charles. The elders wanted to avoid such crude methods as drawing straws. That might have sufficed in years gone by, but modern Chuddlehamptoners wanted to show the world that they were a sophisticated lot.

It seemed there were three main methods available:

  • The first-past-the-post (FPTP) method
  • The alternative vote (AV) method
  • The Condorcet method.

Having studied the matter, the elders could not decide which one to choose. In the end they decided to hold the village vote and decide afterwards. If things went well perhaps all methods would deliver the same result, and a choice would prove unnecessary.

So the electorate of twenty villagers assembled at the voting station and filled out their voting slips. The village clerk explained that each should list the candidates in order of preference, in what he called a voter profile. So a voter profile CAB meant that voter had put down C as first preference, A as second preference and B as third preference.

Corresponding to the twenty voters in the village there were twenty profiles. When the clerk gathered them in, ready for the count, he noticed a pattern. This did not surprise him unduly as Chuddlehamptoners tended to be a clannish bunch.

There were 4 voter profiles of the type ABC, 6 profiles BAC, 6 profiles CAB, 2 profiles CBA and two idiosyncratic voters who returned the profiles ACB and BCA, one each. This made twenty and to the credit of Chuddlehampton there were no spoilt votes. In a table the results were:

No. of voters   Profile
4               ABC
6               BAC
6               CAB
2               CBA
1               ACB
1               BCA

We'll see how the candidates fared under each of the systems:

The FPTP method

Looking only at first preferences, as you are only allowed to do in the FPTP method, the clerk counted 8 votes for C, 7 votes for B, and 5 votes for A.

Farmer Charles gained the most votes so should be elected.
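The clerk's count can be sketched in a few lines of Python. This is only an illustration under the assumption that each profile is stored as a string such as "CAB" together with the number of voters who returned it; the names `profiles` and `fptp_tally` are made up for the example.

```python
from collections import Counter

# Voter profiles from the Chuddlehampton count: ranking -> number of voters.
profiles = {"ABC": 4, "BAC": 6, "CAB": 6, "CBA": 2, "ACB": 1, "BCA": 1}

def fptp_tally(profiles):
    """Count first preferences only, ignoring the rest of each ranking."""
    tally = Counter()
    for ranking, voters in profiles.items():
        tally[ranking[0]] += voters
    return tally

tally = fptp_tally(profiles)
# The candidate with the most first preferences wins (ties are not handled here).
winner = max(tally, key=tally.get)
print(dict(tally), winner)
```

Everything after the first letter of each profile is thrown away, which is exactly the information the other two methods go on to use.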

The AV method

Although Farmer Charles obtained 8 first preference votes (40% of the vote) he failed to get 50% which is required for a win under the AV system. As Squire Allsworthy got the fewest first preference votes (25%) he is dropped off the list and the vote goes into a next round of counting, with a run-off between Madame Bosanquet and Farmer Charles. In this round the second preferences for the voters who gave Squire Allsworthy their first preference are counted. So, as he is no longer on the list and cannot be elected himself, his voters' second preferences will count in the run-off.

The voters giving Squire Allsworthy the first preference gave the 4 profiles ABC and the single profile ACB. So there are 4 second preferences to be awarded to B and one to C. Added to the first preferences, candidate B has 7 + 4 = 11 votes, and C has 8 + 1 = 9 votes.

So Madame Bosanquet would be elected.
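The run-off can be sketched as a loop that drops the weakest candidate until someone holds more than half of the votes. This is a simplified illustration, assuming every voter ranks every candidate and ignoring how ties for last place would be broken; the function name `av_winner` is hypothetical.

```python
from collections import Counter

# Voter profiles from the Chuddlehampton count: ranking -> number of voters.
profiles = {"ABC": 4, "BAC": 6, "CAB": 6, "CBA": 2, "ACB": 1, "BCA": 1}

def av_winner(profiles):
    """Drop the candidate with the fewest votes until one has more than half."""
    remaining = sorted(set("".join(profiles)))
    total = sum(profiles.values())
    while True:
        tally = Counter({c: 0 for c in remaining})
        for ranking, voters in profiles.items():
            # Each ballot counts for its highest-ranked candidate still in the race.
            top = next(c for c in ranking if c in remaining)
            tally[top] += voters
        leader, votes = tally.most_common(1)[0]
        if 2 * votes > total:
            return leader, dict(tally)
        # Ties for last place are not treated specially in this sketch.
        remaining.remove(min(tally, key=tally.get))

winner, final_round = av_winner(profiles)
print(winner, final_round)
```

In the first pass nobody reaches 11 of the 20 votes, so A is removed; in the second pass A's five ballots are transferred and the totals become 11 for B and 9 for C.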

The Condorcet method

This method is perhaps lesser known, but it is one which is quite transparent and fair.

Marquis de Condorcet (1743-1794)

The Marquis de Condorcet was a key player in the examination of voting systems when the French Revolution was in full flood. He was an eminent mathematician of the time who corresponded with the likes of Leonhard Euler on mathematics. In his later years, before he met his death mysteriously in a prison near Paris, he applied mathematics to social situations, like juries and voting systems.

Condorcet's method takes in the fine detail of the voter profiles by examining preferences between pairs of candidates. So, for example, the voter profile CAB expresses a preference for C over A, A over B, and C over B. These are coded CA, AB, and CB. Other voters may prefer pairs in the reverse orders AC, BA, BC. There are altogether six pairs to consider.

By analysing all the profiles, pitting each candidate against the other two, the clerk found that the electorate of Chuddlehampton preferred A to B (11 votes have preference AB and only 9 have preference BA), A to C (again by 11 votes to 9), and B to C (again 11 to 9).

As the electorate prefers A to both B and C, Squire Allsworthy is deemed the winner.
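The pairwise contests can be tallied mechanically. The sketch below assumes the same string encoding of profiles as the clerk's voting slips; the helper names `pairwise_support` and `condorcet_winner` are invented for this illustration and do not come from any standard library.

```python
from itertools import permutations

# Voter profiles from the Chuddlehampton count: ranking -> number of voters.
profiles = {"ABC": 4, "BAC": 6, "CAB": 6, "CBA": 2, "ACB": 1, "BCA": 1}

def pairwise_support(profiles):
    """For each ordered pair (x, y), count the voters who rank x above y."""
    candidates = sorted(set("".join(profiles)))
    support = {pair: 0 for pair in permutations(candidates, 2)}
    for ranking, voters in profiles.items():
        for x, y in support:
            if ranking.index(x) < ranking.index(y):
                support[(x, y)] += voters
    return support

def condorcet_winner(profiles):
    """Return the candidate who beats every rival head to head, or None."""
    support = pairwise_support(profiles)
    candidates = sorted(set("".join(profiles)))
    for x in candidates:
        if all(support[(x, y)] > support[(y, x)] for y in candidates if y != x):
            return x
    return None

support = pairwise_support(profiles)
winner = condorcet_winner(profiles)
print(support[("A", "B")], support[("A", "C")], support[("B", "C")], winner)
```

The function returns `None` when no candidate beats all the others, which is the inconclusive case the method is sometimes criticised for.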

So the three schemes, FPTP (Farmer Charles), AV (Madame Bosanquet), and Condorcet's method (Squire Allsworthy), give three different winners.

There are drawbacks to each of these systems. Under FPTP, Farmer Charles was elected with 8 votes but 12 voters did not support him with their first preferences. Under AV, Madame Bosanquet won the election but relied on the second preferences of those who voted for candidate A as their first preference. This result upturned the first preference vote which by a majority of first preference votes placed C above B. A potential problem with the Condorcet method is that the vote may be inconclusive. It is quite possible to arrive at the result of an election such as AB, BC, CA where no one candidate is preferred over the others (this is known as the Condorcet paradox).
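A small example shows how the Condorcet paradox can arise. The three-voter electorate below is hypothetical and is not part of the Chuddlehampton count; it is simply the smallest set of profiles that produces the circular result AB, BC, CA.

```python
from itertools import combinations

# Hypothetical three-voter electorate, chosen to produce a cycle.
cyclic_profiles = {"ABC": 1, "BCA": 1, "CAB": 1}

def majority_preferences(profiles):
    """List the pairwise results, coded as in the article (e.g. 'AB' = A beats B)."""
    candidates = sorted(set("".join(profiles)))
    total = sum(profiles.values())
    results = []
    for x, y in combinations(candidates, 2):
        x_over_y = sum(v for r, v in profiles.items() if r.index(x) < r.index(y))
        y_over_x = total - x_over_y
        if x_over_y > y_over_x:
            results.append(x + y)
        elif y_over_x > x_over_y:
            results.append(y + x)
    return results

results = majority_preferences(cyclic_profiles)
# A candidate wins outright only by winning both of their pairwise contests.
outright_winners = [c for c in "ABC" if sum(r[0] == c for r in results) == 2]
print(sorted(results), outright_winners)
```

Each candidate wins one contest and loses one, so the list of outright winners is empty.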

So there are weaknesses with each of the three voting systems, underlining the fact that there is nothing obvious about voting systems.

Kenneth Arrow

Kenneth Arrow (b.1921). Image: Linda A. Cicero / Stanford News Service.

When voting systems come under discussion, mathematicians think of Kenneth Arrow's landmark theorem proved in the 1950s. In recognition of his work Arrow was awarded a Nobel Prize for Economics in 1972.

A voting system is a way of translating the individual voters' preferences into preferences for the whole constituency. Arrow set up a list of requirements thought desirable for a voting system in general, and the type of voting which should take place. Like the folk of Chuddlehampton each voter in an Arrow system records a preference by ranking all the candidates in the election. The result of the election is a ranking of the candidates based on the preferences of all the voters.

Next Arrow assumed some mathematical properties of his system. He stipulated that a difficulty found in Condorcet's method should not be allowed to happen for his abstract system: in his ideal system he could not allow AB, BC, CA as the result of an election, for it was no result at all. It says that A is preferred to B, B to C, and C to A, going around in a circle. This contradicts the transitive law that AB and BC together always imply AC.

In much of mainstream mathematics the transitive law is obeyed. For example, in the set of the integers with the greater than relation we have that a > b and b > c always implies that a > c. It needs to be stated afresh for voting systems, for transitivity is not obviously the case for some human relations - in the relation "is a friend of" does it automatically follow from a is a friend of b, and b is a friend of c, that a is a friend of c?
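Transitivity is easy to test mechanically once a relation is written down as a set of ordered pairs. The sketch below is one possible check, with the hypothetical helper `is_transitive`; it compares the Chuddlehampton result, the circular result AB, BC, CA, and the greater-than relation on a few integers.

```python
def is_transitive(relation):
    """relation is a set of ordered pairs (x, y), read as 'x is related to y'."""
    return all(
        (x, z) in relation
        for (x, y1) in relation
        for (y2, z) in relation
        if y1 == y2
    )

# Pairwise result at Chuddlehampton: A over B, B over C, A over C.
chuddlehampton = {("A", "B"), ("B", "C"), ("A", "C")}
# The circular result that Arrow rules out.
cycle = {("A", "B"), ("B", "C"), ("C", "A")}
# The greater-than relation on the integers 0 to 4.
greater_than = {(a, b) for a in range(5) for b in range(5) if a > b}

print(is_transitive(chuddlehampton), is_transitive(cycle), is_transitive(greater_than))
```

The cycle fails because AB and BC are present but AC is not.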

Arrow also assumes that removing one candidate should not alter the ranking among the remaining candidates. The AV count at Chuddlehampton broke this assumption: C led B on first preferences, but once A was removed B overtook C. But he would not like FPTP either, because in this system voters only mark their first preference. There are several other modest properties that Arrow assumes of his general voting system. Of course it must have the transitive property (thus making the system "rational"). Summing up,

  • There should be no single voter who could force the result of the election (dictator condition).
  • If every voter puts Candidate X above Y in their voter profiles, the result of the election should do the same (unanimity condition).
  • The result of an election placing X above Y is unaffected by where voters rank some other candidate Z (independence from irrelevant alternatives).
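The independence condition can be illustrated with the Chuddlehampton profiles. The sketch below takes the informal reading used in this article, namely that removing a candidate should not change how the others compare; Arrow's formal statement is phrased in terms of voters' rankings rather than removals, so this is an illustration rather than a proof. The function name `first_preference_tally` is hypothetical.

```python
from collections import Counter

# Voter profiles from the Chuddlehampton count: ranking -> number of voters.
profiles = {"ABC": 4, "BAC": 6, "CAB": 6, "CBA": 2, "ACB": 1, "BCA": 1}

def first_preference_tally(profiles, candidates):
    """Count each ballot for its highest-ranked candidate among those standing."""
    tally = Counter({c: 0 for c in candidates})
    for ranking, voters in profiles.items():
        top = next(c for c in ranking if c in candidates)
        tally[top] += voters
    return dict(tally)

with_a = first_preference_tally(profiles, "ABC")
without_a = first_preference_tally(profiles, "BC")
print(with_a, without_a)
```

With A standing, C is ahead of B by 8 votes to 7; with A removed, B is ahead of C by 11 votes to 9, although no voter has changed their mind about B and C.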

So Arrow set up an abstract system with a set of axioms, much like we might set up the definition of an object in abstract algebra.

Arrow's theorem states that no voting system can conform to all the axioms. Arrow proved by deduction that any system satisfying the others must contradict at least one of them. Put another way, whatever voting system is chosen, one of the axioms must be violated. It is an impossibility theorem.

Arrow's theorem is one of those negative results beloved by mathematicians (like there being no solution in positive integers to Fermat's equation for cube powers or higher, or the impossibility of constructing a square with the same area as a given circle using only a ruler and compass). It sets limits on what is possible for voting systems.

Strictly speaking, Arrow's theorem only applies to voting systems in which each voter ranks all the candidates. In the FPTP system voters place an X against their first preference and against no other, so Arrow's theorem says nothing about it (like Pythagoras' theorem which makes a statement about right angled triangles but not about other triangles). In the AV system as it is intended to be used in UK elections, voters place candidates in a preferred order but they are under no obligation to rank them all. Arrow's theorem does not apply to this version of AV either.

But this doesn't make the theorem irrelevant. Arrow's approach highlights the potential tension between the fundamental assumptions an ideal voting system should satisfy. It allows us to explore how they might be modified to give acceptable voting systems and to explore the theoretical nature of these systems. Condorcet and Arrow show us that voting systems are more complex than we might think, and they have shown us how to analyse voting systems in a mathematical way.

You can read more about the mathematics of voting, including more about Arrow's theorem, in these Plus articles.

About the author

Tony Crilly is Emeritus Reader in Mathematical Sciences at Middlesex University, having previously taught at the University of Michigan, the City University in Hong Kong and the Open University. His principal research interest is the history of mathematics, and he has written and edited many works on fractals, chaos and computing. He is the author of an acclaimed biography of the English mathematician Arthur Cayley and the internationally bestselling 50 Mathematical Ideas You Really Need to Know. His latest book, The big questions: Mathematics, is reviewed in Plus.


It is shocking that Britain is going to vote for FPTP which has led to so many evil results down the years, including 9% winners in Papuan elections. However compulsory voting is a more important reform. Without compulsory voting there is enormous incentive for elections to be run in such a way that particular sections of the community are less likely to vote. With compulsory voting we find, certainly in Australia, that the authorities are obliged to make it convenient for everyone to vote.

Robert Smart, Sydney

In Argentina voting is also compulsory, and in my opinion it has disastrous consequences, especially when voting for individual candidates rather than parties. Compulsory voting means that people who are ignorant or apathetic will vote, leading them to just vote for the prettiest face. Might as well draw lots...

Thanks for writing up this great article, Tony. While I knew voting systems had flaws, I did not know the details and I was unaware of Arrow's work. Your effort helped me easily understand the details of the different systems, and the background on Arrow's work was intriguing.

This result upturned the first preference vote which by a majority of first preference votes placed C above B.

The first preference votes do place C above B, but not by a majority; only 8 out of 20 people gave their first preference to C, but a majority means more than half - at least 11 out of 20, in this case.

If C did have a majority of first-preference votes then all reasonable election systems would declare him the winner, including the three discussed here.

It's also worth noting that in real life, people know which system is being used and can vote accordingly; there may be incentives to fill out your ballot paper in a way which doesn't represent your true preferences.

If there were to be a system where there were no constituencies but all the votes were to be added up nationally, then everybody's vote would have some effect on the result of the election, but nobody's vote would change the result of the election.
In the UK when people vote they just vote for their preferred party. But when the results are added up nationally, the results don't match the vote. For example, in the 2010 election David Cameron got 36.1% of the vote of the nation, but 47% of the seats in the House of Commons. This is just not fair.

That's one constituency, not zero constituencies.

Selection of a representative is a multi-criteria decision analysis problem. The elector chooses on some weighted assessment of party manifesto and candidate's performance &c., in order to derive a voter choice for the representative most likely to deliver the "promise." It is, therefore, a popularity selection process and should be rightly regarded as such. The Borda Count delivers the candidate who is most popular with the electorate, i.e. the one who meets most of the requirements of most of the electors. It also provides a voting solution with the fewest paradoxes.

Condorcet's method is the correct method. The non-transitive "weakness" is not really there:
1. Even if it was possible for millions of votes and candidates to perfectly 'match up', it's good to know that, instead of incorrectly using another system that, with the same votes, places one 90% over the other. Elections are not done in small numbers where you're just comparing 3 almost equal people with a small integer number of votes. If you asked "Do you prefer Hillary over candidate A?", who knows if 85% of the population would select candidate A?
2. You're going to get topological inconsistencies between every voter. It doesn't matter if voter 1 likes A -> B -> C then voter 2 likes C -> B -> A. You just add them up and get a total satisfaction score as stated. This is trivial.

The voting system in the States is not mathematically accurate because it isn't a pairwise vote. If I asked for a vote between only 3 options, chocolate, cheesecake, and strawberry:
Let's say
15% choc -> cheese -> straw
40% cheese -> choc -> straw
45% straw -> choc -> cheese
So it seems like only 15% wanted chocolate and the largest group voted for strawberry. However, compare them pairwise using all information:
55% prefer cheesecake over strawberry, so strawberry already shouldn't win
But wait,
60% prefer choc over cheese and
55% prefer choc over straw. So actually, chocolate should win.

It makes perfect sense that Trump and Hillary would be the remaining candidates because they represent the largest groups of people that give them their first vote, but that doesn't take into account the remaining population's preference. Someone could be preferred first pairwise and not even get a single first vote. This is basic graph theory and the everyday person doesn't catch on to it.

The only positive thing that could be imagined from going with everyone's first votes only would be a system where you need that. For instance, "guys, one of us is going to take a team to accomplish a mission, but the team needs to fully believe in the leader" Then, even though the winner does not take the pairwise win, he has a large group of people who would choose him first, but even that situation is completely imaginary. In real life, I can't think of an actual situation where you'd do that in place of a pairwise (AKA Condorcet's) vote.