Kemeny–Young method

The Kemeny–Young method assigns a "distance" score to every possible complete ranking of the candidates. For each pair of candidates, count the ballots that order the pair the opposite way from the ranking; the distance is the sum of these counts over all pairs. The ranking with the smallest distance wins.

The winning candidate is the top candidate in the winning ranking.
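The procedure above can be sketched by brute force. The following is a minimal illustration, not a reference implementation: it assumes every ballot ranks all candidates, the names `kemeny_distance` and `kemeny_young` and the three-ballot profile are hypothetical, and ties between rankings are broken arbitrarily by enumeration order.

```python
from itertools import combinations, permutations

def kemeny_distance(ranking, ballots):
    """Sum, over every pair in `ranking`, of the ballot weight that orders the pair the other way."""
    total = 0
    for hi, lo in combinations(ranking, 2):  # `ranking` places hi above lo
        for ballot, weight in ballots:
            if ballot.index(lo) < ballot.index(hi):  # this ballot disagrees on the pair
                total += weight
    return total

def kemeny_young(ballots):
    """Return a ranking with the smallest distance (first one found if several tie)."""
    candidates = ballots[0][0]  # assumption: every ballot lists every candidate
    return min(permutations(candidates), key=lambda r: kemeny_distance(r, ballots))

# Hypothetical profile: (ranking, number of voters casting it)
ballots = [(("A", "B", "C"), 3), (("B", "A", "C"), 2), (("C", "A", "B"), 1)]
winning = kemeny_young(ballots)
print(winning, kemeny_distance(winning, ballots))
print("winner:", winning[0])
```

Enumerating all rankings takes factorial time in the number of candidates, so this approach is only practical for small elections.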

Statistical interpretation

The Kemeny–Young method produces the maximum likelihood estimate under a voting model in which a best order of the candidates exists and each voter, for each pair of candidates, ranks that pair correctly with some probability $p>\frac{1}{2}$ and in reverse with probability $1-p$.^[1] This is called a Mallows model.^[2]
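A short sketch of why this holds, using notation introduced here for illustration only: let $m$ be the number of candidates, $b_1,\dots,b_n$ the ballots, $r$ a candidate for the best order, and $d(r,b_i)$ the number of pairs that $b_i$ orders the opposite way from $r$. Assuming the voters err independently, the likelihood of the ballots is

$$\Pr(b_1,\dots,b_n \mid r) \;\propto\; \prod_{i=1}^{n} p^{\binom{m}{2}-d(r,b_i)}\,(1-p)^{d(r,b_i)} \;=\; p^{\,n\binom{m}{2}} \left(\frac{1-p}{p}\right)^{\sum_{i=1}^{n} d(r,b_i)} .$$

Because $p>\frac{1}{2}$, the ratio $\frac{1-p}{p}$ is less than 1, so the likelihood is largest when $\sum_{i} d(r,b_i)$ is smallest. That sum is the distance score of $r$.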

Strategic vulnerability

Kemeny–Young is vulnerable to compromising, burying, and crowding. It fails clone independence because adding a clone can cause a non-clone to be elected, and this effect grows with the number of clones.

Example

Figure: Tennessee's four cities are spread throughout the state.

Imagine that Tennessee is having an election on the location of its capital. The population of Tennessee is concentrated around its four major cities, which are spread throughout the state. For this example, suppose that the entire electorate lives in these four cities, and that everyone wants to live as near the capital as possible.

The candidates for the capital are:

Memphis, the state's largest city, with 42% of the voters, but located far from the other cities
Nashville, with 26% of the voters, near the center of Tennessee
Knoxville, with 17% of the voters
Chattanooga, with 15% of the voters

The preferences of the voters would be divided like this:

42% of voters (close to Memphis): Memphis>Nashville>Chattanooga>Knoxville
26% of voters (close to Nashville): Nashville>Chattanooga>Knoxville>Memphis
15% of voters (close to Chattanooga): Chattanooga>Knoxville>Nashville>Memphis
17% of voters (close to Knoxville): Knoxville>Chattanooga>Nashville>Memphis

Consider the ranking Nashville>Chattanooga>Knoxville>Memphis. This ranking contains 6 orderings of pairs of candidates:

Nashville>Chattanooga, for which 32% of the voters disagree.
Nashville>Knoxville, for which 32% of the voters disagree.
Nashville>Memphis, for which 42% of the voters disagree.
Chattanooga>Knoxville, for which 17% of the voters disagree.
Chattanooga>Memphis, for which 42% of the voters disagree.
Knoxville>Memphis, for which 42% of the voters disagree.

The distance score for this ranking is 32+32+42+17+42+42=207.

This ranking has the lowest distance score of all 24 possible rankings. It is the Condorcet ranking: it agrees with the majority of voters on every pair, so reversing any pair would replace a disagreeing minority with a disagreeing majority. The winning ranking is therefore Nashville>Chattanooga>Knoxville>Memphis, and the winning candidate is Nashville.
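The Tennessee result can be checked by scoring all 24 rankings. This is a sketch under the assumption that the percentages are used directly as ballot weights; the helper name `distance` is illustrative.

```python
from itertools import combinations, permutations

# (ranking, percentage of voters), taken from the preference table above
ballots = [
    (("Memphis", "Nashville", "Chattanooga", "Knoxville"), 42),
    (("Nashville", "Chattanooga", "Knoxville", "Memphis"), 26),
    (("Chattanooga", "Knoxville", "Nashville", "Memphis"), 15),
    (("Knoxville", "Chattanooga", "Nashville", "Memphis"), 17),
]

def distance(ranking, ballots):
    """Total weight of ballots disagreeing with `ranking`, summed over all pairs."""
    return sum(
        weight
        for hi, lo in combinations(ranking, 2)   # `ranking` places hi above lo
        for ballot, weight in ballots
        if ballot.index(lo) < ballot.index(hi)   # ballot orders the pair the other way
    )

# Score every possible ranking of the four cities and pick the smallest.
scores = {r: distance(r, ballots) for r in permutations(ballots[0][0])}
best = min(scores, key=scores.get)
print(best, scores[best])
# → ('Nashville', 'Chattanooga', 'Knoxville', 'Memphis') 207
```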

Example with a Condorcet cycle

25 A>B>C
40 B>C>A
35 C>A>B

The pairwise results are A>B by 60 to 40, B>C by 65 to 35, and C>A by 75 to 25, so the majority preferences form a cycle. There are 6 possible rankings to consider:

A>B>C: A>B is opposed by 40, A>C by 75, and B>C by 35. The score is 150, the minimum so far.
A>C>B: A>C is opposed by 75, A>B by 40, and C>B by 65. The score is 180, which exceeds the current minimum (150), so this ranking is eliminated.
B>A>C: B>A is opposed by 60, B>C by 35, and A>C by 75. The score is 170, which exceeds 150, so this ranking is eliminated.
B>C>A: B>C is opposed by 35, B>A by 60, and C>A by 25. The score is 120. This is the new minimum, so A>B>C is now eliminated.
C>A>B: C>A is opposed by 25, C>B by 65, and A>B by 40. The score is 130, which exceeds 120, so this ranking is eliminated.
C>B>A: C>B is opposed by 65, C>A by 25, and B>A by 60. The score is 150, which exceeds 120, so this ranking is eliminated.

So the final ranking is B>C>A, with B winning.
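The six scores in this example can be reproduced with a few lines of code. This is an illustrative sketch; rankings are written as strings such as "BCA" for B>C>A, and the helper name `distance` is an assumption of the example.

```python
from itertools import combinations, permutations

# (ranking, number of voters) for the cycle example
ballots = [("ABC", 25), ("BCA", 40), ("CAB", 35)]

def distance(ranking, ballots):
    """Total number of voters disagreeing with `ranking`, summed over all pairs."""
    return sum(
        weight
        for hi, lo in combinations(ranking, 2)   # `ranking` places hi above lo
        for ballot, weight in ballots
        if ballot.index(lo) < ballot.index(hi)   # ballot orders the pair the other way
    )

scores = {"".join(r): distance(r, ballots) for r in permutations("ABC")}
best = min(scores, key=scores.get)
print(scores)
# → {'ABC': 150, 'ACB': 180, 'BAC': 170, 'BCA': 120, 'CAB': 130, 'CBA': 150}
print(best)
# → BCA
```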

Approximate methods

Some other voting methods are k-approximations of Kemeny–Young: the social order they return has a Kemeny distance at most k times that of the optimal ranking (the one that Kemeny–Young finds).^[3] They do not necessarily pass the same criteria as Kemeny–Young but can be interesting methods in their own right. Some results are:

The Borda count is a 5-approximation.
The footrule method is a 2-approximation.
The "best fit" method, which returns the ranked ballot with the least Kemeny distance, is a 2-approximation.
The randomized Kwiksort method^[4] and its deterministic version CC-Pivot^[5] are both 3-approximations.
Choosing the better of the orders returned by best fit and Kwiksort is a 6/5-approximation.^[4]
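As an illustration of one of these methods, here is a minimal sketch of a Kwiksort-style pivot procedure: pick a random pivot, place the candidates that beat the pivot pairwise above it and the rest below it, and recurse on both groups. The function names, the tie rule (candidates tied with the pivot go below it), and the seeded random generator are assumptions of this sketch rather than details taken from the cited papers.

```python
import random
from itertools import combinations

def pairwise_support(ballots):
    """support[(x, y)] = number of voters ranking x above y."""
    support = {}
    for ballot, weight in ballots:
        for x, y in combinations(ballot, 2):  # x precedes y on this ballot
            support[(x, y)] = support.get((x, y), 0) + weight
    return support

def kwiksort(candidates, support, rng):
    """Quicksort-like ordering driven by pairwise majorities against a random pivot."""
    if len(candidates) <= 1:
        return list(candidates)
    pivot = rng.choice(candidates)
    above, below = [], []
    for c in candidates:
        if c == pivot:
            continue
        # Assumption: a candidate tied with the pivot is placed below it.
        if support.get((c, pivot), 0) > support.get((pivot, c), 0):
            above.append(c)
        else:
            below.append(c)
    return kwiksort(above, support, rng) + [pivot] + kwiksort(below, support, rng)

def distance(ranking, support):
    """Kemeny distance of `ranking`, computed from the pairwise support counts."""
    return sum(support.get((lo, hi), 0) for hi, lo in combinations(ranking, 2))

ballots = [("ABC", 25), ("BCA", 40), ("CAB", 35)]  # the cycle example above
support = pairwise_support(ballots)
order = kwiksort(list("ABC"), support, random.Random(0))
print(order, distance(order, support))
```

On the cycle example the result depends on which pivot is drawn, so the returned order may be B>C>A (the optimum, 120), C>A>B (130), or A>B>C (150).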

Notes

Suppose that, when the distance score of a ranking such as A>B>C is calculated, a voter who ranked B but not A is treated as ordering A and B the opposite way from the ranking. Then the Kemeny–Young ranking is a Smith set ranking. By definition, any candidate in the n-th Smith set pairwise beats every candidate in a lower Smith set, that is, is ranked above that candidate by more voters than vice versa. So if a ranking puts a candidate in the n-th Smith set after some candidate in a lower Smith set, swapping the two reduces the distance contributed by that pair. More generally, minimally modifying any non-Smith set ranking so that it becomes a Smith set ranking always reduces the distance score.

See Pairwise sorted methods, which do what is called "Local Kemenization" to produce a ranking, while being cloneproof.

External links

Some text of this article is derived with permission from Electoral Methods: Single Winner.

References

[1] Young, Peyton (1995-02-01). "Optimal Voting Rules". Journal of Economic Perspectives. American Economic Association. 9 (1): 51–64. doi:10.1257/jep.9.1.51. ISSN 0895-3309.
[2] Tang, Wenpin (2018-08-26). "Mallows Ranking Models: Maximum Likelihood Estimate and Regeneration". arXiv:1808.08507 [math.ST].
[3] Mathieu, Claire; Mauras, Simon (2020). "How to aggregate Top-lists: Approximation algorithms via scores and average ranks". Proceedings of the 2020 ACM-SIAM Symposium on Discrete Algorithms (SODA). pp. 2810–2822. doi:10.1137/1.9781611975994.171.
[4] Ailon, Nir; Charikar, Moses; Newman, Alantha (2008). "Aggregating Inconsistent Information: Ranking and Clustering". Journal of the ACM. 55 (5). New York, NY, USA: Association for Computing Machinery. doi:10.1145/1411509.1411513. ISSN 0004-5411.
[5] Zuylen, Anke van; Williamson, David P. (2009). "Deterministic Pivoting Algorithms for Constrained Ranking and Clustering Problems". Mathematics of Operations Research. INFORMS. 34 (3): 594–620. eISSN 1526-5471. ISSN 0364-765X. JSTOR 40538434. Retrieved 2022-04-25.
