Talk:Space of possible elections

From electowiki

Some remarks

I couldn't find this analysis done anywhere else, and since I could use some combinatorics exercises I decided to give it a shot. As far as I know this is accurate, but corrections are welcome.

One thing I'd like to highlight, given this analysis, is that the impartial culture model of simply drawing ballots at random does not seem to be good enough to sample the space of possible elections uniformly.

The exchange symmetry between voters and the fact most possible partitions of voters into identical-ballot groups is highly non-uniform means the typical impartial culture model will sample a region of this space which is exceptionally void of interesting structure. If you have n voters, the vast majority of partitions will divide voters into 0.2n - 0.4n distinct groups (to be precise, the number of groups will follow a Gumbel distribution).

The interesting structure is in the lower end of the partitions. So, for 100 voters, if voter partitions are drawn at random most elections will divide the voters into 20-40 unique ballot groups. In contrast, the typical impartial culture model would create as many groups as possible and all with roughly equal number of voters, focusing on the higher end of this distribution of number of groups.

A sampling model which takes the distribution of partitions into account would more adequately sample this space, and biasing towards smaller partition sizes (due to voter and candidate correlations) would be more likely to reproduce results useful for estimating real-life elections. lucasvb (talk} 21:26, 24 July 2020 (UTC)

Speeding up computation

This is a pretty broad section, but I'll mention specifically how you can reduce the number of valid ballots to consider. We discussed earlier that scored ballots can be boiled down to their margins i.e. A:5 B:3 C:3 is equivalent to A:2 B:0 C:0. So that could greatly reduce the number of cardinal ballots to consider (unless you're dealing with write-in candidates). Something to note is that if you assume normalization, then this trick no longer works i.e. you can't raise or lower the scores of all candidates by the same amount because at least one candidate is at the max score, and at least one other at the min score. The exception is if all candidates are scored the same, since A:5 B:5 C:5 and A:4 B:4 C:4 are both, in some sense, normalized yet equivalent.

Later on I might add some content or links to this Talk page describing how to figure out the number of possible rated pairwise preference ballots there are, though it depends on what kind of transitivity rules you impose on the system. BetterVotingAdvocacy (talk) 21:15, 24 July 2020 (UTC)

Yes, this is a good point, although the applicability (whether the ballots are equivalent) really depends on the system in question. I'd expect many systems to produce additional constraints in the number of elections. It is also not difficult to estimate the number of ballots with normalization, I might add that in later. lucasvb (talk} 21:29, 24 July 2020 (UTC)
Regarding number of ballots with normalization, I think it can be calculated in the following manner. For c candidates and r possible scores, we can think of it as looking for every possible pair of candidates who are to be min/maxed, multiply this by 2 (since either the first candidate in the pair is min'ed and the second max'ed, or vice versa), and then for the remaining candidates not in the selected pair, they can be scored in any permutation of ways, so they are r^(c-2). So this altogether is: 2 * (c choose 2) * r^(c-2). BetterVotingAdvocacy (talk) 23:32, 25 July 2020 (UTC)
That seems accurate if the normalize is done by the voter. If the normalization is done automatically it would be different, as you'd get rational numbers for the votes. In that case one must consider all possible intervals between the min/max voters and all combinations of scores within the interval. Then we must also be careful to remove degenerate cases. For 0-5 ratings (6 levels) we get 3, 61, 579, 4381, 30243 for c=2,3,4,5,6..., for 0-9 (10 levels) 3, 169, 2883, 38041... Closed form seems a bit trickier than I thought, but doable. lucasvb (talk} 00:06, 26 July 2020 (UTC)