Limitations of spatial models of voting

[[Spatial models of voting]] are ubiquitous in the theoretical study and simulation of voting methods. This article describes many '''limitations of spatial models of voting'''.
 
In [[Spatial model of voting|spatial models]] of agent behavior, agents (e.g. voters, candidates) are placed in an abstract geometric space, usually Euclidean, in which each dimension denotes some ideological alignment or opinion on an issue. The behavior of agents is modeled by how "close" (under some appropriate metric) they are to other agents in this space. In the context of voting, voters are modeled as ranking candidates according to their proximity to each candidate within this space.
 
However, models based too strictly on geometric representations have challenges representing both voters and candidates. This article describes some of the challenges.
 
== Number of dimensions ==
The number of dimensions chosen for this geometric embedding imposes fundamental restrictions on the allowed number of candidates. There is a limited number of dimensions that may be effectively distinguished by the voters using ballots, as the space can only be divided into a finite number of regions, one for each ranking of the candidates that can occur. Conversely, an insufficient number of candidates on a ballot (either because few candidates are running or because the ballot is arbitrarily restricted) will also fundamentally restrict the effective opinion space voters can express, as the effective dimensionality is inherently reduced.
 
The rest of this article discusses this limitation and some of its implications. The specific numerical results below assume a Euclidean space and Euclidean distances, but similar qualitative arguments apply to any spatial model and chosen metric, as well as to the actual real-life behavior of voters (although quantifying it is impossible).
 
==How many ballots could voters ''actually'' cast?==
 
With <math>n</math> candidates in an election, be it rated or ranked, there are <math>n!</math> possible rankings of the candidates. These <math>n!</math> possible preferences are all the ''distinctions'' voters could ever make between alternatives (no matter ''how'' those distinctions are made).
 
This is true independently of any abstract mathematical model of reality or of human behavior, as it is a constraint of the ballots themselves and their information content. In reality, these distinctions are based on some internal attributes and judgements voters have about the world and the candidates, and this is the information voters want ballots to convey. This is what voting methods attempt to ''represent'' from voters.
 
But due to several limitations, not all of these ballots can ''actually get cast'' in an election. In practice, we only observe a few preference orders, indicating that there is a lot of correlation between voters and between candidates, or, to put it another way, that the "space" of attributes relevant in the election is smaller than the one expressible by the ballots. This is important to consider when developing a mathematical structure to abstractly discuss real voting methods and voter behavior.
 
In other words, while a spatial model attempts to reverse engineer real-life behaviors and construct a model of the information underlying an election, the ballots themselves, be it from real life elections or computer simulations, can only capture some of the information.
 
== Mathematics of a spatial model ==
 
[[File:Maximum Voronoi regions 2D.svg|thumb|For d=2 dimensions and n=3 candidates (ABC), there is a region in the space for each of the 3! = 6 possible rankings between the candidates, so no information is lost: all possible opinion distributions and ballots can exist.]]
In a <math>d</math>-dimensional spatial model for voter behavior, in which voters judge candidates in terms of proximity using <math>d</math> separate attributes (no matter ''how'' such attributes are used), there is a fundamental mathematical limit for how many ballots can possibly occur, in any arbitrary distribution of voters and candidates. (Equalities or partial rankings do not matter in this analysis, as they can be included in the same space with minimal adjustment.)
 
[[File:Voronoi regions 2D 4 candidates.svg|thumb|With a fourth candidate there are 4! = 24 possible rankings, but it's impossible to partition the space (under Euclidean metric) into more than 18 regions, one example as shown here. Therefore, many of the rankings cannot occur under this 2-dimensional model, e.g., any ballot with D ranked last, in the image. For 3 dimensions, we can construct all of the 24 required regions for the ballots.]]
This restriction is less about the existence of an actual "Euclidean space of opinions" in the abstract (i.e. the accuracy of our chosen ''models'') than about how candidates could ''ever'' be classified by voters in terms of a finite set of attributes. ''Any'' comparison voters actually make between two candidates must rest on at least ''one'' attribute that differs between them and that can be used to classify the voter's preference one way or the other. The dimension <math>d</math> quantifies how many such attributes must exist in order for us to observe a given set of ballots. Thus, this is a very real and fundamental limitation of any realistic and operational description of voter behavior.
 
Note, however, that '''opinion space''' is distinct from '''ballot space'''. Opinion space is what contains the actual distribution of voters and candidates, and this may have any number of dimensions, voters and candidates. In contrast, ballot space is the [[Space_of_possible_elections|space of possible ballots that voters can cast]], which confines them to express their opinions in a particular way. One can think of ballot-casting as a function that takes a voter's opinion and that of the candidates (plus additional external factors), and produces a ballot: <math>\text{ballot} = f(\text{voter opinion} \mid \text{distribution of candidates}, \text{external factors})</math>. This article refers to the limitations of this function, that is, how much information about ''opinion space'' can in principle survive inside ''ballot space''.
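As an illustration of the ballot-casting function described above, the following is a minimal sketch for the simplest case: a purely Euclidean opinion space with no external factors. The function name <code>cast_ranked_ballot</code> and the example coordinates are hypothetical and chosen only for illustration; they do not come from any particular simulation library.

```python
from math import dist


def cast_ranked_ballot(voter, candidates):
    """Map a point in opinion space to a ranked ballot.

    voter:      tuple of coordinates, one per opinion dimension.
    candidates: dict mapping candidate name -> tuple of coordinates.
    Returns the candidate names ordered from nearest to farthest.
    Assumption: Euclidean distance, no external factors; ties keep
    the dictionary's insertion order.
    """
    return sorted(candidates, key=lambda name: dist(voter, candidates[name]))


# Hypothetical 2-dimensional example with three candidates.
candidates = {"A": (0.0, 0.0), "B": (1.0, 0.0), "C": (0.0, 2.0)}
ballot = cast_ranked_ballot((0.2, 0.1), candidates)
print(ballot)  # ['A', 'B', 'C']
```

Whatever the number of coordinates in <code>voter</code>, the output is only an ordering of the candidate names, which is the sense in which opinion space is compressed into ballot space.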
 
To address this problem, a specific metric space has to be chosen. For the Euclidean case, this dimensional dependence was already addressed by Good and Tideman in 1977<ref>[https://www.sciencedirect.com/science/article/pii/0097316577900772 Stirling numbers and a geometric structure from voting theory (Good and Tideman, 1977)]</ref> (see also Zaslavsky<ref>[https://link.springer.com/content/pdf/10.1007/s00454-001-0073-4.pdf Perpendicular Dissections of Space, Thomas Zaslavsky]</ref> and the similar idea of the Vapnik–Chervonenkis dimension<ref>[https://en.wikipedia.org/wiki/Vapnik%E2%80%93Chervonenkis_dimension Vapnik–Chervonenkis dimension]</ref>). Unfortunately, this result and its fundamental implications for the field of voting theory have gone underappreciated.
 
With these mathematical results, it is possible to infer the minimum dimensions of any real life ranked election or ballot scenario, and maybe even infer whether enough candidate diversity was present. For <math>n</math> candidates and <math>d</math> dimensions, the following table shows the absolute maximum number of ballots that ''any'' distribution of voters and candidates could possibly generate if voters are using those <math>d</math> dimensions to classify the candidates.
==Dimensional resolution of a ballot ==
 
The table also informs us about the limitations of a voting method to really convey the information voters are using to classify the candidates. For a given number of candidates, after a given dimension any further dimensions will not add any extra resolution, as the ballot cannot express such information. This is why the rows in the table stop, as we would have maxed out the possible ballots.
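The saturation described here can be checked numerically. The sketch below assumes the Good and Tideman result in the following form: for <math>n</math> candidates in general position in <math>d</math> Euclidean dimensions, the maximum number of distinct strict rankings is the sum of the unsigned Stirling numbers of the first kind <math>c(n, n-k)</math> for <math>k = 0, \dots, d</math>. The function names are illustrative, not taken from any library.

```python
from functools import lru_cache
from math import factorial


@lru_cache(maxsize=None)
def unsigned_stirling1(n, k):
    """Unsigned Stirling number of the first kind c(n, k)."""
    if n == 0 and k == 0:
        return 1
    if n == 0 or k == 0:
        return 0
    # Standard recurrence: c(n, k) = c(n-1, k-1) + (n-1) * c(n-1, k)
    return unsigned_stirling1(n - 1, k - 1) + (n - 1) * unsigned_stirling1(n - 1, k)


def max_rankings(n, d):
    """Maximum number of strict rankings of n candidates in d dimensions
    (assumed formula: sum of c(n, n-k) for k = 0..d, capped at k = n-1)."""
    return sum(unsigned_stirling1(n, n - k) for k in range(min(d, n - 1) + 1))


def saturation_dimension(n):
    """Smallest d at which all n! rankings become possible."""
    d = 0
    while max_rankings(n, d) < factorial(n):
        d += 1
    return d


for n in range(2, 6):
    print(n, [max_rankings(n, d) for d in range(1, n)])
```

Under this assumption, <code>max_rankings(4, 2)</code> gives 18 and <code>max_rankings(4, 3)</code> gives 24, matching the figures quoted in the image captions, and <code>saturation_dimension(n)</code> returns <math>n-1</math> for the values tried, so dimensions beyond <math>n-1</math> add no resolution.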
 
The practical effect is to force each voter to "collapse" their ideological space to at most a certain number of dimensions (i.e. political issues), namely the dimension at which their ballots saturate. Furthermore, this "collapse" is entirely determined by the candidates themselves, not the voters, which further distorts the information collected and lowers representativeness.
 
To interpret this, we consult the table once more. If there are d=4 important issues voters are using to judge candidates, then we require ''at least'' 5 candidates to potentially allow voters to account for all possible political positions in an election. Likewise, when only n=2 candidates exist, no dimension or attribute beyond d=1 adds any resolution. In other words, the entire ideological space of each voter collapses into one dimension. This is, effectively, the problem of two-party domination and single-issue voting.
 
These observations also have important implications for specific voting methods. An [[Instant-runoff Voting]] election limited to top-three rankings fundamentally limits what sort of ideological distributions can be conveyed, no matter how many candidates are running.
 
From the table above, we see that if every voter is forced to rank only 3 candidates, then every voter can only express information about at most two relevant issues in their ballot<ref>There's at least one extra dimension, because a voter has to classify which are the "top three" candidates, so there has to be a "line" separating these three candidates from everyone else.</ref>, as additional issues cannot distinguish the 3 ranked candidates any further. Even if voters are in fact ranking the candidates based on many other things, this information cannot fit into the ballot and is fundamentally lost. It is functionally equivalent to a scenario where voters are forced to use only two attributes to judge their candidates.
 
If ranked ballots are constrained to <math>k</math> out of <math>n</math> candidates, the population, as a whole, can only cast <math>\frac{n!}{(n-k)!}</math> distinct ballots, which means the voting method "mixes" the information multiple voters expressed, as each voter is using a different subset of attributes in their ballots. Thus, there are no guarantees that all the voters are expressing information about the same issues in their ballots, and the ballots cease to be informationally commensurable, even in principle. In effect, we are left to simply hope that their priorities are, on average, similar, so as to restore commensurability.
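As a quick arithmetic check of the count above, the following sketch compares the number of distinct truncated ballots with the number of full rankings. The function name <code>truncated_ballot_count</code> is hypothetical; <code>math.perm</code> is the Python standard-library function (3.8+) that computes <math>\frac{n!}{(n-k)!}</math>.

```python
from math import factorial, perm


def truncated_ballot_count(n, k):
    """Number of distinct ballots that rank exactly k of n candidates,
    i.e. n! / (n - k)!."""
    return perm(n, k)


n = 6  # illustrative number of candidates
for k in range(1, n + 1):
    print(k, truncated_ballot_count(n, k), factorial(n))
```

For example, with 5 candidates and a top-three ballot there are 5 × 4 × 3 = 60 distinct ballots, compared with 5! = 120 full rankings.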