User:Lucasvb/An upgrade to the spatial model of voters: Difference between revisions
Revision as of 22:08, 28 July 2020
= An upgrade to the spatial model of voters =
In the following article, I'll explain
I believe this model encapsulates many important aspects that have been missing from most analyses so far. But remember, it's still just a ''model''.
Now suppose you go around and ask people their opinion regarding many such issues, under those specific framings, and that people answer you honestly and accurately to the best of their knowledge. (We can't really expect more than that, as we can't read people's minds.)
At the very least, we would like to know whether they agree or disagree with the statement. But we could also ask how strongly they feel about that position, how certain they are of it, and how important they feel it is to hold that position.
One way to convert this into a numerical scale is by considering the following
* '''Belief''': we create a scale from "completely disagree" (-1) to "completely agree" (+1).
* '''Importance''': we create a scale from "completely indifferent" (0%) to "very important" (100%).
We can group
Everyone answering this quiz gets assigned an opinion on every one of these issues.
This kind of model has been used extensively in political polls for decades. The popular website [https://isidewith.com/ I Side With] uses a very similar model.
Of course, there's the question of how we can treat similar answers as compatible. It is possible to formally justify this, but that's a much deeper discussion. Since this is just a justification of a mathematical model for simulations, we don't really need to worry about it too much.
==== Stances ====
We would expect that if both opinions are already similar, not a lot of convincing is required. We would also expect that the further apart the "opinion units" need to be relocated, the more difficult it is to change someone's opinion.
In mathematics and engineering, this is a well-studied problem of [
The intuitive notion of how "difficult" it is to convince someone to believe something else, piece by piece, is captured by the '''''[
If you replace "dirt" with "opinion unit", you'll immediately arrive at our idea here.
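To make the "moving opinion units" picture concrete, here is a minimal sketch of the earth-mover's distance between two binned 1D distributions. It assumes both distributions live on the same evenly spaced bins, in which case the distance reduces to the accumulated difference between the two running totals. The function name <tt>emd_1d</tt> and the <tt>spacing</tt> parameter are my own choices for illustration.

```python
import numpy as np

def emd_1d(p, q, spacing=1.0):
    """Earth-mover's distance between two histograms on the same evenly spaced bins."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    # Normalize so both hold one total "opinion unit".
    p = p / p.sum()
    q = q / q.sum()
    # The running surplus at each bin is the mass that must cross to the next bin.
    return float(np.abs(np.cumsum(p - q)).sum() * spacing)

a = [1, 0, 0, 0, 0]   # all mass in the first bin
b = [0, 0, 0, 0, 1]   # all mass in the last bin
c = [0, 0, 1, 0, 0]   # all mass in the middle bin

print(emd_1d(a, b))   # 4.0: one unit moved across four bins
print(emd_1d(a, c))   # 2.0: one unit moved across two bins
```

Moving everything twice as far costs twice as much, which matches the intuition that opinions further apart take more effort to reconcile.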
* The distance is symmetric and unbiased. It takes the same amount of effort to change one distribution into the other, and vice versa.
:'''Note on terminology''': The name "importance" is mostly motivated by how the width of the distribution results in more willingness to compromise/make sacrifices, and determines a certain "zone of comfort" for the voter. Perhaps "importance" here should be instead interpreted as "certainty", which makes more sense given a "width of belief". Importance could then be included as a third parameter, maybe a scaling factor for each axis, changing the EMD by a factor. But it seems weird to say someone "completely agrees" but is 50% certain of it. Regardless of what we call it, the "width of the distribution" seems like a good approach.
== Comparing stances ==
Note that different issues are never compared with one another here. Only opinions on the same issue count towards each term.
It should be clear that if the distributions are infinitely sharp (i.e. [
=== Why the Euclidean distance anyway? ===
We could have considered other distances (or "metrics") in the same way, but why single out the
:<math>d(a,b) = \sum_{i=1}^{N} \text{EMD}(a_i(x),b_i(x)), </math>
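As a rough sketch of how per-issue distances combine into a distance between stances, the following compares the plain sum above with a Euclidean-style combination. I'm assuming here that the Euclidean version is the square root of the sum of squared per-issue EMDs. The helper names and the toy stances are hypothetical.

```python
import numpy as np

def emd_1d(p, q, spacing):
    """Earth-mover's distance between two histograms on the same evenly spaced bins."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.abs(np.cumsum(p / p.sum() - q / q.sum())).sum() * spacing)

def stance_distance_sum(a, b, spacing):
    """Sum of per-issue EMDs; issue i of a is only ever compared with issue i of b."""
    return sum(emd_1d(ai, bi, spacing) for ai, bi in zip(a, b))

def stance_distance_euclidean(a, b, spacing):
    """Assumed Euclidean combination: square root of the sum of squared per-issue EMDs."""
    return float(np.sqrt(sum(emd_1d(ai, bi, spacing) ** 2 for ai, bi in zip(a, b))))

# Two issues, five bins each on the belief axis [-1, +1], so bins are 0.5 apart.
spacing = 0.5
voter     = [[1, 0, 0, 0, 0], [1, 0, 0, 0, 0]]
candidate = [[0, 0, 0, 1, 0], [0, 0, 0, 0, 1]]

print(stance_distance_sum(voter, candidate, spacing))        # 3.5
print(stance_distance_euclidean(voter, candidate, spacing))  # 2.5
```

The per-issue distances here are 1.5 and 2.0, so the two combinations differ only in how those numbers are put together.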
== Benefits & remarks ==
One major benefit of this approach is that we now have a direct way to embed mutual importance into our model of voters, as well as a notion of "fuzziness" to the opinions. The width of the distribution can be used as a proxy for a voter's willingness to compromise or make sacrifices on an issue.
A low-importance opinion is a wider distribution, which means it has a smaller distance to other opinions than a sharp one. So a voter with low importance on an issue effectively sees that axis as "compressed", that is, distances are shorter along that axis. On the other hand, a voter with high importance on an issue will perceive differences more strongly, seeing that axis as "stretched", that is, distances are perceived as larger.
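A small numerical check of this compression effect, under some assumptions of mine: the opinion is a trapezoid centered at belief times importance, and its width grows linearly from one bin (importance 1) to the full axis (importance 0). That width mapping, the grid size, and the helper names are illustrative choices, not a fixed part of the model.

```python
import numpy as np

L = 10                        # bins on each side of the belief axis (example value)
W = 2 * L + 1                 # total number of bins
grid = np.linspace(-1, 1, W)  # bin positions

def opinion(belief, importance):
    """Trapezoidal opinion; the linear width mapping below is an assumption."""
    w = 1 + (1 - importance) * (W - 1)
    v = np.clip((w + 1) / 2 - L * np.abs(grid - belief * importance), 0, 1)
    return v / v.sum()

def emd_1d(p, q):
    """Earth-mover's distance on the shared grid, via running totals."""
    return float(np.abs(np.cumsum(p - q)).sum() * (grid[1] - grid[0]))

target = opinion(-1.0, 1.0)   # someone who completely disagrees and cares a lot
sharp  = opinion(+1.0, 1.0)   # completely agrees, high importance
fuzzy  = opinion(+1.0, 0.2)   # completely agrees, low importance

d_sharp = emd_1d(sharp, target)
d_fuzzy = emd_1d(fuzzy, target)
print(d_sharp, d_fuzzy)       # the low-importance voter ends up closer to the target
```

Both voters "completely agree", but the one who cares less sits noticeably closer to the opposing opinion.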
In this way, each voter has their own perception of how important each issue is, and this is accounted for when computing the distance between different stances. This model of distance also naturally captures the correlations between multiple issues due to this scaling, and the effect of voters and candidates giving different, incompatible importance to issues. (A simple scaling factor wouldn't capture this, as it would be agnostic to the target voter/candidate, so there would be no degree of correlation due to opinion compatibility. But a scaling factor on top of the distributions would add an even greater degree of flexibility.)
With the Euclidean distance, and how we embedded the different priorities voters have on multiple issues in our model, we now have a unified model which can naturally deal with voters having strong ideals, degrees of compromising, etc. We could even model the dynamics of voters by using the notion of "effort to move around opinion units".
Note that there's still a distance between someone who is indifferent and anyone with
The earth-mover's distance obeys many nice properties which preserve important features we want in this space of opinions, like a notion of partial orders which is required to rank compatibility across an issue in a consistent way, compatible with the overall geometric structure in multiple issues.
== Implementation ==
In practice, it's up to us to determine how sharply-peaked the distributions can get.
The earth-mover's distance is simple and efficient to compute in a discrete 1D case, where the distribution is defined in a number of bins. This makes it readily available in many software packages.

As a first approximation, it is helpful to model the distribution as a simple trapezoidal distribution, instead of a normal distribution.
In my simulations, I've defined an integer parameter <tt>L</tt>, the resolution of one side of the belief axis. In order to make 0 a valid belief, it is best to use an odd number of bins, so the total number of bins is given by <tt>W = 2*L+1</tt>.
The following Python code generates the trapezoid distribution for a given opinion, with belief from -1 to +1, and importance from 0 to 1.
<syntaxhighlight lang="python">
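import numpy as np
from scipy.stats import wasserstein_distance

L = 10                          # resolution of one side of the belief axis (example value)
W = 2*L + 1                     # total number of bins, odd so that 0 is a valid belief
_space = np.linspace(-1, 1, W)  # bin positions along the belief axis

def distance(dist1, dist2):  # function name is assumed; the original was lost
    return wasserstein_distance(_space, _space, dist1, dist2)

def opinion(belief, importance):
    w = 1 + (1 - importance)*(W - 1)  # assumed mapping from importance to width in bins
    v = np.clip((w+1)/2 - L*abs(_space - belief*importance), 0, 1)  # clipping is assumed
    v /= sum(v)  # normalize to a probability distribution
    return v

for _ in range(5):  # number of random trials is arbitrary
    b1, b2 = np.random.rand()*2-1, np.random.rand()*2-1
    i1, i2 = np.random.rand(), np.random.rand()
    o1 = opinion(b1, i1)
    o2 = opinion(b2, i2)
    print((b1, i1), "vs", (b2, i2), "->", distance(o1, o2))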
</syntaxhighlight>