We would expect that if both opinions are already similar, not a lot of convincing is required. We would also expect that the further apart the "opinion units" need to be relocated, the more difficult it is to change someone's opinion. |

In mathematics and engineering, this is a well-studied problem of [[w:Transportation theory (mathematics)|optimal transport]], and it has found uses everywhere, from artificial intelligence to traffic management.

The intuitive notion of how "difficult" it is to convince someone to believe something else, piece by piece, is captured by the '''''[[w:earth-mover's distance|earth-mover's distance]]''''' ('''''EMD''''') between two distributions. It is, intuitively, the least amount of effort you would need to rearrange one pile of dirt into another pile of dirt.

If you replace "dirt" with "opinion unit", you'll immediately arrive at our idea here. |
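
As a quick, concrete illustration, here is a minimal sketch: the bins and the "piles" below are invented for the example, and it leans on SciPy's <code>wasserstein_distance</code>, the same routine used by the full script further down.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import wasserstein_distance

# Five bins of "opinion units" on a belief axis from -1 to +1 (toy values)
bins = np.linspace(-1, 1, 5)
pile_a = np.array([0.0, 0.1, 0.8, 0.1, 0.0])  # opinion concentrated near 0
pile_b = np.array([0.0, 0.2, 0.6, 0.2, 0.0])  # similar shape: little effort to rearrange
pile_c = np.array([0.8, 0.2, 0.0, 0.0, 0.0])  # mass far to the left: much more effort

print(wasserstein_distance(bins, bins, pile_a, pile_b))  # small distance (0.1)
print(wasserstein_distance(bins, bins, pile_a, pile_c))  # much larger distance (0.9)
</syntaxhighlight>
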
Note that different issues are never compared with one another here. Only opinions on the same issue count towards each term. |

It should be clear that if the distributions are infinitely sharp (i.e. [[w:Dirac delta function|Dirac deltas]]), the earth-mover's distance is simply the distance between those two sharp peaks. In this way, we recover the traditional spatial model from our model.
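
This limiting case is easy to check numerically. In the hedged sketch below, the closest thing to a Dirac delta on a discrete grid is a distribution with all of its mass in a single bin; the grid and peak positions are chosen arbitrarily for the example.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import wasserstein_distance

bins = np.linspace(-1, 1, 11)           # bin positions, spaced 0.2 apart
delta_1 = np.zeros(11); delta_1[2] = 1  # all mass at -0.6
delta_2 = np.zeros(11); delta_2[8] = 1  # all mass at +0.6

print(wasserstein_distance(bins, bins, delta_1, delta_2))  # 1.2 == |(+0.6) - (-0.6)|
</syntaxhighlight>
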
=== Why the Euclidean distance anyway? === |

We could have considered other distances (or "metrics") in the same way, but why single out the Euclidean one? Why not use
:<math>d(a,b) = \sum_{i=1}^{N} \text{EMD}(a_i(x),b_i(x)) \, ?</math>
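
In code, this candidate distance is just a sum of per-issue earth-mover's distances. The sketch below assumes each voter is represented as a list with one opinion distribution per issue; the names <code>total_distance</code>, <code>voter_a</code> and <code>voter_b</code> are ours, for illustration. Note how, as stated above, only distributions on the same issue are ever compared.

<syntaxhighlight lang="python">
import numpy as np
from scipy.stats import wasserstein_distance

_space = np.linspace(-1, 1, 11)  # shared bin positions for every issue

# Earth-mover's distance between two distributions over the same bins
def EMD(dist1, dist2):
    return wasserstein_distance(_space, _space, dist1, dist2)

# d(a,b) = EMD(a_1,b_1) + EMD(a_2,b_2) + ... + EMD(a_N,b_N)
def total_distance(voter_a, voter_b):
    return sum(EMD(a_i, b_i) for a_i, b_i in zip(voter_a, voter_b))
</syntaxhighlight>
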
The following Python code generates the trapezoid distribution for a given opinion, with belief from -1 to +1, and importance from 0 to 1. |
<syntaxhighlight lang="python"> |
import numpy as np
from scipy.stats import wasserstein_distance

L = 5      # bin resolution (number of degrees of agreement/disagreement)
W = 2*L+1  # total number of bins, an odd number so we have a clean zero

_space = np.linspace(-1,1,W)  # array with positions of the bins

# Earth-mover's distance (or Wasserstein distance) between two opinion distributions
def EMD(dist1, dist2):
    return wasserstein_distance(_space, _space, dist1, dist2)

# Generates an opinion distribution as a truncated & bounded trapezoidal distribution
# belief: from -1 to +1
# importance: from 0 to 1
def opinion(belief, importance):
    w = (1 - importance)*(W-1) + 1  # plateau width: narrower for more important opinions
    v = (w+1)/2 - L*abs(np.linspace(-1,1,W) - belief*importance)
    v[v < 0] = 0  # truncate below
    v[v > 1] = 1  # bound above
    v /= sum(v)   # normalize so the opinion units sum to 1
    return v

# Visualize distribution with text blocks
def diststr(op):
    return "".join(["_▁▂▃▄▅▆▇█"[int(r/max(op)*7)] for r in op])  # we go up to 7 because the full block looks bad piled up

# Show some distributions generated
for w in range(1, W+1):
    c = 1 - (w-1)/(W-1)
    for x in range(-L, L+1):
        op = opinion(x/L, c)
        print("(%+0.03f|%0.03f)" % (x/L, c), diststr(op))

# Print a few distances
for i in range(10):
    i1, i2 = np.random.rand(), np.random.rand()          # random importances in [0, 1]
    b1, b2 = np.random.rand()*2-1, np.random.rand()*2-1  # random beliefs in [-1, +1]
    o1 = opinion(b1, i1)
    o2 = opinion(b2, i2)
    print(
        "(%+0.03f|%0.03f)" % (b1, i1), diststr(o1),
        "vs",
        diststr(o2), "(%+0.03f|%0.03f)" % (b2, i2),
        "= %0.04f" % EMD(o1, o2))
</syntaxhighlight> |