Pairwise Least Squares

Pairwise least squares is a method that assigns a score to each candidate based on voters' pairwise preferences.
 
=== Summary ===
 
Least squares estimation is a technique from linear algebra that finds the best approximate solution to a system of equations that has no exact solution.
A pairwise listing of preferences records, for each pair of candidates, the number of voters who prefer one candidate to the other.
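For example, in a hypothetical three-candidate election the pairwise listing can be arranged in a matrix <math>d</math>, where entry <math>d_{ik}</math> is the number of voters who prefer candidate <math>i</math> to candidate <math>k</math>:

:<math>d = \begin{bmatrix} 0 & 7 & 4 \\ 3 & 0 & 6 \\ 6 & 4 & 0 \end{bmatrix}</math>

Here 7 voters prefer candidate 1 to candidate 2, while 3 voters prefer candidate 2 to candidate 1.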
 
Applying least squares to pairwise preferences produces a score for each candidate. The original preferences are then approximated by taking the difference between any two candidates' scores. This approximation is a projection in the space of pairwise preferences: the projection subtracts away any cyclic group preference component and minimizes the sum of squared errors in the preference counts.
These scores turn out to be essentially the same as Borda scores.

=== Math ===

Let the election be modeled as a network. Each node in the network is a candidate to be assigned a score.

Let <math>\hat{\boldsymbol{x}}</math> be the scores to be assigned to the candidates.

Let <math>\boldsymbol{A}</math> be the incidence matrix of the network, with a row for each preference and a column for each candidate. Every pair of nodes is connected by two directed edges, each representing preference for one candidate over the other. For each preference, put a 1 in the column of the "for" candidate and a -1 in the column of the "against" candidate. Leave the rest as 0.

Let <math>\boldsymbol{y}</math> be the tally of preferences: the number of voters who support each preference.

The problem is to find scores <math>\hat{\boldsymbol{x}}</math> for a given tally of preferences <math>\mathbf y</math>:

:<math>\mathbf{A}\hat{\boldsymbol{x}} = \mathbf y.</math>
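For example, with three candidates the six ordered pairs, taken in the order (1,2), (1,3), (2,1), (2,3), (3,1), (3,2), give the incidence matrix

:<math>\mathbf{A} = \begin{bmatrix} 1 & -1 & 0 \\ 1 & 0 & -1 \\ -1 & 1 & 0 \\ 0 & 1 & -1 \\ -1 & 0 & 1 \\ 0 & -1 & 1 \end{bmatrix},</math>

and the hypothetical tallies above become <math>\mathbf y = (7, 4, 3, 6, 6, 4)^{\operatorname{T}}</math>.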
 
=== Approximation ===
 
Generally, the problem is unsolvable, because there can be cyclic group preferences. Cyclic preferences are impossible when the preferences <math>\mathbf y</math> are exact differences in score, <math>\mathbf{A}\hat{\boldsymbol{x}}</math>, since score differences around any cycle sum to zero.

However, we can find a solution <math>\hat{\boldsymbol{x}}</math> to a transformed equation, the normal equation:

:<math>\mathbf{A}^{\operatorname{T}} \mathbf{A} \hat{\boldsymbol{x}} = \mathbf{A}^{\operatorname{T}} \mathbf y.</math>

This new <math>\hat{\boldsymbol{x}}</math> is an approximation of the scores that best fits the network model.
 
=== Solution ===
 
There are actually many solutions, because the same constant could be added to every candidate's score. So we set the score of the first candidate to 0 in order to solve for one particular solution, which amounts to dropping the first column of <math>\boldsymbol{A}</math>:

:<math>\mathbf{A_0} = \mathbf{A}[:,1:].</math>

The solution is:

:<math>\hat{\boldsymbol{x}} = \left( \mathbf{A_0}^{\operatorname{T}} \mathbf{A_0} \right)^{-1} \mathbf{A_0}^{\operatorname{T}} \mathbf y.</math>
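As a minimal numerical sketch of this formula (assuming NumPy and reusing the hypothetical three-candidate example above; the full implementation below builds these matrices programmatically):

<syntaxhighlight lang="python3">
import numpy as np

# Incidence matrix for 3 candidates, rows ordered (1,2),(1,3),(2,1),(2,3),(3,1),(3,2)
A = np.array([[ 1, -1,  0],
              [ 1,  0, -1],
              [-1,  1,  0],
              [ 0,  1, -1],
              [-1,  0,  1],
              [ 0, -1,  1]], dtype=float)
y = np.array([7, 4, 3, 6, 6, 4], dtype=float)  # hypothetical tallies

A0 = A[:, 1:]                                  # fix candidate 1's score at 0
x_hat = np.linalg.solve(A0.T @ A0, A0.T @ y)   # normal equation
print(x_hat)                                   # scores of candidates 2 and 3 relative to candidate 1
</syntaxhighlight>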
 
=== Projection ===
 
Multiplying by <math>\mathbf{A}^{\operatorname{T}}</math> ignores any component of <math>\mathbf y</math> that lies in the null space of <math>\mathbf{A}^{\operatorname{T}}</math>, and that null space is composed entirely of cyclic group preferences. This is because <math>\mathbf{A}^{\operatorname{T}}</math> lists the edges connected to each node, so <math>\mathbf{A}^{\operatorname{T}} \mathbf v = \mathbf 0</math> for an edge vector <math>\mathbf v</math> is a statement of conservation of flow, and flow that is conserved at every node can only circulate in cycles. Putting the solution of the transformed equation back into the original equation gives a transformed preference vector <math>\mathbf p</math> that has no cycles:

:<math>\mathbf{A_0} \hat{\boldsymbol{x}} = \mathbf p.</math>

The closest we can get to the preferences <math>\mathbf y</math> using the score model is to project <math>\mathbf y</math> into the space of preferences the score model can produce, which has no cyclic component. The projection can be represented by a matrix <math>\mathbf P</math>. The definition below gives <math>\mathbf{P} \mathbf y = \mathbf p</math>:

:<math>\mathbf{P} = \mathbf{A_0} \left( \mathbf{A_0}^{\operatorname{T}} \mathbf{A_0} \right)^{-1} \mathbf{A_0}^{\operatorname{T}}</math>

:<math>\mathbf{P} \mathbf y = \mathbf{A_0} \left( \mathbf{A_0}^{\operatorname{T}} \mathbf{A_0} \right)^{-1} \mathbf{A_0}^{\operatorname{T}} \mathbf y</math>

:<math>\mathbf{P} \mathbf y = \mathbf{A_0} \left( \mathbf{A_0}^{\operatorname{T}} \mathbf{A_0} \right)^{-1} \mathbf{A_0}^{\operatorname{T}} \mathbf{A_0} \hat{\boldsymbol{x}} \qquad \text{(by the normal equation)}</math>

:<math>\mathbf{P} \mathbf y = \mathbf{A_0} \hat{\boldsymbol{x}}</math>

:<math>\mathbf{P} \mathbf y = \mathbf p</math>
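As a quick numerical check (a sketch assuming NumPy and the three-candidate incidence matrix above), a ballot pattern that is a pure cycle should project to zero:

<syntaxhighlight lang="python3">
import numpy as np

# Incidence matrix for 3 candidates, rows ordered (1,2),(1,3),(2,1),(2,3),(3,1),(3,2)
A = np.array([[ 1, -1,  0],
              [ 1,  0, -1],
              [-1,  1,  0],
              [ 0,  1, -1],
              [-1,  0,  1],
              [ 0, -1,  1]], dtype=float)
A0 = A[:, 1:]
P = A0 @ np.linalg.inv(A0.T @ A0) @ A0.T   # projection matrix

# One vote around the cycle 1>2, 2>3, 3>1: edges (1,2), (2,3), (3,1)
cycle = np.array([1, 0, 0, 1, 1, 0], dtype=float)
print(np.allclose(P @ cycle, 0))           # True: the cyclic component is removed
</syntaxhighlight>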
 
=== Least Square Error ===
 
The projected preferences <math>\mathbf p</math> are the closest possible to the actual preferences <math>\mathbf y</math>. In other words, the projection makes the difference <math>\mathbf y - \mathbf p</math> as small as possible:

:<math>\mathbf p = \operatorname{arg\,min}_{\mathbf q \,\in\, C(\mathbf{A_0})} \lVert \mathbf y - \mathbf q \rVert,</math>

where <math>C(\mathbf{A_0})</math> is the column space of <math>\mathbf{A_0}</math>, the set of preferences the score model can produce. Let's call this difference the error <math>\mathbf e</math>:

:<math>\mathbf e = \mathbf y - \mathbf p.</math>

The error <math>\mathbf e</math> is a vector, and the length of a vector is the square root of the sum of the squares of its components. This means the projection also minimizes the sum of the squares of the errors over all preferences:

:<math>\mathbf p = \operatorname{arg\,min}_{\mathbf q \,\in\, C(\mathbf{A_0})} \lVert \mathbf y - \mathbf q \rVert^2.</math>
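Equivalently, the error is orthogonal to every pattern of preferences the score model can produce; this is just the normal equation rearranged:

:<math>\mathbf{A_0}^{\operatorname{T}} \mathbf e = \mathbf{A_0}^{\operatorname{T}} \left( \mathbf y - \mathbf{A_0} \hat{\boldsymbol{x}} \right) = \mathbf{A_0}^{\operatorname{T}} \mathbf y - \mathbf{A_0}^{\operatorname{T}} \mathbf{A_0} \hat{\boldsymbol{x}} = \mathbf 0.</math>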
 
=== Borda ===
 
The matrix <math>\mathbf{A_0}^{\operatorname{T}} \mathbf{A_0}</math> is <math>2\left(n \mathbf{I} - \boldsymbol{1}\right)</math>, where <math>\boldsymbol{1}</math> is the matrix of all ones. The vector <math>\mathbf{A_0}^{\operatorname{T}} \mathbf y</math> holds, for each candidate, the preferences for minus the preferences against, which is the candidate's net Borda score. So the scores <math>\hat{\boldsymbol{x}}</math> are effectively a fixed linear transformation of the net Borda scores <math>\mathbf{A_0}^{\operatorname{T}} \mathbf y</math>:

:<math>\mathbf{A_0}^{\operatorname{T}} \mathbf y = 2\left(n \mathbf{I} - \boldsymbol{1}\right) \hat{\boldsymbol{x}}.</math>
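For example, with the three-candidate incidence matrix above (<math>n = 3</math>):

:<math>\mathbf{A_0}^{\operatorname{T}} \mathbf{A_0} = \begin{bmatrix} 4 & -2 \\ -2 & 4 \end{bmatrix} = 2\left(3\,\mathbf{I} - \boldsymbol{1}\right).</math>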
 
=== Implementation ===
 
<syntaxhighlight lang="python3">
 
import numpy as np

def net(n):
    # Build the edge list, incidence matrix, and solver matrix for n candidates.
    m = n * (n - 1)                        # one directed edge per ordered pair
    A = np.zeros((m, n))                   # incidence matrix
    e = np.zeros((m, 2), dtype=int)        # edge list: (for, against)
    r = 0
    for i in range(n):
        for k in range(n):
            if i == k:
                continue
            A[r, i] = 1                    # "for" candidate
            A[r, k] = -1                   # "against" candidate
            e[r] = (i, k)
            r += 1
    A0 = A[:, 1:]                          # fix the first candidate's score at 0
    L0 = np.linalg.inv(A0.T @ A0) @ A0.T   # least-squares solver: x = L @ b
    z = np.zeros((1, L0.shape[1]))
    L = np.block([[z], [L0]])              # prepend the fixed zero score
    return e, A, L

def edgeFromDense(d, e):
    # Convert a dense preference matrix to an edge-list tally.
    m = e.shape[0]
    b = np.zeros((m, 1))
    for i in range(m):
        b[i, 0] = d[e[i, 0], e[i, 1]]
    return b

def denseFromEdge(e, b, n):
    # Convert an edge-list tally back to a dense preference matrix.
    m = e.shape[0]
    d = np.zeros((n, n))
    for i in range(m):
        d[e[i, 0], e[i, 1]] = b[i, 0]
    return d

def solveNetwork(A, b, L):
    x = L @ b    # least-squares scores, first candidate fixed at 0
    p = A @ x    # projected (acyclic) preferences
    return x, p

def solveNetDense(d):
    n = d.shape[0]
    e, A, L = net(n)
    b = edgeFromDense(d, e)
    x, p = solveNetwork(A, b, L)
    pd = denseFromEdge(e, p, n)
    pd2 = pd + (d + d.T) / 2   # add back the symmetric (pairwise loop) component
    return x, pd, pd2

def test():
    d = np.array([[ 0.,  5.,  2.,  5.],
                  [ 5.,  0.,  4.,  7.],
                  [ 8.,  6.,  0., 10.],
                  [ 5.,  3.,  0.,  0.]])  # dense pairwise preference matrix
    x, pd, pd2 = solveNetDense(d)
    print("Scores")
    print(x)
    print("Borda Scores")
    print(np.sum(d, axis=1))
    print("Preferences")
    print(d)
    print("Projected Preferences")
    print(pd)
    print("Projected Preferences Plus Pairwise Loops")
    print(pd2)
    print("Higher Order (>2) Loops Removed")
    print(d - pd2)
    print("Borda Scores of Projected Preferences Plus Pairwise Loops")
    print(np.sum(pd2, axis=1))

test()
 
</syntaxhighlight>
 
=== External Links ===
 
* [https://www.youtube.com/watch?v=6-wh6yvk6uc&list=PL49CF3715CB9EF31D&index=12 MIT Linear Algebra - Lecture 12 - Graphs, Networks, Incidence Matrices]
* [https://www.youtube.com/watch?v=Y_Ac6KiQ1t0&list=PL49CF3715CB9EF31D&index=15 MIT Linear Algebra - Lecture 15 - Projections onto Subspaces]
* [https://www.youtube.com/watch?v=osh80YCg_GM&list=PL49CF3715CB9EF31D&index=16 MIT Linear Algebra - Lecture 16 - Projection Matrices and Least Squares]