Methods of Analysis

A wide range of statistical methods is available for analyzing electoral data. Researchers of electoral statistics proposed the following categories at the "Round Table of Mathematicians" held as part of the Winter School of Observers (Poland, December 2017):

 

Unimodality

The distribution of a random variable is often unimodal, that is, it has a single peak. Examples of unimodal distributions are the Gaussian and the Poisson distributions. In voting data, the turnout, the votes for candidates / parties, the invalid ballots, and the number of early votes are usually treated as random variables.

Related articles:
Statistical anomalies in 2011–2012 Russian elections revealed by 2D correlation analysis (D.Kobak, S.Shpilkin, M. Pshenichnikov, 2012)
Field experiment estimate of electoral fraud in Russian parliamentary elections (R.Enikolopov, V.Korovkin, M.Petrova, K.Sonin, and A.Zakharov, 2013)

 

Dependence of results on the turnout

A popular method of analyzing electoral data, based on the Sobyanin-Sukhovolsky theorem (1995). The method can raise suspicions of ballot stuffing and multiple voting, as well as of legal forms of coerced or mobilized voting.

Related articles:
Statistical anomalies in 2011–2012 Russian elections revealed by 2D correlation analysis (D.Kobak, S.Shpilkin, M. Pshenichnikov, 2012)
Integer percentages as electoral falsification fingerprints (D.Kobak, S.Shpilkin, M. Pshenichnikov, 2016)
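One way to probe this dependence, sketched here under simplifying assumptions, is to regress a candidate's votes (as a share of registered voters) on turnout across polling stations. If a candidate's support did not depend on turnout, the slope would be close to that candidate's overall share of the vote; ballot stuffing for one candidate adds the same ballots to both axes and pushes that candidate's slope towards 1. The function name and the data below are hypothetical and do not come from the cited articles.

```python
from statistics import mean

def turnout_slope(registered, ballots_cast, candidate_votes):
    """Least-squares slope of (votes / registered) against (ballots cast / registered)."""
    x = [b / r for b, r in zip(ballots_cast, registered)]
    y = [v / r for v, r in zip(candidate_votes, registered)]
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

# Hypothetical data: four polling stations with 1000 registered voters each.
registered = [1000, 1000, 1000, 1000]
ballots_cast = [400, 500, 600, 700]

votes_steady = [200, 250, 300, 350]   # always half of the ballots cast
votes_stuffed = [200, 300, 400, 500]  # every extra ballot goes to one candidate

slope_steady = turnout_slope(registered, ballots_cast, votes_steady)
slope_stuffed = turnout_slope(registered, ballots_cast, votes_stuffed)
```

A steep slope alone is not proof of fraud: genuine regional differences in both turnout and preferences can produce it as well, so the result is a reason to look closer, not a verdict.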

Kiesling-Shpilkin method

Analysis of the ratio of votes cast for candidates / parties as a function of turnout. It was first used by John Brady Kiesling in 2004 to analyze elections in Armenia. The method has proven useful in Russia for identifying fraud in favour of the ruling party and its candidates.

Related articles:
Charting Electoral Fraud: Turnout Distribution Analysis as a Tool for Election Assessment (J.B.Kiesling, 2004)
Russian Elections Under Statistical Scrutiny (S.Shpilkin, 2016)
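The core data preparation for this kind of analysis can be sketched as follows: polling stations are grouped into turnout bins, and each candidate's votes are summed within every bin. The input layout, the function name and the numbers are assumptions made for illustration only.

```python
from collections import defaultdict

def votes_by_turnout(stations, bin_width=5):
    """Sum each candidate's votes over polling stations grouped into turnout bins.

    stations: iterable of (registered, ballots_cast, {candidate: votes}).
    Returns {bin_start_percent: {candidate: total_votes}}.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for registered, ballots_cast, votes in stations:
        turnout = 100 * ballots_cast / registered
        # Put 100% turnout into the last bin instead of opening a new one.
        bin_start = min(int(turnout // bin_width) * bin_width, 100 - bin_width)
        for candidate, n in votes.items():
            totals[bin_start][candidate] += n
    return {b: dict(c) for b, c in sorted(totals.items())}

# Hypothetical polling stations.
stations = [
    (1000, 520, {"A": 300, "B": 220}),
    (1000, 540, {"A": 310, "B": 230}),
    (1000, 960, {"A": 900, "B": 60}),
]
curves = votes_by_turnout(stations)
```

Plotting the per-candidate totals against the bin start gives one curve per candidate. A long high-turnout tail that appears for one candidate only is the kind of pattern this method is meant to expose.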

 

Churov's Saw

An increased number of polling stations where the ruling party or its candidates obtain round results (55%, 60%, 65%, etc.). Scatter-plots show this phenomenon as clusters; histograms show it as peaks at regular intervals.

Related articles:
Putin’s peaks: Russian election data revisited (D.Kobak, S.Shpilkin, M. Pshenichnikov, 2018)
Integer percentages as electoral falsification fingerprints (D.Kobak, S.Shpilkin, M. Pshenichnikov, 2016)
Russian Elections Under Statistical Scrutiny (S.Shpilkin, 2016)
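A rough first check for such peaks, given here as a sketch and not as the procedure used in the cited articles, is to measure what fraction of polling stations report a result within a narrow window around a multiple of 5%. For a smooth distribution that fraction should be close to 2 * tol / step (0.02 with the defaults below). The names and data are hypothetical.

```python
def near_round_fraction(stations, step=5.0, tol=0.05):
    """Fraction of stations whose result (in %) lies within tol of a multiple of step.

    stations: list of (votes_for_candidate, ballots_cast).
    """
    hits = 0
    for votes, ballots in stations:
        share = 100 * votes / ballots
        distance = abs(share - step * round(share / step))
        if distance <= tol:
            hits += 1
    return hits / len(stations)

# Hypothetical data: three of the four stations sit on 60%, 65% and 70%.
stations = [(650, 1000), (1213, 2021), (431, 987), (700, 1000)]
fraction = near_round_fraction(stations)
```

Small polling stations produce round percentages by simple arithmetic (for example, 3 votes out of 5), so a serious analysis needs a baseline, such as a simulation with the same station sizes, before an excess is called anomalous.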

 

The Last Digit

For sufficiently large values, the last digit of a numerical value should be uniformly distributed. The method was used by Bernd Beber in 2008 for the analysis of Nigerian elections, and later by Myatlev for the analysis of Russian elections.

Related articles:
What the Numbers Say: A Digit-Based Test for Election Fraud (B.Beber and A.Scacco, 2012)
Russian Elections Under Statistical Scrutiny (S.Shpilkin, 2016)
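A minimal version of this test is a chi-square statistic comparing the observed last-digit frequencies with the uniform expectation of 1/10 each. The sketch below uses only the standard library; the function name and sample data are illustrative.

```python
from collections import Counter

def last_digit_chi2(values):
    """Chi-square statistic of last-digit counts against a uniform distribution."""
    digits = Counter(abs(v) % 10 for v in values)
    expected = len(values) / 10
    return sum((digits.get(d, 0) - expected) ** 2 / expected for d in range(10))

# Hypothetical vote counts.
even_spread = list(range(100, 200))          # every last digit occurs 10 times
all_zeros = [n * 10 for n in range(1, 101)]  # every value ends in 0

chi2_even = last_digit_chi2(even_spread)
chi2_zeros = last_digit_chi2(all_zeros)
```

With 9 degrees of freedom, a statistic above about 16.92 rejects uniformity at the 5% level. The test is only meaningful for counts that are large enough; small counts have non-uniform last digits for innocent reasons.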

 

Improbable clusters

An increased number of polling stations whose results coincide to within hundredths of a percent. This is a sign that results were adjusted to match predetermined values. Scatter-plots show the phenomenon as clusters, and histograms reveal it as peaks. On Gabdulvaleev's diagrams, the anomalies appear as chains of dots.

Related articles:
Russian Elections Under Statistical Scrutiny (S.Shpilkin, 2016)
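One simple way to look for such coincidences, assuming per-station counts are available, is to round turnout and the leading result to two decimal places and list the combinations shared by several stations. The function name, the threshold and the data are hypothetical.

```python
from collections import Counter

def repeated_results(stations, decimals=2, min_size=3):
    """Find (turnout %, result %) pairs shared by at least min_size stations.

    stations: list of (registered, ballots_cast, votes_for_candidate).
    """
    keys = Counter(
        (round(100 * cast / reg, decimals), round(100 * votes / cast, decimals))
        for reg, cast, votes in stations
    )
    return {key: n for key, n in keys.items() if n >= min_size}

# Hypothetical data: three stations of different sizes with identical percentages.
stations = [
    (1000, 850, 680),
    (2000, 1700, 1360),
    (1500, 1275, 1020),
    (1200, 700, 300),
]
clusters = repeated_results(stations)
```

Whether a cluster of a given size is improbable depends on the number and size of the stations, so the threshold should be calibrated against the data set at hand.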

 

Reverse Engineering

Detection of fabrication in official results when data by polling station are not published. The artificial origin of the data is revealed by unnaturally short decimal fractions in the values. This happens when the data are falsified in reverse: first the authorities set the desired final result, then the underlying values are calculated from it. Examples are the anomalies found in Donetsk, Lugansk and Syria.
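The reasoning above can be illustrated with a small check: if a published vote count is exactly what one obtains by multiplying the total by a percentage with one decimal place and rounding, the count may have been derived from the percentage instead of the other way round. This is a sketch under that assumption; the function name and the figures are invented for illustration.

```python
def looks_back_computed(votes, total, decimals=1):
    """True if votes equals total * p / 100 (to within rounding) for a short percentage p."""
    share = 100 * votes / total
    target_share = round(share, decimals)     # nearest percentage with few decimals
    implied_votes = total * target_share / 100
    return abs(votes - implied_votes) <= 0.5  # allow for rounding to a whole vote

# Hypothetical totals: 767901 is 62.2% of 1234567, rounded to a whole vote.
suspicious = looks_back_computed(767901, 1234567)
ordinary = looks_back_computed(767432, 1234567)
```

For a total of about a million and one decimal place, a chance match has a probability of roughly 1000 / total, or about 0.1%. A single match is therefore weak evidence, while matches for several candidates in the same report are much harder to explain by coincidence.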

 

Impossible Arithmetic

Fabrication of the results is revealed by the physical impossibility of the published values. For example, the number of ballots issued at a polling station is much smaller than the number of ballots found in the stationary ballot boxes at that polling station.
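Checks of this kind are easy to automate. The sketch below tests a few inequalities that should hold in any protocol; the field names are hypothetical, and the exact set of control relations depends on the electoral law of the country in question.

```python
def arithmetic_violations(protocol):
    """Return a list of physically impossible relations found in one protocol."""
    problems = []
    if protocol["ballots_issued"] > protocol["registered"]:
        problems.append("more ballots issued than registered voters")
    if protocol["ballots_in_boxes"] > protocol["ballots_issued"]:
        problems.append("more ballots in boxes than issued")
    if sum(protocol["votes"].values()) > protocol["valid"]:
        problems.append("more votes for candidates than valid ballots")
    return problems

# Hypothetical protocol reproducing the example from the text.
protocol = {
    "registered": 1500,
    "ballots_issued": 800,
    "ballots_in_boxes": 1100,
    "valid": 1080,
    "votes": {"A": 900, "B": 180},
}
problems = arithmetic_violations(protocol)
```

The reverse case, fewer ballots in the boxes than were issued, is not a violation, since voters may take ballots away with them.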

 

Parallel Elections

Different elections held simultaneously are falsified on different scales. Discrepancies between the official data of such elections make it possible to detect manipulation.

 

Retrospective Analysis

Comparison of data between different elections, or between rounds of the same election, in a constituency / region. A fair election at one point in time makes it possible to confirm fraud in a recent earlier election. Examples include the Duma and presidential elections in the Moscow constituency (2011-2012) and the first and second rounds of the 2019 presidential election in Ukraine in the Donetsk constituency.

 

Regression Coefficient Dependency on Turnout

An improved version of the Sobyanin-Sukhovolsky method, proposed by A. Buzin.

 

Invalid Ballots Test

The study of the proportions between the number of invalid ballots and other results. The method makes it possible to quantify the level of fraud.

 

Official Turnout Dynamics

Falsifications are detected through anomalies in the dynamics of the official turnout.

 

Impact of Observers

The correlation between election results and the presence of observers. Such a correlation should not be detected in fair elections.

Related articles:
Field experiment estimate of electoral fraud in Russian parliamentary elections (R.Enikolopov, V.Korovkin, M.Petrova, K.Sonin, and A.Zakharov, 2013)
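The simplest form of this comparison is the difference between a party's aggregate result at stations without observers and at stations with observers. The sketch below assumes the two groups are otherwise comparable, for example because observers were assigned at random; the function names and numbers are hypothetical.

```python
def share(stations):
    """Aggregate result in percent over a list of (votes_for_party, ballots_cast)."""
    return 100 * sum(v for v, _ in stations) / sum(b for _, b in stations)

def observer_gap(observed, unobserved):
    """Result without observers minus result with observers, in percentage points."""
    return share(unobserved) - share(observed)

# Hypothetical data.
observed = [(300, 1000), (350, 1000)]
unobserved = [(450, 1000), (500, 1000)]
gap = observer_gap(observed, unobserved)
```

If observers choose where to go themselves, the gap may also reflect real differences between neighbourhoods, so it should be read together with the way the observed stations were selected.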

 

Impact of Electronic Voting Machines

The use of electronic devices at polling stations may correlate with lower results for the ruling party or its candidates, because conventional fraud methods are difficult or impossible to employ there. Such correlations should not be observed if the elections are fair.

Related articles:
Statistical anomalies in 2011–2012 Russian elections revealed by 2D correlation analysis (D.Kobak, S.Shpilkin, M. Pshenichnikov, 2012)

 

Change of Correlation Trend

Method proposed by B. Ovchinnikov

 

Geographical Anomalies

The study of geographic anomalies in the results is a separate and comprehensive topic. Anomalies can be either sharp differences in the results across administrative boundaries, or sharp differences in results within administrative units and / or constituencies. The method partially overlaps with the method based on the unimodality of random variables.

Related articles:
Russian Elections Under Statistical Scrutiny (S.Shpilkin, 2016)

 

Impact of Video Observation

Correlations between the voting results and the presence of video broadcast equipment. Such correlations should not be observed in fair elections. The dependency arises from the falsifiers' fear of being detected by video observers, either during the broadcast or in the recordings. The effect is similar to the impact of observers present at the polling station.

Related articles:
With Cameras and Without. How Video Monitoring Influences Voter Turnout (D.Kankiya, 2019)

 

Visualization tools used in these methods include:

The Last Digit

In a random set of integer values, each last digit should occur with equal probability, 1/10. A deviation from this may have a legitimate cause or may be a sign of artificial interference.

Method Description

Last Digit Analyzer in the Lab

Scatter-plot

A scatter-plot visualizes the correlation between two chosen values. Each pair of values, most often taken from the voting results by PEC (precinct election commission), corresponds to a point whose coordinates are equal to the values. A smooth cloud of points is expected, without clusters, lines, anomalies or artefacts.

Method Description

Scatter-Plotter in the Lab

Histogram

The values under study are aggregated into intervals (bins). The height of each bar is proportional to the aggregated sum for its bin. Most distributions are expected to be unimodal.

Method Description

Histogram Generator in the Lab