Note 1: This post may get updated as additional information on Ankara’s election comes in.
Note 2: This post has now been updated with data from Istanbul – see here.
Note 3: Added two graphs showing party-specific relationships between vote shares and invalid ballot shares. Hat tip for these kinds of graphs goes to Twitter user @merenbey.
Note 4: Added heterogeneous results showing CHP being penalized by higher invalid shares of ballots much more in above-median pro-CHP districts than in below-median pro-CHP districts.
Having seen tweets about numerous alleged voting irregularities in Turkey, and thanks to Twitter user @erenyanik, I came across this CHP/STS dataset of voting data for the Greater Municipality of Ankara, one of the tightly contested mayoral elections (decided by less than a percentage point of the vote share), between Melih Gökçek and Mansur Yavaş. The dataset includes 12,230 ballot boxes across 1,682 voting locations in 25 districts in Ankara. I didn’t collect the data myself and therefore this analysis should be taken as highly preliminary.
It all started when @erenyanik posted this picture plotting ballot-box-level AKP and CHP vote shares against the turnout rate for the Ankara mayoral race. Several of the ballot boxes showed turnout rates above 100 percent, which is strange, and these boxes also tended to systematically favor the AKP. I decided to create a couple of graphs myself, and per request am now typing this very basic analysis up. Instead of showing both AKP and CHP vote shares like @erenyanik does, I show the difference between them, the AKP-CHP win margin. The graph below also superimposes a vanilla local polynomial fit (lpolyci for Stata users) with accompanying 95 percent confidence intervals:
(Note: A similar graph using the AKP vote share instead of the AKP-CHP vote margin can be found here.)
Each point represents a ballot box. The graph shows a somewhat negative correlation between turnout and the AKP-CHP win margin until turnout exceeds 100 percent. In places where turnout is recorded as above 100 percent, the votes are clearly in the AKP’s favor. One reason for the negative relationship between turnout and relative voting for the AKP could be that areas that vote more for the AKP than the CHP tend to be those with lower turnout rates for various reasons.
This looks interesting but does not necessarily constitute an irregularity. What is a bit strange though is that the ballot boxes with more than 100 percent turnout lean towards the AKP. This could be a sign of blatant ballot stuffing of AKP votes and is therefore noteworthy, but the number of such ballot boxes is rather small and unlikely to have any meaningful effect on the end result. (It is, for example, too small to account for the vote margin, roughly 30,000 votes at the time of writing.) As such, this is either a mistake in the data, or a real irregularity with limited implications. Moreover, several of these boxes, but not all, tend to hold small numbers of ballots and could thus climb above 100 percent as, e.g., non-registered election officials cast their votes. A reason why these tend to have systematically higher vote shares for the AKP may be that they are also in less-populated places, villages, perhaps also more socially conservative and poorer – characteristics making them more likely to vote for the AKP.
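To make the quantities on the two axes concrete, here is a minimal Python sketch of how turnout and the AKP-CHP win margin could be computed per ballot box, and how boxes above 100 percent turnout could be flagged. The records and field names are hypothetical, and the definitions (turnout as ballots cast over registered voters, vote shares over valid ballots) are my assumptions rather than something taken from the dataset's documentation.

```python
# Hypothetical ballot-box records. The field names and numbers are made up
# for illustration; they are not the columns of the CHP/STS dataset.
boxes = [
    {"registered": 300, "akp": 120, "chp": 130, "other": 20, "invalid": 10},
    {"registered": 250, "akp": 90, "chp": 110, "other": 15, "invalid": 5},
    {"registered": 40, "akp": 30, "chp": 8, "other": 2, "invalid": 2},
]


def box_stats(box):
    """Return (turnout, AKP-CHP margin), both in percent, for one ballot box."""
    valid = box["akp"] + box["chp"] + box["other"]
    cast = valid + box["invalid"]
    # Assumption: turnout = all ballots cast / registered voters.
    turnout = 100.0 * cast / box["registered"]
    # Assumption: vote shares are taken over valid ballots.
    margin = 100.0 * (box["akp"] - box["chp"]) / valid
    return turnout, margin


stats = [box_stats(b) for b in boxes]
# Indices of boxes whose recorded turnout exceeds 100 percent.
over_100 = [i for i, (turnout, _) in enumerate(stats) if turnout > 100.0]

for i, (turnout, margin) in enumerate(stats):
    flag = "  <- turnout above 100%" if i in over_100 else ""
    print(f"box {i}: turnout {turnout:.1f}%, margin {margin:+.1f} pp{flag}")
```

In this toy example the small third box is the only one flagged, which mirrors the point that a handful of extra ballots can push a small box above 100 percent.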
Instead, I plotted the AKP-CHP vote margin against the share of all ballots declared invalid. In contrast to a focus on the small number of abnormally high-turnout ballot boxes, nearly all ballot boxes will have some invalid votes, and so any manipulation here would have much larger implications. The intuition is that instead of clumsily adding a bunch of ballots for the AKP, ballots for the CHP could be selectively declared invalid, and a systematic pattern of this kind could signal manipulation. The graph looks like this: And for Istanbul: (Note: A similar graph using the AKP vote share instead of the AKP-CHP vote margin can be found here.)
The interesting part of this graph is the positive relationship between the AKP-CHP vote share difference and the invalid share of ballots. Places where a higher fraction of ballots are declared invalid have systematically higher votes for AKP relative to those for the CHP.
A naive conclusion would be that this is definite evidence of voting irregularities, i.e. that AKP (relative to CHP) vote shares are higher because CHP votes get thrown out as invalid. It’s naive because there is a range of factors omitted from this analysis, and thus the above relationship is likely subject to an endogeneity problem. For example, constituencies voting to a higher degree for the AKP instead of the CHP could also be poorer, less educated, and therefore more likely to make mistakes (resulting in more invalid ballots). An unconditional correlation between the AKP-CHP vote margin and the invalid share of ballots may therefore suffer from an upward bias.
In the absence of relevant data on education, income, etc., here’s a very basic way to account for this: add fixed effects. Since the dataset includes variables for A) districts (ilce) and B) locations (or voting stations, alan), I can regress the AKP-CHP vote margin on the invalid share of ballots while controlling for fixed effects related to either A or B. This accounts for factors that vary across, but not within, districts (or locations). Below are results from regressing the AKP-CHP vote margin (akpchpdiff) on the invalid share of ballots (invshr), using the following three specifications: column 1 reports results from a simple unconditional regression of akpchpdiff on invshr, whereas columns 2 and 3 add district (ilce) and voting station (alan) fixed effects respectively. In all the specifications, the standard errors are clustered at the corresponding fixed effects level. I show the results for both Ankara and Istanbul.
The coefficient 5.9 (for Ankara) means that a 1 percentage point increase in the share of invalid ballots is associated with a 5.9 percentage point increase in the AKP-CHP vote margin. As stated above, there is little reason to put much trust in this coefficient, because there could be other factors driving this correlation. Controlling for ilce-level fixed effects in column 2 results in estimates that are also quite large. The fact that they are smaller is consistent with the possibility that the estimates in column 1 are biased upwards due to omitted variables. The estimate now measures how the AKP-CHP win margin changes with the invalid share of ballots when all factors that are constant within districts are controlled for. In other words, it compares variation within districts. If there is a lot of variation within districts with regard to omitted factors like education and income, this estimate could still be biased.
Adding alan-specific fixed effects in column 3 is a rather demanding specification. It controls for all factors that vary across locations (i.e. voting stations), leaving only the remaining variation across ballot boxes within locations. It is much more difficult to argue that there would be systematic differences in omitted factors across ballot boxes within a voting station than in the previous specifications. It is thus noteworthy that the estimate, albeit much smaller, is still statistically significant. The fact that the estimates change across specifications may be an indication of omitted variables confounding the analysis in the unconditional specification.
Although these fixed effects are no solution to the endogeneity problem, it is quite striking that the results are this robust. This means that even within a particular voting station where votes are cast – like a school etc – ballot boxes with more invalid votes tend to have an AKP bias.
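For readers who want to see what the fixed effects are doing mechanically, here is a small Python sketch on made-up numbers. It is not the regression code behind the tables; it only illustrates the standard result that adding a dummy for each voting station gives the same slope as subtracting station means from both variables before running a simple regression. The variable names invshr and akpchpdiff follow the text, while the station labels and values are hypothetical, and clustered standard errors are not computed here.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean from its members (the 'within' transform)."""
    total, count = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        total[g] += v
        count[g] += 1
    return [v - total[g] / count[g] for v, g in zip(values, groups)]


# Hypothetical data: two voting stations with three ballot boxes each.
# Station B has both more invalid ballots and a higher AKP-CHP margin overall,
# which inflates the pooled slope.
alan = ["A", "A", "A", "B", "B", "B"]
invshr = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
akpchpdiff = [0.0, 2.0, 4.0, 20.0, 22.0, 24.0]

pooled = ols_slope(invshr, akpchpdiff)
within = ols_slope(demean_by_group(invshr, alan),
                   demean_by_group(akpchpdiff, alan))

print(f"pooled slope: {pooled:.2f}")
print(f"within-station slope: {within:.2f}")
```

In this toy example the pooled slope is much larger than the within-station slope because the between-station difference is loaded onto the invalid share, which is the kind of upward bias the fixed effects are meant to remove.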
An equivalent way to show these results is to run the regressions using individual party vote shares instead of the AKP-CHP win margin. Below, in graphs for Ankara and Istanbul, I run separate regressions for each party, regressing its vote share on the invalid share of ballots with voting station fixed effects.
and for Istanbul:
Both these graphs rather strikingly show how a larger share of invalid ballots, even within a specific voting station, appears to penalize the CHP to the benefit of the AKP.
This raises the question of whether we might observe differential results in areas where the CHP receives a lot of votes versus areas where it doesn’t. If we see that the CHP loses more votes as a result of ballots declared invalid in more pro-CHP districts than in more pro-AKP districts, this could be a further sign that something is fishy.
What I do below is calculate the district-level CHP vote share. I then split the sample at the median of this vote share, creating one below-median sample (i.e. ballot boxes in districts supporting the AKP more) and one above-median sample (i.e. ballot boxes in districts supporting the CHP more). I then, for each of these samples, regress the AKP-CHP win margin on the invalid share of ballots.
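Why the party-by-party regressions are equivalent to the margin regression can be checked with a short Python sketch on hypothetical numbers. Because the regression is linear, the coefficient for the AKP-CHP margin equals the AKP coefficient minus the CHP coefficient, and if the party shares sum to 100 in every box (an assumption that holds when shares are taken over valid ballots and all parties are included), the party coefficients sum to zero. The party list, station labels and values below are made up.

```python
from collections import defaultdict


def within_slope(x, y, groups):
    """Slope of y on x after subtracting group means (station fixed effects)."""
    def demean(values):
        total, count = defaultdict(float), defaultdict(int)
        for v, g in zip(values, groups):
            total[g] += v
            count[g] += 1
        return [v - total[g] / count[g] for v, g in zip(values, groups)]

    xd, yd = demean(x), demean(y)
    return sum(a * b for a, b in zip(xd, yd)) / sum(a * a for a in xd)


# Hypothetical ballot boxes in two voting stations; shares sum to 100 per box.
station = ["A", "A", "A", "B", "B", "B"]
invshr = [1.0, 2.0, 3.0, 2.0, 3.0, 5.0]
shares = {
    "akp": [40.0, 43.0, 44.0, 50.0, 51.0, 55.0],
    "chp": [45.0, 43.0, 41.0, 35.0, 33.0, 30.0],
    "other": [15.0, 14.0, 15.0, 15.0, 16.0, 15.0],
}

# One regression per party: vote share on invalid share, within stations.
coefs = {p: within_slope(invshr, s, station) for p, s in shares.items()}

# The margin regression, for comparison.
margin = [a - c for a, c in zip(shares["akp"], shares["chp"])]
margin_coef = within_slope(invshr, margin, station)

for party, coef in coefs.items():
    print(f"{party}: {coef:+.3f}")
print(f"margin: {margin_coef:+.3f}")
```

The practical upshot is that the per-party graphs add information about which parties gain and lose, while the AKP and CHP bars together reproduce the margin result.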
and for Istanbul:
What is noteworthy here, and especially for Ankara, is how the coefficient on the invalid share of ballots is much higher in pro-CHP districts (column 2) than in pro-AKP districts (column 1). It is mostly in the former districts that a higher share of invalid ballots seems to particularly penalize the CHP’s vote share to the benefit of the AKP.
Suppose for a moment that what I have documented here is actually voting manipulation, that ballots for the CHP get systematically thrown out. Then, in order to actually affect the election outcome, one would particularly want to do this in places where the CHP receives a lot of votes. The heterogeneous result above would thus be consistent with this.
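The sample split itself can be sketched in a few lines of Python. The rows below are hypothetical and the resulting slopes mean nothing; the point is only the procedure. Two details are my assumptions, since the text does not pin them down: the district-level CHP share is computed by pooling votes within each district, and the median is taken across districts rather than across ballot boxes.

```python
from collections import defaultdict
from statistics import median

# Hypothetical rows: (district, chp_votes, valid_votes, invshr, akpchpdiff).
rows = [
    ("d1", 60, 200, 1.0, 30.0), ("d1", 70, 200, 2.0, 31.0),
    ("d2", 80, 200, 1.0, 20.0), ("d2", 90, 200, 3.0, 22.0),
    ("d3", 110, 200, 1.0, -10.0), ("d3", 120, 200, 2.0, -5.0),
    ("d4", 130, 200, 2.0, -20.0), ("d4", 140, 200, 4.0, -10.0),
]


def ols_slope(x, y):
    """Slope from a simple regression of y on x (with an intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)


# District-level CHP share (assumption: pooled votes within each district).
chp, valid = defaultdict(int), defaultdict(int)
for district, chp_votes, valid_votes, _, _ in rows:
    chp[district] += chp_votes
    valid[district] += valid_votes
district_share = {d: 100.0 * chp[d] / valid[d] for d in chp}

# Split at the median across districts (assumption, see the lead-in).
cutoff = median(district_share.values())
below = [r for r in rows if district_share[r[0]] <= cutoff]  # more pro-AKP
above = [r for r in rows if district_share[r[0]] > cutoff]   # more pro-CHP


def slope_for(sample):
    """Regress akpchpdiff on invshr within one subsample."""
    return ols_slope([r[3] for r in sample], [r[4] for r in sample])


print(f"below-median districts: slope {slope_for(below):+.3f}")
print(f"above-median districts: slope {slope_for(above):+.3f}")
```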
Indeed, if this is a sign of voting manipulation, it is much more subtle than messing around with specific ballot boxes and would allow manipulation that may very well be difficult to spot in specific boxes. If just a few ballots are declared wrongfully invalid in a lot of boxes, this could add up to a lot of votes and have a real impact on the election outcome.
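A quick back-of-the-envelope calculation in Python shows how few ballots per box this would take, using the two figures quoted earlier in this post: 12,230 ballot boxes in the Ankara dataset and a vote margin of roughly 30,000. It rests on simplifying assumptions of mine: wrongly invalidated ballots are spread evenly across boxes, and each one removes a CHP vote without adding an AKP vote, so it closes the margin one-for-one.

```python
import math

n_boxes = 12_230       # ballot boxes in the Ankara dataset (from this post)
margin_votes = 30_000  # approximate vote margin at the time of writing

# Average number of invalidated ballots per box that would equal the margin,
# assuming an even spread and a one-for-one effect on the margin.
per_box = margin_votes / n_boxes

# Rounding up to whole ballots per box and scaling back to all boxes.
whole_ballots = math.ceil(per_box)
total_if_rounded = whole_ballots * n_boxes

print(f"ballots per box needed on average: {per_box:.2f}")
print(f"{whole_ballots} ballots in every box would add up to {total_if_rounded} votes")
```

In other words, under these assumptions, two to three ballots per box would be on the order of the entire margin, which is why a pattern that is invisible in any single box could still matter in aggregate.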
However, at the moment this remains a statistical anomaly, which could potentially have another explanation not involving voting manipulation. As such this is not definite proof of extensive voting irregularities, but I think it is enough to warrant a more detailed investigation of Turkey’s local elections, at least in Ankara.