Last week, a Wichita State University statistician filed a lawsuit regarding the Kansas 2014 election. She is trying to get access to voting machine tallies to rule out the possibility of voter fraud. Here's a link to the article in the Wichita Eagle.
The statistician, who works as a QA engineer, has found some "voting anomalies": essentially, that Republicans receive larger-than-expected vote shares in larger precincts. Keep in mind that QA (Quality Assurance) engineers are trained to look for anomalies, things you wouldn't expect in data, and to make a big deal out of them so that systems don't fail.
Of note from the article:
“This is not just an anomaly that occurred in one place,” Clarkson said. “It is a pattern that has occurred repeatedly in elections across the United States.”
AND
The pattern could be voter fraud or a demographic trend that has not been picked up by extensive polling, she said.
Speaking as a research statistician, on its face this doesn't seem like that big of a deal. Just a researcher looking into anomalies. I did think that putting "voter fraud" out there as a possibility seemed a little aggressive at this point, but I didn't see this as an issue that would get a lot of attention. But consider the political climate of Kansas:
- I've posted on this before, but the political climate in Kansas right now is really tense, largely due to a highly contested election.
- Same post from before, but progressives are especially upset because they just lost an election, which their leaders told them would be a fairly easy win.
- I've seen the article above posted by many progressive friends as fodder, evidence, and proof that, statistically speaking, Brownback was probably re-elected due to election fraud.
What we know
First, going by Clarkson's own comments, she has no evidence of voter fraud. She has found a small statistical anomaly that exists nationwide, and she wants to use Kansas to verify that it isn't due to fraud or voting machine issues.
But what is that anomaly?
The anomaly is that above a certain size threshold (500), there is a positive correlation between precinct size and the percentage of Republican votes.
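To make that concrete, here's a minimal sketch of the kind of check being described: drop the precincts at or below the threshold, then regress Republican vote share on precinct size. This is my own illustration, not Clarkson's code. I'm assuming "size" means total votes cast in the precinct and that the cutoff is "more than 500"; the function name `size_share_regression` and the toy numbers are made up for the example and are not real Kansas returns.

```python
import math


def size_share_regression(precincts, min_votes=500):
    """Regress Republican vote share on precinct size.

    precincts: iterable of (total_votes, republican_votes) pairs.
    Only precincts with more than min_votes total votes are kept
    (an assumed reading of the "500" threshold).
    Returns (slope, intercept, r, n).
    """
    kept = [(total, rep / total) for total, rep in precincts if total > min_votes]
    n = len(kept)
    if n < 3:
        raise ValueError("need at least 3 precincts above the threshold")

    mean_x = sum(x for x, _ in kept) / n
    mean_y = sum(y for _, y in kept) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in kept)
    syy = sum((y - mean_y) ** 2 for _, y in kept)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in kept)
    if sxx == 0 or syy == 0:
        raise ValueError("no variation in size or share")

    slope = sxy / sxx                   # change in share per extra vote cast
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)      # Pearson correlation
    return slope, intercept, r, n


# Made-up toy data: (total votes, Republican votes) per precinct.
toy_precincts = [(120, 95), (600, 300), (800, 440), (1000, 600)]
slope, intercept, r, n = size_share_regression(toy_precincts)
print(f"n={n} slope={slope:.6f} intercept={intercept:.3f} r={r:.3f}")
```

The sketch stops at the slope and correlation; a significance test would need a t-distribution, which a stats package handles better than hand-rolled code.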
Why is that an issue?
Clarkson is a QA engineer and in anomaly detection mode. She's starting from an a priori premise that precinct size should not determine results, and thus, statistically significant correlations should not exist.
Is Clarkson's analysis of the data correct?
Though I disagree with her conclusions, I tend to think the mechanics of her analysis are correct. In fact, I was able to replicate her result using the 2010 Kansas gubernatorial election. The relationship is weak, statistically speaking, but statistically significant, which indicates that something non-random is happening in the data. Regression stats and visual plot below.
And here's what a sample of Clarkson's work with an Ohio example looks like:
So the correlations exist, which means people must be acting nefariously in those large precincts, right?
Here we go. Absolutely not. And here's why: covariates. In the real world, multiple variables often correlate with one another, causing us to find relationships that are really measuring something else. Clarkson's comments allude to this when she talks about potentially underlying and undetermined demographic factors.
There are many what-ifs here. What if other variables also correlate with precinct size? Age, race, wealth, urbanity, etc. In these cases we are latently measuring other factors by measuring precinct size. Another specific issue: what if more conservative populaces somehow push for fewer, larger precincts and less division? Keep in mind, this is a WEAK relationship we are trying to explain.
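Here's a small simulation of that covariate argument. It's a sketch under invented assumptions, not a model of Kansas: I make up a single hidden demographic factor `z` that nudges both precinct size and Republican share upward, with no direct link between size and share at all. All the coefficients, the function names, and the seed are hypothetical. The raw correlation between size and share comes out clearly positive, while the partial correlation (size vs. share, controlling for `z`) should land near zero.

```python
import math
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(xs, ys, zs):
    """Correlation of xs and ys after removing the linear effect of zs."""
    r_xy = pearson(xs, ys)
    r_xz = pearson(xs, zs)
    r_yz = pearson(ys, zs)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


def simulate(n=2000, seed=0):
    """Generate synthetic precincts driven by one hidden factor z.

    Size and share each depend on z plus independent noise; neither
    depends on the other. All coefficients are invented.
    """
    rng = random.Random(seed)
    sizes, shares, zs = [], [], []
    for _ in range(n):
        z = rng.random()                                    # hidden demographic factor in [0, 1)
        sizes.append(1200 + 1500 * z + rng.gauss(0, 200))   # bigger precincts where z is high
        shares.append(0.45 + 0.15 * z + rng.gauss(0, 0.08)) # more Republican where z is high
        zs.append(z)
    return sizes, shares, zs


sizes, shares, zs = simulate()
raw = pearson(sizes, shares)
adjusted = partial_corr(sizes, shares, zs)
print(f"raw correlation: {raw:.3f}")
print(f"controlling for z: {adjusted:.3f}")
```

The point isn't that this is what happened in Kansas. It's that a size-to-share correlation is exactly what you'd see with zero fraud if some unmeasured factor drives both.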
The authors of this analysis already drop the smallest precincts as a whole, because they tend to be more rural, and thus more Republican. In essence, the authors tacitly admit underlying demographic factors can impact the correlation between precinct size and voting behavior. We haven't gone through the steps to exclude all other demographic factors, so why are we making vague accusations of fraud and making a big deal of this in the press?
This is partly speculation, but it is much more likely that the correlation found is due to underlying demographics and other covariates than to something nefarious going on with voting machines.
This is an interesting area of research, and I will likely post on this again when I get access to more data. I think it is quite likely that this is due to underlying demographic drivers.
I absolutely think Clarkson should have access to the voting machine records, as well as the software itself for testing. But the way this has been handled by Clarkson and the press at this point is pretty reckless. In the political environment that is Kansas, where many believe the election was stolen or rigged, this "evidence" is being treated by many people as more evidence of fraud by the administration, when we have no real evidence for that.