Tuesday, November 24, 2015

The Different Ways We Talk About Candidates

A couple of months ago I found a website with extremely rich data, an event that usually makes me very happy.  This website didn't have that effect on me.  I was trying to figure out the weight of a specific baseball player and stumbled upon a database of detailed celebrity body measurements (all women, of course), found here.  Later I found that the data included political candidates, which raised a question in my mind about the different ways we talk about men and women in politics.

Simultaneously, I was looking for a way to measure the presence of certain ideas across the internet.  I can already measure sentiments and topics on Twitter, but Twitter is only a portion of the internet, and most people turn to Google Search when seeking out new information.  Could I write code that would start my text mining operations from Google Search?

THE TEST

(NON-Nerds Skip this)

I had a social idea (how we talk about candidates based on gender) and a coding/statistical concept to test (mining Google Search results).  I went forward with a formalized test plan:
  • I would use the Google Search API to pull results for "Candidate's Name" + Body Measurements.
  • I would capture the data and turn it into mine-able text.
  • I would compare the top words returned for each candidate.  (Note: rate limits on the Google API, as well as some Google restrictions, slow me down; in the future I may apply more sophisticated text mining techniques.)
I wrote some code to pull the Google Search results.  The Google API only allows us to pull four results at a time, so I wrote a loop to grab them in batches of four.  Here's what that looks like (building step by step for ease of understanding):
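Roughly, the loop looks like the sketch below (the code in the original post was a screenshot).  This is a minimal version, not the exact code: it assumes the old Google AJAX search endpoint that returns four results per request, and the response field names are assumptions you would adjust to whatever your API actually returns.

library(httr)
library(jsonlite)

# Pull Google results four at a time and keep the text snippets.
# Endpoint and response structure are assumptions; swap in your own API setup.
pull_google_results <- function(query, n_results = 44) {
  starts   <- seq(0, n_results - 1, by = 4)   # four results per request
  snippets <- character(0)
  for (s in starts) {
    resp   <- GET("https://ajax.googleapis.com/ajax/services/search/web",
                  query = list(v = "1.0", q = query, start = s))
    parsed <- fromJSON(content(resp, as = "text"), flatten = TRUE)
    hits   <- parsed$responseData$results     # assumed shape of the parsed JSON
    snippets <- c(snippets, paste(hits$titleNoFormatting, hits$content))
    Sys.sleep(1)                              # stay under the rate limit
  }
  snippets
}

# carson <- pull_google_results('"Ben Carson" body measurements')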



DATA RESULTS

So what are the results of googling Candidate Names + Body Measurements?  I googled four candidates, two men, two women.  My observations:
  • Men: The men's results were generally about the campaign, with each returning a few references to BMI (Body Mass Index).
  • Women: The women's results were heavily focused on the size of their bodies.  In fact, the top four words for each woman were the same: size, weight, height, and bra.  


This table shows the top 10 words returned for each candidate.  This is obviously a small sample (four candidates, only the top 44 Google results for each), but it's interesting nonetheless.  


And because I know everyone likes wordclouds (sigh), I created wordclouds for each candidate at the bottom of this post, below the conclusion.

CONCLUSION

Some final takeaways from this analysis:
  • It's definitely possible to text mine Google results in order to measure the prevalence of an idea on the internet.  I still need to refine my methodology and implement more sophisticated techniques, but the basic scraping method works.  
  • There exists relatively little information on the internet regarding the body measurements of male candidates.  And I really wanted to know Ben Carson's waist to hip ratio!
  • Female candidates are talked about online far more in terms of their bodies.  I'm not an expert in feminist discourse analysis, or even really qualified to give an opinion here, but I have certainly measured a difference in the way candidates are talked about online.




BEN CARSON

HILLARY CLINTON


CARLY FIORINA
BERNIE SANDERS


Friday, November 20, 2015

Corrected Polling Numbers

A few weeks ago I posted a fairly hefty critique of a survey conducted by Fort Hays State University researchers on the political climate in Kansas.  The survey claimed a lot of things, but the issue receiving the most press was that Kansas Governor Brownback had an 18% approval rate.  I took issue with that number for various reasons, largely due to demographic skews in the data hinting at sampling or response bias.

ACTUAL APPROVAL RATE?

Sometime later a Twitter user asked me: if not 18%, what did I think Brownback's approval rating really was?  I looked again at the skews, did some quick math adjusting for prior demographic distributions and likely errors, and came up with a range.  This was really just me trying to back into a number from bad polling data.  Here's my response on Twitter:

WARNING: "I TOLD YOU SO" coming. 

This week another survey was published that reviews the approval rates of all governors in the US.  You can find that study here.  I haven't fully vetted the methodology, but it indicates the authors at least tried to deal with demographic issues.

What did that study tell us?
Brownback's approval rate is 26%.  LOOK THAT'S IN MY RANGE!
But that dataset also provides information on other governors' approval ratings; what can those tell us?

COMPARISON TO OTHER GOVERNORS

While I was correct that Brownback's likely approval rate is above 18%, his approval rate is still dismal compared to other governors.  In fact, Brownback is 9 percentage points below any other governor, a huge outlier.  I could bore you with p-values and z-scores (-2.8) and other statistical nerdery, but two charts can easily describe how bad his approval rate is. (Brownback in red)
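For the curious, that z-score is just Brownback's approval measured in standard deviations from the mean of all governors.  A quick sketch, assuming a data frame gov holding each governor's approval number from the linked survey (values not reproduced here):

# gov is assumed: one row per governor, with 'name' and 'approval' columns
z_brownback <- (gov$approval[gov$name == "Brownback"] - mean(gov$approval)) /
  sd(gov$approval)
z_brownback   # roughly -2.8 with the published numbers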




CONCLUSION

Takeaway bullets:
  • Brownback's approval rate is likely above 18%, closer to 26% (read: I was right).
  • Brownback has the lowest approval rate among US governors.
  • Brownback's approval rating is an extreme low outlier.  

Wednesday, November 18, 2015

Tragedy Hipsters and the Connection of Mizzou to Paris

Over the weekend I was confronted with a new term that needed no explanation: Tragedy Hipster.  The term refers to people who respond to tragedies by mentioning other tragedies, and this behavior was prevalent over the weekend.

Specifically, there was a string of tweets from Mizzou protesters complaining that the world was more upset about the Paris attacks (120 deaths) than about the issues surrounding the Mizzou protests earlier in the week (racism).  Other "hipsters" were mentioning the Beirut bombings, which received much less press coverage than Paris.  Later in the weekend, I noted another string of tweets referencing both Paris and Mizzou that tended to be from conservatives complaining about the earlier tragedy hipsters.  All fascinating to me.

So a few data questions emerged: can we identify tragedy hipster behavior in data?  Can we differentiate the hipsters from their critics?  How much of the linkage between Paris and Mizzou came from the original hipsters versus their critics?  I looked at Twitter data to examine the behavior of tragedy hipsters.

A few takeaways from the data:
  • There was a high amount of tweet traffic following the initial event comparing Mizzou to Paris.
  • Much of that first wave could be considered "Tragedy Hipsters," while others were just asking for prayers for both.
  • Following that initial wave, the tweets were mainly from conservatives criticizing the initial wave.

TRAGEDY HIPSTERS

First a definition: 
Tragedy Hipster: a person who reacts to the initial news of a tragedy by trying to cite a cooler tragedy.  (Side note: "cooler" might mean lesser known, more deaths, better cause, etc.)
This usage directly parallels the way music hipsters talk about music: referencing a "cooler" (read: more obscure, harder core, weirder) band whenever someone brings up music.  The music version usually goes something like this:
Person 1: Hey have you heard the new Red Fang album?
Hipster: Red Fang sucks, they're just a rip off of the band Sleep.
Person 1: Whatever, hipster.
If that's what music hipster behavior looks like, what does a tragedy hipster look like?  Topic mining the weekend's tweets containing the words "Paris" and "Mizzou" found a specific topic (topic 3, see below) made up of both direct and indirect Tragedy Hipster behavior.  Here are some examples:

Direct comparisons:

Multi-sympathetic: 

TWITTER DATA

Let's mine some data, shall we? Here's what I did (a rough sketch of the download-and-cleanup code follows the list):
  • Downloaded tweets from the beginning of the attack until Tuesday at noon that contained both "Paris" and "Mizzou"
  • Topic mined the data to find true topics underlying tweets, looking for tragedy hipsters versus their backlash.
  • Analyzed topics used and how they changed over time.
  • Sentiment mined the data for emotions.
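The download and cleanup steps look roughly like the sketch below.  It's not the exact code: the twitteR and tm packages are assumed, the OAuth keys are placeholders, and the tweet cap and search string are illustrative.

library(twitteR)
library(tm)

# Placeholder credentials; substitute real keys and tokens
setup_twitter_oauth("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_SECRET")

# Tweets containing both "Paris" and "Mizzou", from the night of the attacks onward
raw    <- searchTwitter("Paris Mizzou", n = 5000,
                        since = "2015-11-13", until = "2015-11-18")
tweets <- twListToDF(raw)

# Standard tm cleanup before building a document-term matrix
corpus <- VCorpus(VectorSource(tweets$text))
corpus <- tm_map(corpus, content_transformer(tolower))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeWords, c(stopwords("english"), "paris", "mizzou"))
corpus <- tm_map(corpus, stripWhitespace)
dtm    <- DocumentTermMatrix(corpus)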
First a word cloud, just to identify the top words.  Unsurprisingly, "attack" is the top word overall, but it's closely followed by "spotlight," "stole," and "unbelievable."



A topic model might help explain why those odd terms show up; it converged on five topics:
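Fitting the model itself is only a few lines with the topicmodels package; a sketch, assuming the document-term matrix (dtm) from the step above:

library(topicmodels)

dtm_use <- dtm[rowSums(as.matrix(dtm)) > 0, ]     # LDA can't handle empty documents
lda5    <- LDA(dtm_use, k = 5, control = list(seed = 1234))

terms(lda5, 10)              # top ten terms per topic
tweet_topic <- topics(lda5)  # most likely topic for each tweet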



On analyzing the tweets, topics 1, 2, 4, and 5 were mainly made up of conservative criticisms of "Tragedy Hipster" behavior, and topic 3 was... well, tragedy hipsters.  By topic:

  • Topic 1: TCOT (Top Conservatives on Twitter)/foxnews references criticizing Mizzou protestors.
  • Topic 2: People talking about Mizzou activists having their spotlight stolen.
  • Topic 3: Tragedy Hipster behavior, a good amount of it showing legitimate sympathy/empathy.
  • Topic 4: Another topic of conservatives making fun of Mizzou protesters.
  • Topic 5: Another topic of students being mad about losing their media spotlight.

Because the "Tragedy Hipster" topic segregated so well from other topics, we can plot topic usage over time to show how the situation evolved. The chart below shows time in three hour blocks (UTC) and the proportion in each topic.




Note that topic three dominated the conversation for the first few hours after the attacks, but was supplanted by the other four topics over time.  Specifically, the initial ratio was 4:1 Tragedy Hipster to Conservative, but after the first day, the ratio reversed to between 1:4 and 1:9.  Overall, the total reaction has been 30% tragedy hipster, 70% conservative backlash.  In essence, the initial hipster reaction regarding Mizzou and Paris led to a few days of criticism from conservatives.

I embedded tweets for topic 3 above, so it's only fair I embed a couple of tweets associated with the other four topics:




I did one last thing to the data: sentiment mining.  Nothing much of note here, except that the emotion expressed was significantly angrier than in other sets of tweets I've looked at, including Kansas Legislature tweets and the Royals.
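For completeness, the emotion scoring step is short; a sketch using the syuzhet package's NRC lexicon (one common choice, not necessarily the exact scorer used here):

library(syuzhet)

emotions <- get_nrc_sentiment(tweets$text)
colMeans(emotions)          # average score per emotion (anger, fear, joy, ...)
mean(emotions$anger > 0)    # share of tweets registering any anger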


CONCLUSION

The concept of a tragedy hipster is somewhat fascinating.  On one hand, I understand the part of human nature that leads us to react strongly to the tragedies that seem closest to us (white people dying in a western country). On the other hand, I understand the part of human nature that, when faced with a tragedy getting attention, points out another tragedy that is *worse* in some way, or closer to home (Mizzou students seeing discrimination, versus people they don't know dying in Paris).

But are there really any takeaways from this?  In my mind:
  • The initial tragedy hipster tweets were simultaneously overshadowed and made more popular by the conservative reaction to them.
  • This whole situation created a lot of anger between groups, possibly undermining progress made in the Mizzou protests.
  • If you're prone to tragedy hipster behavior, be aware of optics: there will be a backlash, and many will attempt to make you look foolish.

One last thing: for fun, I wrote some code to auto-create wordclouds based on each topic. Here are the five for this analysis (a sketch of the loop follows this list).  A couple of notable topics:
  • Topic 1 is a mess of conservative symbols, due to the repeated tweets of some conservative commentators.  
  • Topic 3 is notable in being much different (as noted before) from the other four topics.
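The auto-wordcloud loop is roughly the following: a sketch that sizes words by each topic's term probabilities from the fitted model (lda5 from the earlier sketch).

library(wordcloud)

term_probs <- posterior(lda5)$terms            # topics x terms probability matrix
for (k in 1:nrow(term_probs)) {
  top <- sort(term_probs[k, ], decreasing = TRUE)[1:50]
  wordcloud(names(top), top, scale = c(3, 0.5), random.order = FALSE)
  title(paste("Topic", k))
}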

TOPIC 1

TOPIC 2

TOPIC 3 
TOPIC 4

TOPIC 5

Friday, November 13, 2015

Mizzou Protest Mining

I don't want to offer an opinion on the situation at Mizzou on this blog; however, interest in the situation sent me down the path of text mining tweets today. I downloaded the last two days of tweets with the hashtag #mizzou and used my normal methodology (download tweets, sentiment mining, topic mining).  Some takeaways (I'm just going to show data, not too many words after this):

  • Negative Polarity: Tweet polarity-wise, this is the most negative set of tweets I've ever mined.  More negative than government and education in Kansas, much more negative than the Royals.
  • Recently Negative: The tweets I analyzed and topics found are negative towards the protesters, but that could be because I pulled the last two days. If I looked earlier in the week I would likely see different results. Also if I used other hashtags (e.g. #concernedstudent1950, #blacklivesmatter, #millionstudentmarch) I would likely see much different results.
  • #TCOT Presence: I noted a large presence of TCOT (top conservatives on twitter) in the tweets I downloaded, especially in a couple of discovered topics.  The presence of this hashtag at such high rates tells me that conservatives have been widely using the #mizzou hashtag.  

DATA

Largely without comment.  First, I searched the hashtag #Mizzou and these were the top two results (for flavor).  




And a wordcloud of all the tweets.  The words center on "students," with a few hashtags like #blacklivesmatter and #millionstudentmarch. The rest of the terms focus on general racial words, with "college" terms also ranking highly.



Polarity 


A few months ago I compared the polarity of three hashtags I commonly analyze (Kansas Government, Kansas Education, and the Royals).



Today I did the same analysis for #mizzou.  Much higher percentage negative than even Kansas Government.
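The comparison itself is just the share of tweets scoring below zero in each set.  A sketch, with placeholder data frame names for each hashtag's downloaded tweets and syuzhet's default scorer standing in for whatever polarity method you prefer:

library(syuzhet)

pct_negative <- function(texts) mean(get_sentiment(texts) < 0)

# gov_tweets, ed_tweets, royals_tweets, mizzou_tweets are assumed tweet data frames
sapply(list("Kansas Government" = gov_tweets$text,
            "Kansas Education"  = ed_tweets$text,
            "Royals"            = royals_tweets$text,
            "#mizzou"           = mizzou_tweets$text),
       pct_negative)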



Finally, I created a topic model.  Generally speaking, each topic contained both positive and negative comments about the protesters, with the majority leaning negative.  The most positive towards the protesters was topic three.  If I had to name the discovered topics, they would be:


  1. Truth/Lies of the protesters.
  2. TCOT.
  3. Support of protesters.
  4. Free Speech and Football.
  5. Generally making fun of the protesters.


Here's my term printout from R:



And each topic with its top correlated tweet:


TOPIC 1: "Lies" of protesters

TOPIC 2: "TCOT" 




TOPIC 3: Support of protesters.



TOPIC 4: Free Speech and Football

TOPIC 5: Generally making fun of protesters.

Thursday, November 12, 2015

GOP DEBATE NUMBER.. I don't know four at this point?.. Whatever

Earlier this week I watched the Fox Business Republican debate.  Well, I watched about twenty minutes of it, and then decided to go back to a little woodworking project I have right now.  I am really starting to lose steam in watching these debates, because they're repetitive and boring.

Anyways, I think it was the fourth debate?  Something like that.  Who knows at this point?  I gave up on this one too early to really know what happened.  Hey let's try to extrapolate reality from Twitter, shall we?

DATA/ANALYSIS

Same methodology as before: I downloaded tweets about the debate in the hours following it, then ran them through various algorithms.  

Donald Trump is the candidate most mentioned, and at the center of everything, along with Ben Carson and Marco Rubio.  It's pseudo-sarcastic wordcloud time!  Bush and Cruz are in there too, but Fiorina is strangely missing from the cloud.  

Trump is still receiving mentions at a 2:1 rate over the next candidates (Rubio and Carson).  Rubio and Carson are virtually tied.  Fiorina has gone from the #2 candidate of interest to drawing about 10% of Trump's volume.  Trump and Bush have the lowest negative tweet percentage.




Though not generating a lot of her own traffic, Fiorina is generating a lot of ReTweets and Favorites per original tweet.  I calculated a new metric here: the number of ReTweets and Favorites per tweet about each candidate.  The interesting thing is that while Fiorina dropped off the map content-wise, the tweets about her get more attention than those about everyone else (excluding Trump).
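The metric is simple to compute from the downloaded tweets.  A sketch, assuming debate is a twListToDF() data frame of the debate tweets, with its retweetCount and favoriteCount columns:

attention_per_tweet <- function(df, name) {
  hits <- df[grepl(name, df$text, ignore.case = TRUE), ]
  (sum(hits$retweetCount) + sum(hits$favoriteCount)) / nrow(hits)
}

sapply(c("Trump", "Carson", "Rubio", "Fiorina", "Bush", "Cruz"),
       function(n) attention_per_tweet(debate, n))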



People refer to Carly Fiorina by her first name much more than any other candidate.  This is a fun metric I put together after someone alerted me to the different ways we address presidential candidates.  Essentially: the one female candidate in the race is much more likely to be referred to by her first name than the other top candidates.  The closest candidate is Jeb Bush, who at one time had a campaign slogan that was simply "Jeb!".  This could be because Fiorina's last name is somewhat hard to spell, or because she prefers Carly, but it is certainly interesting that the one woman in the race is addressed so differently.
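The first-name metric is computed the same way: of all tweets mentioning a candidate by either name, what share use the first name without the last?  A sketch against the same assumed data frame:

first_name_rate <- function(df, first, last) {
  has_first <- grepl(first, df$text, ignore.case = TRUE)
  has_last  <- grepl(last,  df$text, ignore.case = TRUE)
  sum(has_first & !has_last) / sum(has_first | has_last)
}

first_name_rate(debate, "Carly", "Fiorina")
first_name_rate(debate, "Jeb",   "Bush")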


Monday, November 9, 2015

Trump Should Have Hosted SNL Three Months Ago

Twice in the last week people have made arguments to me of the form "everyone I know thinks this, so it must be true."  The specific arguments were very different, though:
  • Everyone I know hates Sam Brownback, so he couldn't have won re-election without fraud.
  • Everyone I know is voting for Donald Trump, so he will certainly be our next President.
I have talked about the Brownback re-election enough on this blog, but Trump only once, and Trump was just on SNL, so why not take a look?

TRUMP ON SNL

I enjoyed Trump on SNL, not because I thought Trump was great, but because the non-Trump elements were good.  My top moments of the night:
  • Bobby Moynihan as Drunk Uncle: I have liked this character for a while now, as it reminds me of my interactions with.. well.. a lot of people.
  • Larry David heckling Donald Trump: Making light of the real bounty on interrupting the show that night, but also Larry David being hilarious.
  • Larry David as Bernie Sanders: I want your vacuum pennies!
  • Donald Trump as Music Producer: Actually wanted more Trump in this sketch and less of the Dad character.  Maybe a role Trump was born to play.  
That's somewhat humorous, but prior to the show I tweeted out:


My tweet was met with a couple of Trump supporters telling me I didn't know what I was talking about, usually via the logical fallacy of "Everyone I know..."  But what do the numbers really say?

CONCLUSION: THE DATA

The "everyone I know" logic is wrong for a couple of reasons:

  • It's almost always based on a sample of 20-30 people that someone talks to on a regular basis, statistically far too small for large-scale inference.
  • The sample of "everyone you know" is biased by the people you choose to associate with, the demographics of where you live, and how people filter what they say to their friends (yes, people lie about their politics to avoid giving offense, especially when faced with passionate people). 

Now that I have that out of my system, what do recent polls say about Trump's popularity?  More importantly, will Donald Trump win the Republican nomination and eventually the presidency?  I like the view below because, while individual polls can show bias, moving averages mitigate the bias of any individual poll.
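For what it's worth, the smoothing behind a view like this is just a rolling mean over individual poll results, which dampens any single pollster's house effect.  A sketch, assuming a polls data frame with a date column and each candidate's share:

library(zoo)

polls <- polls[order(polls$date), ]
polls$trump_smooth <- rollmean(polls$trump, k = 5, fill = NA)   # five-poll moving average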


The view shows us a few things:
  • Trump entered the race in April and showed a steady improvement in polling numbers through August, topping out at nearly 30%, with a 12% lead over the closest competitor.
  • Since early September, Trump's polling results have been more mixed, sometimes losing to other competitors.  His current effective polling lead has shrunk to less than 5%.

These patterns generally match our prior post on Trump's success: effectively, that as some candidates drop out of the race and others consolidate their bases, Trump will slowly fade back to the pack. The only positive for Trump in the near future is that his current closest competitor is facing a string of mini-scandals regarding grain storage and stabbing people.

Tuesday, November 3, 2015

Testing Opinion Polls: Do they really measure what they say they do?

**edited 2015-11-05 to include additional demographic information

Generally, I am not a fan of survey research; I prefer economic numbers or other data measured by something other than "calling people and asking them how they feel."  Polls can bring in a lot of bias: not just the normal sampling error that some statisticians are obsessed with measuring and testing against, but also response bias, sampling bias, biases from the way you ask questions, etc.  That's not to say that opinion polls and surveys are all worthless (if you want to do one, I know a guy, his name is Ivan, he's great with this stuff).

This is why, when developing political models, I only partially rely on recent opinion polls and also heavily weight historic voting trends.  Remember how I used a model with additional data to predict the Brownback re-election within the margin?  (I'll be bringing this up for at least another three years.)

A new poll has been making the rounds in Kansas and national media, making claims such as "Obama is more popular in Kansas than Brownback."  Keep in mind that Obama lost Kansas by 21 percentage points in 2012, and Brownback just won re-election in Kansas by about four percentage points.  This is obviously quite a claim, but how seriously should we take it?  Moreover, are there some basic steps we can use to vet how good opinion surveys are?

BACKGROUND: TYPES OF BIAS

So what makes a survey accurate versus inaccurate? The truth is, there are a lot of good ways to mess up a survey.  Here are the general ways surveys are incorrect:
  • Sampling error.  Many statisticians spend a majority of their careers measuring sampling error (this is part of the frequentist versus Bayesian debate, and a topic for another post).  Sampling error is the error caused simply by using a sample smaller than the entire population.  Even assuming the sample is randomly selected from the population, there will still be a small amount of error.  This is the (+/-) 3% you see in most public opinion polls, though it varies by the size of the sample (a quick margin-of-error calculation follows this list).
  • Sampling bias. Sampling bias is different than sampling error, though this is an issue that is somewhat difficult to understand.  This is the bias introduced through problems with the process of choosing a random sample of a population.  How does this kind of bias crop up?
    • Bad random number generators.
    • Bad randomization strategies (just choosing top 20% of a list).
    • Bad definition of population (list of phone numbers with people systematically missing)
  • Response Bias.  Response bias is what occurs when certain groups of recipients of a survey respond at different rates than others.  This occurs due to varying "propensity to respond" by demographic or opinion groups within a population.  Examples of how that occurs:
    • Women can be more willing to respond 
    • Older people (retirees) can have more time to respond
    • Minority groups can be less trusting of authorities, and less willing to respond
    • Certain political groups may not trust polling institutions and be less willing to respond
Once again, this is just a starter list of what can go wrong with taking samples within surveys, and I may add to this list as we go, but this is a good primer.
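As promised above, the (+/-) number attached to a poll is just the worst-case (p = 0.5) 95% confidence half-width for a proportion, and it depends only on the sample size:

moe <- function(n, p = 0.5) 1.96 * sqrt(p * (1 - p) / n)

moe(1000)   # ~0.031, the roughly 3% attached to most public polls
moe(638)    # ~0.039, matching the 3.9% reported by the survey discussed below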

DATA: THE FHSU SURVEY


Let's jump right into the study at hand.  The study was conducted by the Docking Institute of Public Affairs at Fort Hays State University.  The study can be found here.

Did the authors of this study consider the error and bias issues?  Absolutely, and they reference them in the study.  Here's a snapshot from their methodology.  



A few observations, relating back to the bias types from the first section:

  • First, they're referencing a sampling error of (+/-) 3.9% for the sample.  That means any number in the survey can be considered accurate to within 3.9%, assuming no other source of bias.  
  • Second, they make a passing reference to response bias, assuming it away.  But how can we test whether there is response bias?  Elsewhere in the paper they say that they contacted 1,252 Kansans, and 638 responded.  That means if the roughly 50% who responded are "different" demographically from the 50% who didn't respond, the conclusions of the survey could be misleading.
  • Third, there's no reference here to sampling bias, but they de facto address it elsewhere, talking about how they pulled the sample.  The report says: "The survey sample consists of random Kansas landline telephone numbers and cellphone numbers. From September 14th to October 5th, a total of 1,252 Kansas residents were contacted through either landline telephone or cellphone."

Looking at these three sources of potential bias: sampling error is simple math (based on sample size), response bias is assumed away by the researchers, and it's impossible to know whether the list of phone numbers used can create an unbiased sample of Kansans.  So can we be sure that this sample is an accurate representation of Kansans?

We can never be certain that this is a good sample free of response and sampling bias, but we can do some due diligence to determine if everything looks correct, specifically through fixed values testing.  In essence, there are some numbers that we know about the population through other sources (census data, population metrics, etc) that we can test to make sure the sample matches up.  Let's start with gender.

In the paper on page 39 there's a summary of respondents by gender, both for population and sample.  Keep in mind that the margin of error for this sample is 3.9%, so we would expect gender ratios to fall within this margin.  They do not (5% variance), meaning that the gender differences in this survey cannot be attributed to random sampling error.



Also on page 39 is a summary of sample and population by income bracket.  Reported income brackets are a bit fuzzier than reported gender, but the chart below shows how those line up.  Because there are multiple categories here, we can't do a simple (+/-) 3.9% check (technically a binomial test of proportions).  Instead we rely on a chi-squared goodness of fit test to determine whether the differences are due to sampling error or an underlying bias.  If the values were statistically similar, we would expect a chi-square value under 14.1.   The test finds that the results exceed the limits of sampling error and indicate an underlying bias in the results.  
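In R that goodness-of-fit check is a few lines.  The counts and population shares below are placeholders, not the report's actual page 39 numbers; substitute those to reproduce the test.  (The 14.1 critical value quoted above corresponds to the report's number of brackets; with a different number of categories the cutoff changes.)

# Placeholder values; swap in the income-bracket counts and census shares from page 39
sample_counts <- c(120, 180, 170, 168)        # hypothetical respondent counts per bracket
pop_shares    <- c(0.25, 0.30, 0.25, 0.20)    # hypothetical population proportions

chisq.test(sample_counts, p = pop_shares)
qchisq(0.95, df = length(sample_counts) - 1)  # sampling-error cutoff for this many brackets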


We also have fixed numbers for party affiliation in Kansas per the most recent registration numbers from the Secretary of State.  Those numbers are shown on the left side of the chart below: about 45% of Kansans are currently registered as Republican.  On page 39 of the survey we see the reported party affiliations of survey takers.  This analysis is a bit fuzzier, because the way people identify doesn't always match their actual party affiliation, but we wouldn't expect that to cause the level of deviation observed in the chart.  As shown below, more of the sample respondents identified as unaffiliated, with about 12 percentage points fewer Republicans and 5 percentage points fewer Democrats than the registration numbers.  This also suggests the sample was significantly less conservative than the registered voters of Kansas.




CONCLUSION


All of the data above speaks to how different the sample was than the general population of Kansas, but what are the takeaways from that?

  • The significant differences between population and sample demographics undermine the 3.9% margin of error, making the true error unknown and potentially much larger.  More concerning, the direction of the skews makes it appear that the survey was biased in a way that favored Democrats.
  • Significant differences between sample and population on measured values can indicate other underlying problems with the sample on unmeasured values.  We know that the sample was more female, affluent, and left-leaning than the population; could that mean the sample was also biased in a way that made it more urban?  Unknowable with the available data, but certainly problematic.
  • The researchers released the paper with metrics outside the margin of error and didn't talk about it.  This is the most troubling part, because statistical issues like this crop up all the time in research, but they can usually be tested away or at least quantified in their impact on the margin of error.
My last thought on this:  I agree that Sam Brownback likely has a low approval rate; however, the 18% approval figure, as well as other numbers reported in the survey, is likely an understatement of his true approval rate, given the bias presented.

**Added 2015-11-05 
After seeing more people cite this survey, I realized it has been conducted in Kansas since 2010, so I thought I would check whether the demographic skews were consistent.  They were, and some additional demographics were reported, specifically age, race, and education level.  I'm not going to do a full write-up on these, but they were also significantly inconsistent with population demographics.


Oh, and one last thing: I didn't talk about how the framing of questions can impact results, and this survey had one really wonky question in it that I'm not a fan of.  Specifically:

Thinking about what you paid in sales tax, property tax and state income tax together, compared to two years ago, the amount you pay in state taxes has increased, remained the same or decreased?

This question has received some press time, with the takeaway being that 74% of Kansans now pay more in taxes under Brownback's policies.  Because the question asks about the "amount" rather than the "rate," I would count myself as part of the 74%, but not because of Brownback's policy changes.  I pay more now because I make more money and live in a bigger house, which is actually an indication of success over the past four years.  I certainly don't think that is what the researchers or the press are purporting to measure.