Tuesday, October 13, 2015

Royals Twitter Performance: Tracking A Comeback

A couple of months ago, I posed an open question about how well sports team performance correlates with tweet volume and sentiment. I generally found low-level but significant correlations from last season to this season. But what about in-game correlations: do Twitter sentiment and volume shift during a game in response to what happens on the field? Yesterday's playoff game between the Royals and the Astros (a game with a major shift in performance late in the game) provided a great test.


I downloaded all the tweets using the hashtags #Astros and #Royals posted yesterday between noon and 5 p.m. Central time. I cleaned the tweets (stemmed, removed words, etc.) down to standardized data. I also ran sentiment mining on the tweets and grouped them into 20-minute interval buckets.
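For readers who want to try the bucketing step, here is a minimal sketch in Python. It is not the code used for this post: the `(timestamp, hashtag)` record layout, the function names, and the sample records are all assumptions made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

BUCKET = timedelta(minutes=20)  # interval width used in the post


def bucket_start(ts, origin):
    """Return the start time of the 20-minute bucket that contains ts."""
    n = (ts - origin) // BUCKET  # whole buckets elapsed since origin
    return origin + n * BUCKET


def volume_by_bucket(tweets, origin):
    """Count tweets per (bucket start, hashtag) pair."""
    counts = Counter()
    for ts, tag in tweets:
        counts[(bucket_start(ts, origin), tag)] += 1
    return counts


# Hypothetical sample records, not the real data set.
noon = datetime(2015, 10, 12, 12, 0)
sample = [
    (datetime(2015, 10, 12, 12, 5), "#Royals"),
    (datetime(2015, 10, 12, 12, 19), "#Royals"),
    (datetime(2015, 10, 12, 12, 20), "#Astros"),
    (datetime(2015, 10, 12, 14, 45), "#Astros"),
]
counts = volume_by_bucket(sample, noon)
```

Anchoring the buckets at noon keeps the boundaries on round clock times (12:00, 12:20, 12:40, ...), so a 2:45 tweet lands in the bucket that starts at 2:40.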


The first thing I looked at was the volume of tweets by time period throughout the afternoon, by each team's hashtag.

Note that the #Royals hashtag is generally "beating" the #Astros hashtag throughout the afternoon. Then there is an explosion of #Astros tweets around 2:40, followed by an explosion of #Royals tweets later in the afternoon. Here's what was going on in the game.

12:00-2:40 Pregame, and the first six innings of the game. The Royals and Astros play a tight first six innings: the Royals lead in the second inning, the Astros mount a slight comeback, and the Astros lead 3-2 going into the bottom of the seventh.

2:40-3:00 The Astros score three runs in the bottom of the seventh to take a commanding lead. At this point, teams with a four-run lead and two innings left win over 95% of the time. Here's a flavor of #Astros tweets during this time period.

3:00-5:00 The Royals mount a massive comeback in the eighth inning, scoring five runs, then add two more in the ninth and win the game.

Looking at the data this way, we can bucket it into three main phases of the game: prior to the Astros' "big lead", during the Astros' lead, and during the Royals' comeback and afterwards. Here's what that looks like by tweet volume.

This chart shows tweet volume tracking performance, with #Royals outperforming #Astros 2:1 in the later stages of the game.
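The three-phase comparison can be sketched the same way. This is only an illustration under assumptions: the phase boundaries (2:40 and 3:00) come from the timeline above, while the function names, phase labels, and sample records are hypothetical.

```python
from collections import Counter
from datetime import datetime, time


def phase(ts):
    """Map a tweet timestamp to one of the three game phases."""
    t = ts.time()
    if t < time(14, 40):
        return "before Astros big lead"
    if t < time(15, 0):
        return "Astros big lead"
    return "Royals comeback and after"


def volume_by_phase(tweets):
    """Count tweets per (phase, hashtag) pair."""
    counts = Counter()
    for ts, tag in tweets:
        counts[(phase(ts), tag)] += 1
    return counts


def ratio(counts, phase_name, a="#Royals", b="#Astros"):
    """Volume of hashtag a relative to hashtag b, or None if b has no tweets."""
    denom = counts[(phase_name, b)]
    return counts[(phase_name, a)] / denom if denom else None


# Hypothetical sample records, not the real data set.
sample = [
    (datetime(2015, 10, 12, 14, 50), "#Astros"),
    (datetime(2015, 10, 12, 15, 30), "#Royals"),
    (datetime(2015, 10, 12, 15, 45), "#Royals"),
    (datetime(2015, 10, 12, 16, 10), "#Astros"),
]
phase_counts = volume_by_phase(sample)
late_ratio = ratio(phase_counts, "Royals comeback and after")
```

Because the phases have very different lengths (160, 20, and 120 minutes), the ratio between hashtags within a phase is a fairer comparison than raw counts across phases.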

There's no clear explanation for #Royals outperforming early in the game, though there are a couple of testable hypotheses:
  • Away teams see more tweet volume, because people can't be "at the game."
  • Some teams just have better Twitter presences.
For one last test, I sentiment mined the tweets for positivity versus negativity. These results are less significant, but they do show that Royals fans were more negative while their team was losing and that positivity spiked during the comeback.
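The post does not say which sentiment tool was used, so the following is only a toy stand-in: a word-list scorer that counts positive words minus negative words per tweet and averages the result by group. The word lists, function names, and sample tweets are all made up for illustration, and a real analysis would use a proper sentiment lexicon or model.

```python
import re
from collections import defaultdict

# Tiny illustrative word lists (assumptions, not a real sentiment lexicon).
POSITIVE = {"win", "great", "love", "comeback", "amazing"}
NEGATIVE = {"lose", "bad", "hate", "awful", "terrible"}


def score(text):
    """Net sentiment: count of positive words minus count of negative words."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def mean_score_by_group(tweets):
    """Average net sentiment per group, e.g. per game phase."""
    totals = defaultdict(lambda: [0, 0])  # group -> [score sum, tweet count]
    for group, text in tweets:
        totals[group][0] += score(text)
        totals[group][1] += 1
    return {g: s / n for g, (s, n) in totals.items()}


# Hypothetical sample tweets, not the real data set.
sample = [
    ("losing", "This is awful, I hate this inning #Royals"),
    ("comeback", "What a comeback! Amazing win #Royals"),
]
means = mean_score_by_group(sample)
```

Averaging per group rather than summing keeps the comparison from being driven purely by tweet volume, which differs a lot between phases.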
