## Friday, March 17, 2017

A few weeks ago my grad school alma mater, the University of Kansas (KU), won its thirteenth consecutive Big 12 conference championship (I wasn't watching the game; I have better things to do). Much has been made of how large an outlier this streak is: if performance were random, the odds of winning thirteen straight would be about 1 in a trillion (not hyperbole, actual probability).

Along with this streak, there have also been some accusations that the University of Kansas receives preferential treatment in Big 12 basketball, has an unfair home advantage, or outright cheats to win. The home-court advantage is actually staggering: KU is 75-3 in conference home games over the past nine years, about a 96% win rate.

Half joking, I shot off a quick tweet commenting on both the conference win streak and the accusations. People quickly reacted: KU fans called me names, while Kansas State University fans generally agreed, though they were more willing to charge KU with cheating. The accusations and arguments raise an interesting question: Do certain teams have statistically different home-court advantages, and is the University of Kansas one of those teams?

## METHODOLOGICAL PREMISE

The main issue in calculating home-win bias is that teams get better or worse over time, so we can't simply look at home win rates over a series of years. We need a robust methodology to set expectations for home win percentages and compare those expectations to actual performance. As such, I devised a method to set expectations based on road wins, and applied it to each team for analysis.

The underlying premise of this analysis is to look at the ratio of home win percentage to road win percentage over time, calculate the advantage of playing at home for each team, and see how that advantage differs from other teams over multiple seasons. In detail:

• The theory here is that some "home advantages" (KU's, specifically) are higher than others due to natural advantages, outright cheating, or bush-league behaviors.
• To test whether home advantages differ, we need a methodology that controls for team quality independent of home performance, and measures home performance relative to that baseline. Enter predicting home wins using road wins.
• In aggregate, we would expect to be able to predict a team's home win percentage by looking at their road win percentage, as better teams should perform better in both venues. If a team has a systematic advantage on their home court, we would expect their home win percentage to over-perform the predictive model developed from road wins.
• I build a model to predict home wins from road wins for each team-season. The model is fit once per Big 12 school as a hold-out model: that school's seasons are left out of the fit, to remove the school's own influence on the numbers. Then I use the model to predict home wins for the held-out school, calculate the residuals on the hold-out, and move to the next school.
• The residuals here represent a Wins Above Expectation metric. We can do two things with the residual data:
1. Calculate the mean residual and distribution over time which indicates the overall home bias of the school (which schools systematically over-perform at home)
2. Determine the best and worst performances at home for individual schools.

The initial models performed well and show that road wins are fairly predictive of home wins, with an R-squared of 0.52 (variance accounted for) and an elasticity of 0.4 in the log-log specification of the model.
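The hold-out procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not the original analysis code: the school labels and win counts are made up, the function names `fit_loglog` and `holdout_residuals` are hypothetical, and it assumes every team-season has at least one home win and one road win (the log-log form is undefined at zero, and the post doesn't say how such seasons were handled).

```python
import math
from collections import defaultdict

# Hypothetical (school, season, home_wins, road_wins) rows.
# Illustrative values only, not the data used in this post.
rows = [
    ("A", 2015, 8, 6), ("A", 2016, 9, 7),
    ("B", 2015, 6, 3), ("B", 2016, 7, 2),
    ("C", 2015, 4, 1), ("C", 2016, 5, 2),
    ("D", 2015, 7, 4), ("D", 2016, 6, 4),
]

def fit_loglog(data):
    """Least-squares fit of log(home wins) on log(road wins).

    Returns (intercept, slope); the slope is the elasticity.
    """
    xs = [math.log(road) for _, _, _, road in data]
    ys = [math.log(home) for _, _, home, _ in data]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def holdout_residuals(data):
    """For each school, fit on all other schools, then score the held-out one."""
    residuals = defaultdict(list)
    for school in sorted({r[0] for r in data}):
        train = [r for r in data if r[0] != school]
        held_out = [r for r in data if r[0] == school]
        intercept, slope = fit_loglog(train)
        for _, season, home, road in held_out:
            expected_home = math.exp(intercept + slope * math.log(road))
            # Residual = actual home wins minus road-based expectation.
            residuals[school].append((season, home - expected_home))
    return dict(residuals)

residuals = holdout_residuals(rows)
# Mean residual per school: the average "wins above expectation" at home.
mean_boost = {s: sum(v for _, v in vals) / len(vals) for s, vals in residuals.items()}
```

A positive entry in `mean_boost` means the school won more home games than its road record would predict from the other schools' seasons; a negative entry means it won fewer.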

## INITIAL DATA

Starting with a visual inspection of the data, we can get an idea of the relationships between teams, home and away games, and seasons. First, a data point: teams perform far better at home (65.6% average win percentage) than on the road (34.4% average win percentage), winning nearly twice as often on their home court as on the road. But let's go back to our initial question: does KU win more often than other Big 12 schools at home? The answer here is yes.

KU outperforms all schools, with the closest neighbor being Missouri (who has a limited sample as they left the conference a few years ago).  We then see a cluster of schools with about 70% home win percentages, and a few bad schools at the end of the distribution (TCU, notably).  This indicates that KU is an outlier in terms of home performance, but is that because KU is a much better team, or indicative of other issues?

Road win percentage helps us answer this question: KU is the best road team in the conference by a large margin. Kansas is in fact the only team in the conference with a winning road record over the past decade, winning close to 75% of its games on the road. Even consistent KU rival and NCAA tournament qualifier Iowa State regularly wins fewer than 40% of its road games.

We know that KU wins a lot of games at home and on the road, but is there a way to determine if their home wins exceed a logical expectation? Before moving on to the modeling that can answer our question, we should check an underlying assumption: whether road wins and home wins correlate with one another.

The chart and a basic model provide some basic answers:

• Road wins are strongly correlated with home wins, with a correlation coefficient of 0.68.
• Few teams (3%) finish a season with more road wins than home wins.
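
For readers who want to run the same two checks on their own data, here is a rough sketch. The `(home_wins, road_wins)` pairs are invented for illustration and the helper name `pearson` is my own; nothing here reproduces the actual Big 12 numbers.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

# Hypothetical (home_wins, road_wins) pairs, one per team-season.
team_seasons = [(8, 6), (7, 3), (5, 2), (6, 4), (3, 4), (9, 7)]

home = [h for h, _ in team_seasons]
road = [r for _, r in team_seasons]

# Correlation between road wins and home wins across team-seasons.
r_value = pearson(road, home)

# Share of team-seasons with more road wins than home wins.
share_road_better = sum(r > h for h, r in team_seasons) / len(team_seasons)
```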

## MODEL RESULTS

With the initial knowledge that KU performs highly both at home and on the road, we can start our model building process. If you're interested in the detailed model, look at the methodology section above.

Using the model to calculate how teams perform relative to peers in terms of home and road wins, I calculated the average home-court boost, or the number of wins above road-based expectations, shown below:

Oklahoma State has the largest home-court advantage in the conference, followed by Iowa State, Kansas and Oklahoma. Each of these schools receives about a full extra win per season over expectations due to its home-court advantage. TCU has the worst home performance, followed by West Virginia and Baylor.

A further interesting (and nerdy) way to view the data is a boxplot for each school representing the last ten years of wins-over-expectation performance. This shows that some schools, like Kansas and Iowa State, have fairly tight distributions, representing consistent performance above road expectation. Other schools, like Kansas State and Baylor, have wide distributions, representing inconsistent home performance relative to road expectations.
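
The numbers behind a boxplot like that are just per-school summaries of the residuals. As a small sketch, assuming a dictionary of per-season wins-above-expectation values (the school labels and values below are made up, and `summarize` is a hypothetical helper), the mean captures the size of the home boost and the interquartile range captures how consistent it is:

```python
import statistics

# Hypothetical wins-above-expectation residuals, one value per season.
residuals_by_school = {
    "School A": [1.2, 0.9, 1.1, 1.0],    # consistently above expectation
    "School B": [2.5, -1.5, 0.5, -0.5],  # volatile from year to year
}

def summarize(values):
    """Mean, median, and interquartile range of one school's residuals."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {"mean": statistics.mean(values), "median": median, "iqr": q3 - q1}

summary = {school: summarize(vals) for school, vals in residuals_by_school.items()}

# Schools ordered from largest to smallest average home boost.
ranking = sorted(summary, key=lambda s: summary[s]["mean"], reverse=True)
```

A school with a high mean and a small IQR is a reliably strong home team; a large IQR flags the kind of year-to-year swings described for Kansas State and Baylor.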

Using the same scoring method we can score individual year performances, and determine which teams have the best and worst home versus road years.

Most interesting here is that K-State's home-court advantage was pretty amazing over the years 2014 - 2015. During those years, Kansas State was 15-3 at home and 3-15 on the road. At that time at least, it appears Kansas State's Octagon of Doom (I don't remember what it's really called, even though it's where I received my Bachelor's degree) was a far greater advantage than KU's Allen Fieldhouse.

## CONCLUSION

From the models developed we can reach several conclusions about the types of home advantages held by Big 12 teams:

• The home advantage for the University of Kansas at Allen Fieldhouse is high (about +1 game a year) but in line with several other top-tier Big 12 teams. This doesn't necessarily fit the storyline that KU cheats at home, but doesn't rule out other theories given by Kansas State fans: that KU cheats or gets unfair preferential treatment both at home AND on the road.
• The top home advantages in the Big 12 are: Oklahoma State, Iowa State, Kansas and Oklahoma. In fact both Oklahoma State ("Madison Square Garden of the Plains" .. seriously?) and Iowa State ("Hilton Magic"...) hold moderately larger home advantages than Kansas at Allen Fieldhouse.
• The worst home advantages in the Big 12 are: TCU, West Virginia, and Baylor.
• Some individual team-years show volatile performance, specifically Kansas State through 2014 - 2015.