A few months ago on Twitter, I was criticizing R's ability to create quick, functional, and attractive maps. My essential criticism was this:
To create good map visualizations, I often have to pull my data out of the R statistical engine and merge it with a shapefile inside a GIS system like QGIS. QGIS is great, and I can create awesome visualizations in that system.
Other R users jumped in and encouraged me to check out some newer functionality, specifically that found in the ggmap package. This package is related to the popular ggplot2 package that I often use for creating graphical representations of data and models at work and on this blog. Using this new(ish) functionality, I was able to code the following map in just a few minutes, and with just a few lines of code.
Map: African American % by Precinct, Sedgwick County, Kansas
The code to create this map was straightforward. Here are my comments on the functions R offers for dealing with GIS data:
- readShapeSpatial is a function that ingests shapefiles into R. Shapefiles are a standard format for geographical data; for more information see here.
- fortify is a function that we can run against a shapefile to transform the geospatial data into an R data.frame. I recommend examining the output of this step: it tells you a lot about your dataset, as well as how geospatial data "works."
- @data is the attribute table of the shapefile (it generally holds demographics), which we can reference as shapefile@data and treat like a data.frame in R (see code below).
- qmap is the analog to ggplot2's qplot. It is a way to quickly create maps, without requiring much syntax or handling. A few things of note:
  - The function lets us lay Google Maps underneath our shapefile. The first parameter is the text "search" that Google Maps uses to center our map.
  - zoom: we also pick a zoom level, which sets how far zoomed in on our search area the map should be. I recommend just playing with this until it looks good.
- geom_polygon is a function that tells qmap what to do with the shapefile. You'll notice that the rest of the syntax looks much like that in ggplot2. If you need help with that type of syntax I recommend this cheat sheet.
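To make the fortify and @data bullets concrete, here is a minimal base-R sketch of the shape of the data involved. It does not read a shapefile: the two data frames are hand-built stand-ins (all coordinates, ids, and counts are made up), and merge() is swapped in for plyr's join so the example runs without extra packages.

```r
# Toy stand-in for fortify() output: one row per polygon vertex.
# All values are invented for illustration.
verts <- data.frame(
  long  = c(0, 1, 1, 0, 0,  1, 2, 2, 1, 1),
  lat   = c(0, 0, 1, 1, 0,  0, 0, 1, 1, 0),
  order = 1:10,
  group = rep(c("0.1", "1.1"), each = 5),
  id    = rep(c("0", "1"), each = 5),
  stringsAsFactors = FALSE
)

# Toy stand-in for shapefile@data: one row per polygon.
attrs <- data.frame(
  id         = c("0", "1"),
  POPULATION = c(1200, 800),
  BLACK      = c(300, 40),
  stringsAsFactors = FALSE
)

# Attach each polygon's attributes to every one of its vertices.
joined <- merge(verts, attrs, by = "id")

# merge() does not promise to keep the original row order, and
# geom_polygon() draws vertices in row order, so restore it explicitly.
joined <- joined[order(joined$order), ]
```

The result has one row per vertex with the polygon's attributes repeated on each row, which is the layout geom_polygon expects. plyr's join keeps the row order of its first argument, so the explicit re-sort is only needed with merge().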
#packages
library(maptools)
library(dplyr)
library(plyr)
library(geosphere)
library(ggmap)
library(ggplot2)

#grab my shapefile
shapefile <- readShapeSpatial("KLRD_2012VotingDistricts.shp")

#create id from rownames
shapefile$id <- rownames(shapefile@data)

#fortify shapefile, creates a dataframe of shapefile data
data <- fortify(shapefile)

#join data file to the @data which is the attribute table dbf element of the shapefile
data <- join(data, shapefile@data, by = "id")

#subset by FIPS for county level data
data <- subset(data, substr(data$VTD_2012, 3, 5) == "173")

#calculate % African American
data$AA_PERC <- data$BLACK / data$POPULATION

#run qmap for Sedgwick County Kansas
qmap('Sedgwick County Kansas', zoom = 10) +
  geom_polygon(aes(x = long, y = lat, group = group, fill = AA_PERC),
               data = data, colour = 'white', alpha = .6, size = .3) +
  scale_fill_gradient(low = "green", high = "red")
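
The subset and percentage steps can be tried in isolation on a small table. The sketch below uses hypothetical district codes and counts, and it assumes (based on the substr call above) that characters 3 to 5 of VTD_2012 hold the county FIPS code. It also guards against precincts with zero population, which would otherwise produce NaN.

```r
# Hypothetical precinct table; codes and counts are invented.
precincts <- data.frame(
  VTD_2012   = c("20173000010", "20173000020", "20091000010"),
  POPULATION = c(1000, 0, 500),
  BLACK      = c(250, 0, 50),
  stringsAsFactors = FALSE
)

# Keep only rows whose county FIPS (characters 3-5) is "173".
sedgwick <- subset(precincts, substr(VTD_2012, 3, 5) == "173")

# 0 / 0 is NaN in R, so mark empty precincts as NA explicitly.
sedgwick$AA_PERC <- ifelse(sedgwick$POPULATION > 0,
                           sedgwick$BLACK / sedgwick$POPULATION,
                           NA_real_)
```

Whether to show empty precincts as NA or drop them is a judgment call. With NA they stay in the data and can be given a neutral fill through the na.value argument of scale_fill_gradient.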