I had promised a second post on gender ratios, but the other post was a bit more interesting, so it took precedence. After looking at the county level last Thursday, I drilled into the census block level. A census block is a much smaller geographic unit than a county, varying in geographical size, for which detailed census demographic data is available. What I found was kind of interesting.
FUN WITH MAPS
First, a map of gender ratio by census block. The map is shown below.
My main observation was this: larger, more rural census blocks tend to have more men, while smaller (city/town) blocks have more women. But there is a lot of variance in a map like this. What if we focus on a smaller area? Here's a map of Johnson County, Kansas only:
What causes this? I have a few a priori guesses, but nothing strong:
- Males tending to be more comfortable living alone in the woods.
- "Feminized" jobs (read: secretaries, nurses, teachers) tend to be concentrated in cities and denser areas.
- Older (urban core) communities tend to have older populations, so the underlying correlation between age and gender ratio takes effect.
- Women tend to move to town after their farmer-husbands die. (I've seen this in my own family.)
That's an ocular analysis of gender skews. Is there any statistical validity to it, though?
Can we model gender ratios using other data? If you want to skip the nerd stuff, here's your short answer:
It can be modeled, and significant predictors found, but the model isn't hugely "predictive."
What factors appeared to matter in predicting gender ratio? Here are our variables:
- Percent Female: Dependent variable. What we're predicting.
- Dense: Population density. The coefficient should be positive, as denser populations seem to lean female.
- Med_age: Median age of population. We know that older populations are more female, so this should be positive.
- Vac_Perc: Percent of housing units that are vacant. We assumed that housing availability mattered, and that the amount of vacant housing would be negatively correlated with the percentage of females.
- Renter_Perc: Another housing variable, this time the percent of housing units that are rented rather than owned. Likely positively related to percent female.
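For the nerds who want to play along, here is a rough sketch of how a model with these variables could be fit. To be clear about the assumptions: I'm showing a plain ordinary least squares regression with NumPy, which may not match the exact model behind the results in this post, the helper `fit_ols` is a made-up name, and the data is randomly generated placeholder data, not the census data. The coefficients it prints mean nothing; the point is only the shape of the setup.

```python
import numpy as np


def fit_ols(X, y):
    """Fit y ~ intercept + X by ordinary least squares.

    Returns (coefficients with the intercept first, R-squared).
    Hypothetical helper for illustration only.
    """
    # Prepend a column of ones so the first coefficient is the intercept.
    X1 = np.column_stack([np.ones(len(X)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return beta, 1.0 - ss_res / ss_tot


# Placeholder data: one row per (imaginary) census block.
# All numbers below are invented, NOT taken from the census.
rng = np.random.default_rng(0)
n = 500
dense = rng.uniform(0, 5000, n)      # Dense: people per square mile
med_age = rng.normal(40, 8, n)       # Med_age: median age
vac_perc = rng.uniform(0, 30, n)     # Vac_Perc: percent vacant housing
renter_perc = rng.uniform(0, 80, n)  # Renter_Perc: percent rented housing

# Invented relationship whose signs mirror the hypotheses listed above,
# plus a lot of noise.
pct_female = (
    45
    + 0.001 * dense
    + 0.08 * med_age
    - 0.05 * vac_perc
    + 0.02 * renter_perc
    + rng.normal(0, 5, n)
)

X = np.column_stack([dense, med_age, vac_perc, renter_perc])
beta, r2 = fit_ols(X, pct_female)

names = ["Intercept", "Dense", "Med_age", "Vac_Perc", "Renter_Perc"]
for name, b in zip(names, beta):
    print(f"{name:12s} {b: .4f}")
print(f"R-squared: {r2:.3f}")
```

With real data you would swap the placeholder arrays for the block-level columns, and you would want standard errors and p-values as well, which a statistics package reports and this bare-bones version does not.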
So do the models work? Yes and no. Here's the first model, on statewide data:
Because of the statistical power issue, and concerns that different counties might behave differently, I also ran the analysis for a sub-sample of counties. Generally the relationships hold up, but they are weaker (due to lower N in more rural counties). First, the county I live in, Johnson County:
Next a more rural county, Lincoln County:
A few easy bullet points for a conclusion:
- Gender ratios vary geographically, sometimes in very significant ways.
- At least part of these variations appear to be systematic, and correlated to other variables.
- We know at least some of the factors that determine gender ratio by county. However, the global model isn't extremely predictive, so there are likely many local factors at play.
- My friend (from the initial analysis) should spend her time in rural areas, with young populations, vacant housing, and few renters.