The best thing about having a relatively high-traffic blog is that I can just mention an analysis I’d like to see and then, more often than not, someone smarter than me goes and does it. The other day, I was wondering what would happen if you analyzed county-level population density in terms of political behavior. And now Dave Shor’s gone and done the math. Here’s one chart:

That’s the result of “a regression on 948 counties in 10 states to model Kerry’s two-way share of the vote” and it’s not just a line on a page, it’s a statistically significant correlation. But it’s not a huge effect: “if a SimCity-like God multiplies a county’s population by 3, Kerry’s county-level share of the vote would increase by about six tenths of a point.” Then there’s this:

It turns out that “overall density isn’t really as important as relative density within the state” and so “by looking at a [county’s] voting population as a fraction of the overall electorate of the state, we see that even after accounting for overall density, it’s better to be a democratic candidate in the densest city in Iowa [than] to [be] in the second most dense city in New York.” Interesting, and not really what I expected.
October 17th, 2008 at 11:24 am
Once again, MY has no idea what to do with a chart. Look at the top one. The linear fit shows Density as the dependent variable, and Kerry share as independent — so increasing the Kerry share by one point nets an increase in density of 0.0569 units of density. Admittedly, this is totally backwards, and the lack of density units is inexcusable.
Flip the axes and invert the equation to see that the coefficient is 17.6 points per unit density. So tripling the density adds 53 points to Kerry’s tally. To confirm, look at the red line — it goes from coordinates (5%, 2) to (85%, 6.5) so multiplying the density by 3.5 nets an increase of 80%.
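For anyone who wants to check the arithmetic in the comment above, here is a minimal sketch that reproduces the numbers as the commenter states them. The variable names are made up for illustration, and the snippet only redoes the commenter's calculation. It does not settle whether flipping the axes is a valid way to read the fit, or whether the density axis is on a log scale.

```python
# Coefficient the commenter reads off the chart's fitted line.
coef = 0.0569

# "Invert the equation": points of Kerry share per unit on the density axis.
inverted = 1.0 / coef                 # about 17.6
commenter_gain = 3 * inverted         # the commenter's "53 points"

# Endpoints of the red line as the commenter reads them: (Kerry share %, density axis).
x0, y0 = 5.0, 2.0
x1, y1 = 85.0, 6.5

share_gain = x1 - x0                      # change in Kerry share, in points
axis_ratio = y1 / y0                      # ratio of the two density-axis values
slope_from_line = (y1 - y0) / share_gain  # slope implied by the two endpoints

print(f"inverted coefficient: {inverted:.1f}")
print(f"3 x inverted:         {commenter_gain:.0f}")
print(f"share gain:           {share_gain:.0f} points")
print(f"axis ratio:           {axis_ratio}")
print(f"slope from endpoints: {slope_from_line:.4f}")
```

The ratio of the two axis readings comes out to 3.25 rather than 3.5, and the slope implied by the endpoints is close to the 0.0569 quoted from the chart.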
October 17th, 2008 at 11:27 am
Looking at his webpage, he doesn’t have a very comprehensive set of control variables so I wouldn’t necessarily take this as the gospel. Age and education are obvious omissions since these tend to be correlated with both urban living and Democratic voting. Besides, a better design would use repeated observations over time so we could control for unobserved factors that influence particular locations. (Yes, I’m complaining about the quality of the free ice cream.)
If we take these results at face value I think they imply that people choose where to live in a state partly on the basis of their preference for “being left alone.” People who don’t want to live close to lots of other people choose low density areas. These people are also attracted to Republican “less government” appeals. So people in less dense areas of a dense state have chosen to live where they do because they want to be left alone and are therefore more likely to vote Republican.
I think Matt is disappointed because he hoped that density would _cause_ Democratic voting, perhaps as people were forced to interact and became more civic-minded. Instead it appears that choosing to live in a city simply reveals underlying preferences that are also reflected in voting.
October 17th, 2008 at 11:30 am
…multiplying the density by 3.25…gack.
October 17th, 2008 at 11:50 am
Further to DCreader’s point @ 2, you could imagine a converse “leave me alone” effect reinforcing this from the Democratic side. Imagine that you’re openly gay, atheist, liberal or whatever: are you going to choose to live in your state’s most or least densely populated area? Where do you stand a better chance of finding a community that accepts you as you are?
October 17th, 2008 at 12:18 pm
DCReader,
I started with a bunch of demographic variables, but eliminated the ones that were not significant.
October 17th, 2008 at 12:20 pm
I suspect that the plot is actually showing the log of population, rather than density. That would explain the within-state improved correlation, since counties within a state are much more uniform in size.
October 17th, 2008 at 12:24 pm
Matt B,
Don’t be so quick to critique Matt’s graph-reading skills. If you look at the details, you’d see that the graph shows Log(Density) and Log(VoteShare).
October 17th, 2008 at 12:44 pm
David, as the first commenter notes, your axes are inverted on the graph here. Also, what’s with the dummy variables for the separate states?
October 17th, 2008 at 1:00 pm
“Flip the axes and invert the equation”
CAUTION: you can’t just flip the axes to find what the regression line would be if you fit it in the other direction. In general doing so would give you a line that would appear much too steep. Consult an econometrics text for details (look for “reverse regression.”)
Also, this is definitely an example of where CORRELATION DOES NOT NECESSARILY IMPLY CAUSATION.
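The reverse-regression caution above can be checked numerically. The sketch below uses synthetic data (nothing here comes from the county dataset) and fits a line in both directions with NumPy. Algebraically inverting the x-on-y fit gives a slope steeper than the direct y-on-x fit by a factor of 1/r², where r is the sample correlation.

```python
import numpy as np

# Synthetic data only: y depends on x with a true slope of 0.5 plus noise.
rng = np.random.default_rng(0)
n = 1000
x = rng.normal(size=n)
y = 0.5 * x + rng.normal(size=n)

# Ordinary least squares in each direction.
slope_y_on_x = np.polyfit(x, y, 1)[0]   # regress y on x
slope_x_on_y = np.polyfit(y, x, 1)[0]   # regress x on y

# "Flip the axes and invert the equation": invert the x-on-y slope.
flipped = 1.0 / slope_x_on_y

# Sample correlation; the two OLS slopes multiply to r squared.
r = np.corrcoef(x, y)[0, 1]

print(f"direct slope (y on x):    {slope_y_on_x:.3f}")
print(f"flipped slope (1/x on y): {flipped:.3f}")
print(f"r squared:                {r**2:.3f}")
print(f"direct / flipped:         {slope_y_on_x / flipped:.3f}")
```

The last two printed values agree, so the weaker the correlation, the more the flipped line overstates the slope. The two only coincide when the points fall exactly on a line.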
October 17th, 2008 at 1:14 pm
BobbyPop,
Just noticed the graph. The axes are flipped for some reason; I’ll just make some conditional expectation graphs with Stata.
As for the state dummies: The ones I used are pretty significant, no?
October 17th, 2008 at 1:39 pm
Alright, new graphs posted.
October 17th, 2008 at 4:34 pm
David,
Yep, significant all right, but I’m just curious about why you put them in, as well as their effect sizes. I get the overall point; I’m just a geeky stats guy.
October 17th, 2008 at 7:10 pm
Is the density axis in common logs or natural logs?
October 17th, 2008 at 7:24 pm
BobbyPop,
Most likely too late to answer your question, but: I was going to leave it out, but the Texas dummy was so monstrously significant that I couldn’t ignore it.
This is mainly because of the huge number of counties that Texas has (it goes beyond their size; I think one of their counties had 56 people).
And once I added Texas, it seemed arbitrary to not include other statistically significant state dummies. All together, only Florida, Texas, and Colorado passed that test.
Thanks for the question, and sorry for the delay. Of course, comments left on my website itself would be answered sooner.
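For readers unfamiliar with state dummies, here is a minimal sketch of the general idea, assuming a simple linear model of vote share on log density plus a single 0/1 state indicator. All numbers are synthetic and invented for illustration. The names `log_density`, `is_texas`, and `share` are hypothetical, and this is not the actual specification or data behind the charts.

```python
import numpy as np

# Synthetic counties: none of these values come from the real dataset.
rng = np.random.default_rng(1)
n = 300
log_density = rng.normal(4.0, 1.5, size=n)        # made-up log density
is_texas = (rng.random(n) < 0.3).astype(float)    # hypothetical 0/1 state dummy

# Made-up vote share with a built-in state shift of -8 points.
share = 40.0 + 2.0 * log_density - 8.0 * is_texas + rng.normal(0.0, 3.0, size=n)

# Design matrix: intercept, log density, state dummy.
X = np.column_stack([np.ones(n), log_density, is_texas])
coef, *_ = np.linalg.lstsq(X, share, rcond=None)
intercept, slope, texas_shift = coef

print(f"intercept:   {intercept:.2f}")
print(f"slope:       {slope:.2f}")
print(f"state shift: {texas_shift:.2f}")
```

The dummy's coefficient is the estimated shift in the intercept for counties in that state, holding log density fixed. That is why a state with many unusual counties can show up as strongly significant.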
October 17th, 2008 at 10:26 pm
More totally incomprehensible graphs. From the text of the article: “county-level population density” vs. “political behavior”. Actual x axis label: “KerryTwo”. Actual y axis label: “l_density”. With some (unitless?) numbers. Nowhere is there a hint at what is actually being discussed. What the hell are we even talking about here?
Tufte offers a course in the graphical presentation of information… which I think both MY and the author of the source article should take.
October 18th, 2008 at 1:55 am
Correlation doesn’t imply causation. A study like this isn’t intended to answer a definitive question, but to help us determine what’s worth studying further. This study doesn’t provide us the answers; it provides us the questions.
I’ve heard a few good ideas already. Run the numbers on those, see if they pan out. Rinse and repeat until we know what’s worth an empirical study, the real source of answers.