The Canadian Curmudgeon: Analyzing Some Data About the USA Congress

The Sunlight Foundation recently evaluated the speech levels of members of the USA Congress. More precisely, they measured the school grade levels of the speech. There are some interesting things that can be extracted from the data, but (of course) not what has been reported in the media.

Before doing some analysis, I think it's worth saying a few things about the sources of the data. The measurements were made using what appears (to me) to be a simplistic formula applied to an unreliable data source, and as some (whose scores were on the lower end) have pointed out, the speech level used by a speaker is not necessarily the same as their capabilities. I'll say more about all this later (near the end of this post), but for now, let's assume that the term “grade” refers to something useful.

The raw data for each Member in the 112th Congress include, among a few other things, their name, their party affiliation (Democrat, Republican, Independent, NA), the number of years they have served in Congress, their grade level for the 112th Congress, their grade level for their service since 1996, and their “DW1” score (this is a number between -1 and 1, with -1 supposedly representing liberal and 1 conservative). The Sunlight Foundation provides a few visualization tools, but I couldn't find anything useful, and (more significantly) I couldn't find any good analysis, so, naturally, I did my own.

The Speech Of Congress

I found lots of news articles describing some of the results of the analysis; they all state an average grade level of 10.6 for the 112th Congress, presumably basing this on the Sunlight Foundation's own blogs. That's incorrect. The actual average¹ is 11.248. (That's a pretty fucking basic error, but let's not dwell on that.)

Here are some numbers for the 112th USA Congress²:

Table 1: Speech Grade Level of the 112th USA Congress
group	mean	median	stdev	skew	kurtosis
All	11.248	11.300	1.460	0.293	2.641
House	11.262	11.335	1.451	0.308	3.476
Senate	11.185	11.037	1.503	0.247	-0.470
Democrats	11.510	11.628	1.324	-0.352	0.190
Republicans	11.023	11.038	1.534	0.774	4.494
House Dems.	11.563	11.683	1.277	-0.541	0.618
House Repubs.	11.025	11.045	1.536	0.856	5.432
Senate Dems.	11.314	11.177	1.480	0.201	-0.497
Senate Repubs.	11.014	10.980	1.536	0.360	-0.284

The largest between-group difference in means is between House Democrats (11.563±1.277) and Senate Republicans (11.014±1.536). This is a Mahalanobis distance of 0.389; in other words, they differ by slightly more than one third of one standard deviation. The main thing to get out of this is that, even though the population means are a bit different, there is a very large overlap between Democrats and Republicans (see graphs and analysis below); if all you know about a Member of Congress is their party affiliation, you can't predict their grade level to any useful extent, nor can you predict party affiliation from grade level.
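The 0.389 figure can be reproduced from Table 1 alone. Here is a minimal sketch, assuming the pooled standard deviation is the root of the plain (unweighted) average of the two group variances. That pooling choice is an assumption, picked because it matches the quoted value to three decimals; the function name is illustrative.

```python
import math

def standardized_mean_gap(mean_a, sd_a, mean_b, sd_b):
    """One-dimensional Mahalanobis distance between two group means.

    Assumption: the two variances are pooled by a plain, unweighted average.
    """
    pooled_sd = math.sqrt((sd_a ** 2 + sd_b ** 2) / 2)
    return abs(mean_a - mean_b) / pooled_sd

# House Democrats vs. Senate Republicans, values taken from Table 1
gap = standardized_mean_gap(11.563, 1.277, 11.014, 1.536)
print(round(gap, 3))  # 0.389
```

Weighting the variances by group size instead would give a somewhat different number, so treat this as one plausible reading of the calculation.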

Let's dig a bit deeper. There are two (perhaps surprising) differences:

The skewness in the House of Representatives is large, and in opposite directions for the two parties. (This is not the same as saying “The House of Representatives is skewed” — skewness is a statistical term.)
The Republicans in the House of Representatives are hugely kurtosified. (I made up that word; kurtosis is also a statistical term.)
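For anyone who hasn't met these two terms, here is a small sketch of how skewness and (excess) kurtosis are computed from central moments. It assumes the simple moment-based definitions; which estimator produced Table 1 isn't stated, so small-sample corrections may differ. The numbers in the example are hypothetical, chosen only to show what a single outlier does.

```python
def skew_and_excess_kurtosis(values):
    """Moment-based skewness and excess kurtosis (a Gaussian scores 0 and 0)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # variance
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2 - 3.0

# Hypothetical grades: a symmetric sample, then the same sample plus one outlier
symmetric = [9.0, 10.0, 11.0, 11.0, 12.0, 13.0]
with_outlier = symmetric + [20.0]

skew_sym, kurt_sym = skew_and_excess_kurtosis(symmetric)
skew_out, kurt_out = skew_and_excess_kurtosis(with_outlier)
print(skew_sym, kurt_sym)
print(skew_out, kurt_out)
```

The symmetric sample has zero skewness; one far-away value pushes both the skewness and the kurtosis up sharply, which is worth keeping in mind when reading the House Republican row.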

It's helpful to draw some graphs³. Normally these would be drawn as histograms, but I think you can more easily see some things if they're drawn as (estimated) probability density functions (a.k.a. probability distributions) instead. (You can get a density function by dividing the numbers in the histogram by the width of the bins and by the number of items in the sample, but I've gone beyond that and smoothed out the graph somewhat.) In addition to the density functions, I'm plotting an “error bar” (at the bottom of the graph) which is the mean plus and minus one standard deviation (so it includes roughly 70% of the data).
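The histogram-to-density step described above can be sketched as follows. The smoothing shown is a Gaussian kernel estimate, which is an assumption (the actual smoothing method isn't stated), and the sample grades, bin range, and bandwidth are all hypothetical.

```python
import math

def histogram_density(values, lo, hi, bins):
    """Histogram heights rescaled so the bars have total area one.

    Assumes every value satisfies lo <= value <= hi.
    """
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so that a value equal to hi lands in the last bin
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / (len(values) * width) for c in counts], width

def gaussian_kde(values, bandwidth):
    """Smoothed density: one Gaussian bump of the given width per data point."""
    norm = len(values) * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm

    return density

# Hypothetical grade levels, for illustration only
grades = [9.8, 10.4, 10.9, 11.1, 11.3, 11.3, 11.6, 12.0, 12.7, 13.5]

heights, width = histogram_density(grades, lo=9.0, hi=14.0, bins=5)
print(sum(h * width for h in heights))  # total area under the histogram bars

smooth = gaussian_kde(grades, bandwidth=0.5)
print(smooth(11.3))  # smoothed density near the middle of the sample
```

Both versions integrate to one, which is the only property the vertical scale needs.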

Here's the entire Congress:

Figure 1: Probability Distribution of Grade Levels of the 112th Congress

(The outlier is Rep. Dan Lungren (R), CA, whose grade is 20.466; his grade over his seven years of service is 16.014.)

Here are the two chambers, the House and Senate:

Figure 2: Probability Distributions of Grade Levels of the 112th Congress by Chamber: House (H) and Senate (S)

Here are the two parties⁴, Democrats and Republicans:

Figure 3: Probability Distributions of Grade Levels of the 112th Congress by Party: Democrats (D) and Republicans (R)

As I said previously, there's a small difference in mean, but lots of overlap. A useful measure with which to compare probability distributions⁵ is the “probability of confusion”; this is twice the expected misclassification rate (if the classifier algorithm is simply “pick the one with the larger value”). For a coin-toss classifier, twice the expected misclassification rate is 100%; for the grades of the Democrats and Republicans, the probability of confusion is 83%, not much better than a coin. (You can do slightly better with a Bayes classifier, because you know the number of Democrats and Republicans, but not a whole lot better; the priors are 45% and 55% respectively.)
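One way to read the “probability of confusion” is as the area shared by the two density curves: with equal priors, a classifier that picks whichever density is larger at a given grade is wrong half that often. The sketch below takes that reading (an assumption) and substitutes Gaussians with the Table 1 means and standard deviations for the smoothed densities, so it should land near the 83% quoted above without matching it exactly.

```python
import math

def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def probability_of_confusion(pdf_a, pdf_b, lo, hi, steps=20000):
    """Area shared by two densities: the integral of min(pdf_a, pdf_b).

    With equal priors, picking the class with the larger density errs
    half this often, so the shared area is twice the error rate.
    """
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx  # midpoint rule
        total += min(pdf_a(x), pdf_b(x))
    return total * dx

def democrats(x):
    return normal_pdf(x, 11.510, 1.324)  # Gaussian stand-in, Table 1 values

def republicans(x):
    return normal_pdf(x, 11.023, 1.534)  # Gaussian stand-in, Table 1 values

overlap = probability_of_confusion(democrats, republicans, lo=0.0, hi=25.0)
print(round(overlap, 2))
```

Two identical distributions give an overlap of 100% (the coin toss), and two distributions that never overlap give 0%.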

Here are the House, Senate, Democrats, and Republicans, separately:

Figure 4: Probability Distributions of Grade Levels of the 112th Congress by Party and Chamber: House Dems. (HD), House Repubs. (HR), Senate Dems. (SD), Senate Repubs. (SR)

In addition to comparing the curves to each other as in Figure 4, it is illustrative to compare them to Gaussian distributions with the same mean and standard deviation. (Sorry, Mr. Lungren, I changed the scale so that we can see things more easily, and you dropped off the edge of the graph.)

Figure 5: Probability Distributions of Grade Levels of the 112th Congress by Party and Chamber, compared to Gaussian distributions

There are a few things I think worth noting:

The Senate Democrats (SD), unlike the other three groups, are surprisingly close to uniformly distributed⁶.
There's a noticeable gap of Republican Senators (SR) who are just a bit above average, and an excess substantially above average; it's like they studied extra hard or something.
On the other hand, it seems like the House Republicans (HR) are throwing out anyone more than one standard deviation above average.
And it's like the House Democrats are throwing out anyone just slightly below average.
Hmm… maybe the Senate Democrats are just throwing out anyone average.

A Bit of Multi-Variate Analysis

Consider the following eight attributes of the Members of Congress (all of these from the same dataset):

Y = Years in Congress
P = Party (Democrat or Republican)
C = Chamber (House or Senate)
G = F-K Grade in the 112th Congress
G_a = F-K Grade throughout membership (back to 1996)
E = East (longitude of the Member's state)
N = North (latitude of the Member's state)
D = DW1 score

Here are the strongest correlations among these variables:

95% between P and D (Party and DW1 score)⁷. We'll see that in the colour-coding in Figure 7 below.
79% between G and G_a. One might be surprised that this is so low; however, rather than suggesting any sort of change over time, this is merely an artefact of (chronologically reversed) regression towards the mean⁸.
-31% between Y and DW1. If one were to naively deduce a (weak) causative relationship, it would be either that liberalism causes longer terms, or that time in Congress causes liberalism. The reality, though, is much less interesting: lots of so-called “freshman” members (those serving their first term) are Republican with a positive DW1 score, and this is contributing greatly to the -31%. I'll say a bit more about that below.
-27% between G_a and DW1.
26% between Y and G_a.
-26% between Y and P.
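The correlations involving Party in the list above need a numeric coding for a two-valued variable. Here is a short sketch of a Pearson correlation showing that the particular codes don't matter beyond the sign. The six members and their DW1 scores are hypothetical, and `pearson` is an illustrative helper, not anything from the dataset's tooling.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical members: (party, DW1 score)
members = [("D", -0.52), ("D", -0.41), ("D", -0.33),
           ("R", 0.31), ("R", 0.47), ("R", 0.62)]
dw1 = [score for _, score in members]

coded_01 = [0 if party == "D" else 1 for party, _ in members]
coded_flipped = [7 if party == "D" else -3 for party, _ in members]  # arbitrary codes, order reversed

r_01 = pearson(coded_01, dw1)
r_flipped = pearson(coded_flipped, dw1)
print(r_01, r_flipped)
```

The two coefficients have the same magnitude and opposite signs, because any two-valued coding is just a linear rescaling of any other.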

The correlations continue slowly decreasing from there. Principal components analysis shouldn't really be applied to these variables (they are incompatible in terms of “units”), so I've normalized them (to zero mean and unit standard deviation) and then applied principal components. The eigenvalues are (2.49, 1.52, 1.06, 1.02, 0.871, 0.792, 0.198, 0.0460); there's nothing really notable there, it's just a big blob. Here's the graph showing the -31% correlation between Y and DW1:

Figure 6: DW1 Score (DW1) and Years of Service (Y) of the 112th Congress

The solid dense line in the top left of this graph is the freshman Republicans. What if we remove the freshmen? The correlation coefficient is then -17%, and it hovers around there as we remove a few more of the earliest years. But, there's clearly a bimodal distribution, so extracting anything useful will require more sophisticated tools than correlations and regression (and, likely, more extensive data than are in this dataset).

The Conservative Bias

Figure 6 shows a nice bimodal distribution in the DW1 score. Let's take a closer look (recall: -1 is liberal, 1 is conservative) at how that relates to Flesch-Kincaid Grade Level:

Figure 7: Grade Level (G) and Liberal/Conservative Score (DW1) for the 112th Congress

Just from looking at this plot, you can see that, whatever linear regression model you find, it won't be very useful. This didn't stop the Sunlight Foundation from doing this; indeed, they went beyond that (Fig. 2 in their blog), fitting what appears to be at least a fourth-order polynomial to each half of the data (Democrat and Republican). Either that, or they used some bullshit neural net; I'm not sure which is stupider. (I think their Fig. 2 shows the collective data from 1996, not just the data from the 112th Congress as the caption suggests.) The linear fit to the Democrats' data is G = 11.629 + 0.293 DW1, R²=0.0009, and the linear fit to the Republicans' data is G = 12.399 − 2.904 DW1, R²=0.09. (A small R² means that most of the variability in the data is due to stuff you haven't included in your model, or to noise. In other words, these linear models are almost useless.) The linear fit to the entire data set is G = 11.294 − 0.673 DW1, R²=0.046 (also useless).
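To make the R² point concrete, here is a sketch of an ordinary least-squares line fit and its R² on synthetic data. The data are made up: a weak slope buried in noise, loosely modelled on the Republican fit quoted above. The coefficients, noise level, and sample size are assumptions for illustration, not values from the dataset.

```python
import random

def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))  # unexplained
    ss_tot = sum((y - mean_y) ** 2 for y in ys)                   # total
    return a, b, 1.0 - ss_res / ss_tot

# Synthetic data (hypothetical): a real but weak trend, swamped by noise
rng = random.Random(0)
dw1 = [rng.uniform(0.2, 0.9) for _ in range(240)]
grade = [12.4 - 2.9 * x + rng.gauss(0.0, 1.5) for x in dw1]

a, b, r_squared = linear_fit(dw1, grade)
print(a, b, r_squared)
```

The fit recovers a negative slope, yet R² stays small: the line is real but explains little of the spread, which is the situation described above.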

There is, though, something hidden in the data; specifically, in the DW1 data all by itself. Let's do another set of probability density functions:

Figure 8: Probability Distributions of DW1 score of 112th Congress by Party and Chamber: House Dems. (HD), House Repubs. (HR), Senate Dems. (SD), Senate Repubs. (SR)

At first glance, it might appear that the House Democrats are radically liberal. However, a closer inspection shows that it is not the House Democrats that are shifted to the left, but rather, the Senate Democrats are shifted to the right (and so are the House Democrats).

Here are some more numbers:

Table 2: DW1 scores of the 112th USA Congress
group	mean	median	stdev	skew	kurtosis
All	0.070	0.255	0.463	-0.083	-1.573
House	0.080	0.274	0.474	-0.158	-1.592
Senate	0.044	-0.160	0.412	0.269	-1.423
Democrats	-0.407	-0.404	0.133	0.044	-0.169
Republicans	0.473	0.469	0.160	0.428	0.465
House Dems.	-0.429	-0.437	0.132	0.169	-0.166
House Repubs.	0.479	0.475	0.157	0.378	0.251
Senate Dems.	-0.326	-0.340	0.100	0.454	0.245
Senate Repubs.	0.445	0.415	0.175	0.744	1.670

There are really just a few things that need to be said:

There is a conservative bias in the USA Congress.

The average DW1 is positive.
The most liberal Senator has a score of -0.494. That's only half-way liberal. The most conservative Senator has a score of 0.941. The most liberal House member has a score of -0.744. The most conservative House member has a score of 0.988.
Of the four (HD, HR, SD, SR), HD has the most liberal average, -0.429. Notice that 0.429 is less than 0.445; both of the Republican groups are more extreme than the most extreme of the Democrat groups.

The Senate Democrats need to work harder.

Seriously; an average of -0.326? Is that the best they can do?
And a standard deviation of 0.100? Get some fucking diversity!

John Boehner's DW1 score is "NA" (I suppose he blew the fucking meter) so I had to throw him out of my analysis. Too bad I can't also throw him out of Congress.

Regarding The Data

So, what is this Flesch-Kincaid Grade Level, anyway? It's an extremely simplistic formula, without any indication of confidence levels (as far as I could find with minimal effort). It's defined by the average number of words per sentence (WPS) and the average number of syllables per word (SPW); the formula is G = 0.39×WPS + 11.8×SPW − 15.59. In other words, longer sentences and longer words are positive. (The previous sentence, for example, scores G=8.37.)
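The formula is easy to check by hand. The sketch below applies it to the sentence scored above, using a manual count of 10 words and 17 syllables in one sentence. The count is mine and therefore an assumption; an automated syllable counter might disagree on some words.

```python
def fk_grade(words, syllables, sentences):
    """Flesch-Kincaid Grade Level from raw counts."""
    words_per_sentence = words / sentences
    syllables_per_word = syllables / words
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59

# "In other words, longer sentences and longer words are positive."
# Hand count (assumption): 10 words, 17 syllables, 1 sentence.
score = fk_grade(words=10, syllables=17, sentences=1)
print(round(score, 2))  # 8.37
```

Note how heavily the syllable term weighs: moving from 1.7 to 1.8 syllables per word adds more than a full grade level.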

Is this measuring anything useful? I'll say a few things, without doing any work to justify them.

I can believe that there is a statistical correlation between G and the level of education one attains; after all, you tend to learn more words as you get older, and you tend to learn faster in school (where you're forced) than out, so time in school generates a larger vocabulary, and a larger vocabulary contains relatively more polysyllabic words (you quickly run out of monosyllabic words). Spending time in school increases your average syllables per word. Similarly, you learn more complicated grammatical constructions, allowing you to form longer sentences. In other words, I can believe that there is a causative relation: time in school tends to increase one's Flesch-Kincaid Grade Level.

However, there are other factors which go into the F-K G. You can, without much effort, convert a lengthy passage of text into one very long sentence, merely by the appropriate use of semicolons, “and”s, etc. With a tiny bit more work (like moving phrases around), you can make it look like that's not what you did. Similarly, you can replace words with lengthier equivalents. (See, I just gained a syllable by using “lengthier” rather than “longer”.) There are, as statisticians would say, confounding variables. That's why it would be useful to have some measure of confidence. (Of course, a confidence measure implies there's something real being measured, as opposed to some bullshit that somebody made up to get some Navy funding, but whatever.)
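The semicolon trick is easy to quantify. In this sketch the same hypothetical passage (30 words and 45 syllables, numbers invented for illustration) is scored once as three sentences and once as a single sentence glued together with semicolons.

```python
def fk_grade(words, syllables, sentences):
    """Flesch-Kincaid Grade Level from raw counts."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

# Hypothetical passage: identical words, different punctuation
as_three_sentences = fk_grade(words=30, syllables=45, sentences=3)
as_one_sentence = fk_grade(words=30, syllables=45, sentences=1)

print(round(as_three_sentences, 2))  # 6.01
print(round(as_one_sentence, 2))     # 13.81
```

Nothing about the vocabulary or the ideas changed, yet the passage jumped nearly eight grade levels.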

One must always remember, though, that the audience is important (if you want to be listened to). Longer sentences, when printed, can be dissected (there's a whole calculus behind this) to make parsing easier. When spoken, though, this can be somewhat more difficult (how does one pronounce “;”?) and error-prone. Shorter sentences can be better. (Short is best.) The use of a polysyllabic vocabulary (big words) is a somewhat easier issue: you want to use words that your audience will understand, regardless of the sentence length. I sometimes use quite big words that only a few hundred people understand; that's because I'm a scientist, and we invent (or re-use) words to capture very complicated but precise ideas that we use all the time. (One such word is “kurtosis”, but more than a few hundred people understand it.)

I don't think the Flesch-Kincaid Grade Level is particularly relevant to gaining any insight into the USA Congress.

However, there's a more fundamental issue with the data: the Flesch-Kincaid Grade Levels were calculated by analyzing the written Congressional record. The Congressional record can be, and very frequently is, edited. Sometimes entire speeches are synthesized and inserted into the record without having ever been spoken. And you know it's not the Members of Congress doing that editing — it's their aides and speechwriters. I don't know what portion of the Congressional Record is “real” and what is edited, but it is definitely contaminated, so applying any sort of analysis to it is questionable from the start.

Finally, as I mentioned, John Boehner's DW1 score is "NA". There are three members whose parties are "NA". Their parties can all be found on the House of Representatives web page with (literally) just a few seconds' work (I searched for “list of house of representatives members”). You can get the DW1 score from voteview.com (I couldn't find Boehner on there, though — maybe he really did break the meter).

I suppose the Sunlight Foundation is doing good stuff (publicizing government activities), but if they're going to release stuff and pretend it's even semi-scientific, they should get it semi-peer reviewed, and they should spend a minute validating their data. I give them a grade of D for their work on these speech grade levels (obvious basic arithmetic errors, failure to validate data, quoting statistical measures without confidence levels or similar measures like R², stating unjustified conclusions).

Regarding the representatives who said that their speech level is not the same as their capability, I agree, in theory. However, let's look a bit closer at this. Rep. Mick Mulvaney (R-SC, Grade for 112th Congress 7.95, DW1 score 0.819), with the distinction of having the lowest speech grade, said

I see it as an affirmation that we're doing something right … You've got to speak clearly and concisely [if you want people to know what you believe] … This is a group of people who are trying to sound like ordinary people and not like politicians.

That's a big part of the problem right there: in order to sound like “ordinary people”, you need to speak (in this case) like an eighth-grader. The “ordinary people” who elected Mulvaney and his fellow freshmen (the so-called “Tea Party”) think at an eighth-grade level (there are people in SC who can think better than that, but they probably didn't vote for Mulvaney). The real world is complicated and cannot be understood clearly and concisely; if your “belief”, or understanding of the world, is concise, then you're naive and wrong. The only sense in which they're “doing something right” is that they are accomplishing what their backers are trying to accomplish, namely, the removal of government regulation (except for the regulations that restrict the rights of women, non-Evangelical Christians, homosexuals, and any others who aren't like them or what they're pretending to be) and the destruction of a functional federal government, creating the next version of the USA: a repressive plutocratic theocracy.

¹ The fact that this incorrect number has been spewed all over the common media says a few things, including: none of the so-called “journalists” bothered to fact-check, and they barely deserve the title “reporter”; these same reporters are stupid enough to treat a blog as a legitimate source of information; and, this is why peer-review is important. A few days ago, someone posted a comment on one of the Sunlight Foundation's blogs, pointing out this fundamental error in arithmetic; however, nobody +1'd it, or voted it up, or whatever stupid shit they do on blogs, so it's buried down below all the comments about “Tea-Party!” and “Stoopid Congress” and a few comments about the vacuity of the more complicated statistical “analysis” the SF wrote about.

² For those who aren't familiar with the USA government, there are three branches: the Executive (president and cabinet, a.k.a. “The White House”), the Congress, and the Judiciary. The Congress consists of two parts, the Senate and the House of Representatives. That seems odd to me, because the Senators should be representing the people, too, but nobody asked me. (Even worse terminologically, the Senate and House are sometimes referred to as the upper and lower houses. I could probably make a joke here about the authors using hemp for something in addition to the paper on which they were writing the constitution.)

³ Some of you might be squawking “Where are your axis labels?!?!?” These being probability distributions of some variable, the horizontal axis is obviously the variable (I've labelled the scale), and the vertical axis is obviously the density. Being a density, the vertical scale is determined by the property that the function has integral equal to one, and that's all you really need to know. (The only relevant information is obtained by comparing areas under the curve.)

⁴ I'm using the USA colour convention for political parties. The civilized world uses red to represent socialism. The USAians use it to represent Republicans, even though it's mostly Republicans who say “Better dead than red.” I think that says something about them, but I'm not sure what.

⁵ Two other commonly employed measures comparing probability density functions are the Kolmogorov-Smirnov statistic (Kuiper variant, D=17%, Q=1%) and the Kullback-Leibler divergence (0.197). These say that, indeed, the density functions are different, but there is substantial overlap.

⁶ The Kuiper variant of the Kolmogorov-Smirnov statistic applied to the middle 45 Senate Democrats (throwing out 3 outliers on either end) yields D=8.5%, Q=99.994% compared to a uniform distribution. If Q=100% then the distributions are identical.

⁷ If you assign to P the values 0 and 1 representing Democrat and Republican, you can compute the correlation coefficient of 95%. (There are five members in the data set whose party is neither Democrat nor Republican, so discarding them doesn't change much.) The actual numbers you use aren't relevant; at most they'll change the sign of the coefficient.

⁸ I verified this empirically. It's easier to explain in code than in words, so, here (Octave code): X = zeros(1,530); Xa = zeros(1,530); for i=1:530, g0 = 1.1998*randn+11.429; g = 1.1998*randn(1,Y(i))+g0; Xa(i) = mean(g); X(i) = g(1); end; Then the correlation coefficient cov(X,Xa)/std(X)/std(Xa) of Xa with X is 0.813±0.014. X and Xa are simulations of G and G_a; 11.429 and 1.1998 are the mean and standard deviation of the G_a. (You won't be able to run this code unless you have the values of Y.) A few variations of this experiment yielded similar results.
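For readers without Octave, here is a rough Python rendering of the same experiment. Because the real years-of-service values aren't available here, it draws hypothetical ones uniformly from 1 to 16. That substitution is an assumption, so the resulting coefficient will not match 0.813 exactly; the point is only that it comes out well below 100% with no change over time built in.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate(years, mean=11.429, sd=1.1998, seed=0):
    """Correlate one year's simulated grade with the simulated career average."""
    rng = random.Random(seed)
    single_year, career_average = [], []
    for y in years:
        level = rng.gauss(mean, sd)                        # member's underlying level
        yearly = [rng.gauss(level, sd) for _ in range(y)]  # noisy yearly grades
        single_year.append(yearly[0])
        career_average.append(sum(yearly) / y)
    return pearson(single_year, career_average)

# Hypothetical years of service (the real Y values are not reproduced here)
year_rng = random.Random(1)
years = [year_rng.randint(1, 16) for _ in range(530)]

r = simulate(years)
print(r)
```

Members with a single year of service contribute a perfect match between the two numbers, while long-serving members pull the coefficient down, so the mix of service lengths determines where the result lands.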

1 comment:

Medina64, 30 May 2012, 00:16:00
I like this - it is a good example of how pseudo-measurements give pseudo-information. I'm just glad it didn't show the Reps having a much higher grade level than the Dems - that is all we would be hearing about. It would be interesting to do this analysis on columnists from both sides - George Fucking Will would probably score 30, and still be full of crap.



2012-05-28
