Sunday, May 23, 2004
In an article on testing and textbooks in the current Journal of American History, Sam Wineburg reports on a history test given to 1,500 students in Texas.
Across the board, results disappointed. Students recognized 1492 but not 1776; they identified Thomas Jefferson but often confused him with Jefferson Davis; they uprooted the Articles of Confederation from the eighteenth century and plunked them down in the Confederacy; and they stared quizzically at 1846, the beginning of the U.S.-Mexico war, unaware of its place in Texas history. Nearly all students recognized Sam Houston as the father of the Texas republic but had him marching triumphantly into Mexico City, not vanquishing Antonio Lopez de Santa Anna at San Jacinto.
The overall score at the elementary level was a dismal 16 percent. In high school after a year of history instruction, students scored a shabby 22 percent, and in college after a third exposure to history, scores barely approached the halfway mark (49 percent). 1
You're probably not terribly surprised by this: it's pretty much standard fodder for the familiar narrative of American educational failure and decline. You might be surprised to hear, however, that this test was administered not in 2004, but in 1917. According to Wineburg, student scores on history tests have remained consistently dismal across the course of the century. You could argue that the tests used to cover more difficult and sophisticated material, but the examples of student ignorance that Wineburg unearths suggest otherwise. Actually, if you take into account the vastly expanded pool of children who now attend school, it seems pretty likely that kids these days know more about history than their counterparts did almost ninety years ago.
It's fun to debunk the familiar degeneration theory of American education, but actually the interesting bit of the article is about the "science" of test writing, and the ways in which that "science" predetermines the results of tests. It raises the possibility that reliance on statistical methods actually distorts results. The most glaring example of this bias is the assumption that a good question is one that replicates the outcome of the entire test. So the students who score best on the test, as a whole, should score best on each particular question. If that's not the case, then the question should be thrown out. The problem with this is that it penalizes students whose historical knowledge differs from the majority student population's. Wineburg gives a hypothetical example:
Imagine an item about the Crisis magazine, which W.E.B. DuBois edited, that is answered correctly at a higher rate by black students than by whites, while overall white students outscore blacks on the test by thirty points. The resulting correlation for the DuBois item would be zero to negative, and its chances of survival would be slim—irrespective of whether historians thought the information was essential to test.2
Which is to say, if the item were retained, it would show that white students didn't know very much about African-American history. But in practice, the question would be scrapped, and instead African-American students' historical knowledge would be slighted.
Wineburg also suggests that test-writers assume that if the overwhelming majority of students get a question either right or wrong, there's something wrong with the question. It fails to differentiate between those students who have mastered the material and those who haven't. But in fact, it could be that schools are doing either a great or a lousy job teaching that particular topic.
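A tiny sketch may make these two screening rules concrete. Everything below is invented for illustration (the data are not from Wineburg's article); the item-total "discrimination" correlation and the difficulty index `p` are the standard measures used in classical item analysis. An item that low scorers tend to get right comes out with a negative correlation, and an item everyone answers correctly carries no statistical signal at all, so both would be candidates for removal no matter how essential historians thought the content was.

```python
# Hypothetical item-analysis sketch (invented data, not Wineburg's).
# Rows are students, columns are test items; 1 = correct, 0 = wrong.
responses = [
    [1, 1, 1, 1, 0, 1],  # high scorers miss item 4...
    [1, 1, 1, 0, 0, 1],
    [1, 1, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1],  # ...which low scorers tend to get right
    [0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 1],  # item 5: everyone answers correctly
]

def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either variable is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy) if sx and sy else 0.0

totals = [sum(row) for row in responses]
for i in range(len(responses[0])):
    item = [row[i] for row in responses]
    p = sum(item) / len(item)   # difficulty: share of students answering correctly
    r = pearson(item, totals)   # item-total ("discrimination") correlation
    print(f"item {i}: p = {p:.2f}, r = {r:+.2f}")
```

Running this, item 4 (the one the low scorers get right) shows a negative correlation and item 5 shows p = 1.00 with zero correlation; under the screening rules described above, both would likely be thrown out.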
Now, I'm going to be honest and admit that I don't know nearly enough about statistics to evaluate whether Wineburg's criticisms make sense. But they raise some interesting questions which I haven't seen raised in the whole debate about testing and standards. And it may be that the most notable thing about the article is that the JAH, the scholarly journal of the Organization of American Historians, published it at all. Historians based in universities have not paid nearly enough attention to what's going on in K-12 education. If this is evidence of a change in attitude, that can only be a good thing.