Evaluating “Value-Added” Measurement of Teacher Effectiveness: Not Just a Houston Problem

A public forum sponsored by the Houston Federation of Teachers last week shed some needed light on the frailty of the value-added methodology in use in Houston ISD. But many issues raised at the forum pertain to all value-added methodologies currently in use to measure teachers’ effects on student learning, not just to the Education Value-Added Assessment System (EVAAS) adopted in Houston.

More than 200 attendees, including teachers, parents, and other members of the Houston ISD community, heard a detailed critique of the EVAAS model from Dr. Audrey Beardsley of Arizona State University. Beardsley noted the intuitive appeal of measuring the educational gains of students who spend time in a teacher’s classroom, in contrast to the federal “adequate yearly progress” model, which measures outcomes for different cohorts of students from one year to the next without regard to where those students started academically.

The EVAAS model, developed by agricultural statistician William Sanders, purports to provide an accurate measure of how much of students’ year-to-year academic growth can be attributed to their individual teachers. The claims made for this model have led to its purchase and adoption in several states and localities, not just in Houston. However, Beardsley said, the accuracy of Sanders’ model cannot be independently verified by educational researchers because Sanders refuses to reveal the crucial inner workings of his statistical techniques. Sanders insists that this information is a business secret, proprietary information that he cannot disclose without jeopardizing his company’s profits. But the secrecy surrounding his techniques means that the system Houston ISD has embraced for teacher evaluation has not been subjected to scientific peer review. Moreover, Beardsley noted, her review of what is publicly known about Sanders’ work has turned up significant technical errors.

The Sanders methodology, for all its secrecy, also shares the well-documented drawbacks common to all current value-added models. As one study after another has shown, these models don’t clearly isolate the effects of teaching on student performance from the effects of other variables both in and out of school, including students’ backgrounds, school supports, and summer learning loss. They share another fundamental flaw as well: they falsely assume that students are assigned randomly to teachers’ classrooms. As a result of all these flaws, value-added scores tend to fluctuate dramatically from one year to the next, often doing the greatest injustice to teachers who teach students with the greatest needs.

AFT President Randi Weingarten also addressed the Houston forum, moving beyond the critique of EVAAS to offer some sensible suggestions for real improvements in teacher evaluation. We need a new model of teacher evaluation to replace superficial, “drive-by” evaluation, she said, but it needs to be part of a systemic approach to improving teacher development, that is, to improving teaching effectiveness. She used a sports analogy, noting that professional athletes spend far more “practice time” than “game time,” while for teachers the ratio is reversed. The goal should be continuous improvement for every teacher, Weingarten said. Continuing the sports analogy, she said teachers need time to review the “game tape” to figure out what is working and what is not. And the gauge of what is working cannot just be scores on standardized tests, if we want to get a true, rounded picture of teaching effectiveness. To drive home the point, she quoted one of AFT’s earliest members, Albert Einstein: “Not everything that counts can be counted, and not everything that is counted counts.”

Citing successful collaborations between local AFT affiliates, parent and community representatives, and district administrators on improved systems of teacher evaluation and training around the nation, Weingarten urged similar collaboration in Houston. The lively question-and-answer session that ensued suggested that she had struck a chord with her audience.