Another Independent Study Casts Doubt on Value-Added Models of Teacher Evaluation–Specifically Including the Houston ISD Version

Policy-makers may be drawn to the simplicity of reducing teacher evaluation to a “value-added” score based on achievement tests, but they are neglecting an expanding body of educational research that shows this seeming simplicity comes at the expense of accuracy.

Educational historian Diane Ravitch, in her latest “Bridging Differences” blog entry, cites a new study from the Annenberg Institute for School Reform at Brown University that finds value-added assessments of teacher effectiveness are at best a “crude indicator” of teachers’ contributions to students’ academic outcomes. Author Sean Corcoran, a professor of educational economics at New York University, concludes: “The promise that value-added systems can provide a precise, meaningful and comprehensive picture is much overblown. Teachers, policy-makers and school leaders should not be seduced by the elegant simplicity of value-added measures. Policy-makers, in particular, should be fully aware of their limitations and consider whether their minimal benefits outweigh their cost.”

Corcoran notes that, in theory, value-added models attempt to define a teacher’s unique contribution to students’ achievement that cannot be attributed to any other current or past student, family, teacher, school, peer, or community influence. However, Corcoran points out, in practice it is exceptionally difficult to isolate a teacher’s unique effect on academic achievement.

“The successful use of value-added requires a high level of confidence in the attribution of achievement gains to specific teachers,” he says. “Given one year of test scores, it’s impossible to distinguish between the teacher’s effect and other classroom-specific factors. Over many years, the effects of other factors average out, making it easier to infer a teacher’s impact. But this is little comfort to a teacher or school leader searching for actionable information today.”
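Corcoran’s point about averaging over many years can be illustrated with a small simulation. The sketch below is not taken from the report: it assumes a hypothetical teacher whose true effect is fixed, with each year’s estimate thrown off by independent, normally distributed classroom-specific noise. The function names, the effect size of 0.2, and the noise level of 1.0 are all made up for illustration.

```python
import math
import random
import statistics


def standard_error(noise_sd, years):
    """Standard error of a multi-year average, assuming independent yearly noise."""
    return noise_sd / math.sqrt(years)


def simulate_estimates(true_effect, noise_sd, years, trials=2000, seed=0):
    """Simulate many multi-year average estimates for one hypothetical teacher."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        # Each year's estimate is the true effect plus classroom-specific noise.
        yearly = [true_effect + rng.gauss(0.0, noise_sd) for _ in range(years)]
        estimates.append(statistics.fmean(yearly))
    return estimates


if __name__ == "__main__":
    # Hypothetical values: true effect 0.2, yearly noise standard deviation 1.0.
    for years in (1, 3, 9):
        spread = statistics.pstdev(simulate_estimates(0.2, 1.0, years))
        print(years, round(spread, 2), round(standard_error(1.0, years), 2))
```

Under these assumptions the spread of the estimate shrinks only with the square root of the number of years, so nine years of data are needed to cut the one-year noise to a third.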

Corcoran’s research fortifies an already-strong academic consensus, described as follows in an October 2009 statement of the National Research Council of the National Academy of Sciences on the proposed use of assessment systems that link student achievement to teachers in the federal Race to the Top initiative: “too little research has been done on these methods’ validity to base high-stakes decisions about teachers on them.”

Corcoran analyzes in particular the value-added systems used in New York City’s Teacher Data Reports and in Houston ISD’s ASPIRE program (Accelerating Student Progress, Increasing Results and Expectations). He concludes that the standardized tests used to support these systems are inappropriate for value-added measurement.

“Value-added assessment works best when students are able to receive a single numeric test score every year on a continuous developmental scale,” states Corcoran, meaning that the scale does not depend on grade-specific content but rather progresses across grade levels. Neither the Texas nor New York state test was designed on such a scale. Moreover, the set of skills and subjects that can be adequately assessed in this way is remarkably small, he argues, suggesting that value-added systems will ignore much of the work teachers do.

“Not all subjects are or can be tested, and even within tested subject areas, only certain skills readily conform to standardized testing,” he says. “Despite that, value-added measures depend exclusively on such tests. State tests are often predictable in both content and format, and value-added rankings will tend to reward those who take the time to master the predictability of the test.”

But the biggest problem with using value-added assessments to evaluate teachers is their high level of imprecision, he argues. “A teacher ranked in the 43rd percentile on New York City’s Teacher Data Report may have a range of possible rankings from the 15th to the 71st percentile after taking statistical uncertainty into account,” says Corcoran. He finds that the majority of teachers in New York City’s Teacher Data Reports cannot be statistically distinguished from 60 percent or more of the other teachers in the district.

“With this level of uncertainty,” Corcoran adds, “one cannot differentiate between below average, average, and above average teachers with confidence. At the end of the day, it isn’t clear what teachers and their principals are supposed to do with this information.”
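To see how a margin of error on a score turns into a wide band of possible rankings, consider the rough sketch below. It does not reproduce the New York City calculation: it assumes teacher effects across a district follow a standard normal distribution, and the margin of 0.8 is a made-up value chosen only to show the mechanism.

```python
from statistics import NormalDist

# Assumption: teacher effects in the district follow a standard normal distribution.
DISTRICT = NormalDist(mu=0.0, sigma=1.0)


def percentile_range(point_percentile, margin):
    """Return the (low, high) percentile band implied by a +/- margin on the score."""
    # Convert the reported percentile back to a score on the assumed scale.
    score = DISTRICT.inv_cdf(point_percentile)
    low = DISTRICT.cdf(score - margin)
    high = DISTRICT.cdf(score + margin)
    return low, high


if __name__ == "__main__":
    # Hypothetical margin of 0.8 around a teacher reported at the 43rd percentile.
    low, high = percentile_range(0.43, margin=0.8)
    print(f"{low:.0%} to {high:.0%}")
```

Because most teachers sit close together near the middle of the distribution, even a moderate margin on the underlying score sweeps across a large share of the rankings.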

The entire report is available from the Annenberg Institute for School Reform.