## Value-Added Measures in Education

*27 Feb 2017 16:30*

This is an idea which is simple in principle but which rests on some very dubious premises: basically, that every teacher adds (or subtracts) some expected amount to every one of their students' standardized test scores, so that one can figure out who the good teachers are by appropriate averaging.

The core math isn't so hard, so I'll sketch it. Start by supposing that
everything which matters about what students in school learn is captured by
standardized test scores. If we give them the same test at the beginning and
end of a school year (or end of each year, etc.), we can then measure how much
they've learned, or forgotten, by the change in their score. Call the change
for student $ i $, $ Y_i $. Inspired by ideas
from regression, we can hope to write this as a
function of various other variables plus noise:
\[
Y_i = a+m(C_i) + \epsilon_i
\]
where $a$ is the average gain over all students, the $ C_i $ are the
"covariates" associated with student $i$, and the function $m$ is supposed to
be the same across students. (*Additive* noise $ \epsilon_i $ is an
additional assumption at this stage.) We expect that some of those covariates
are attributes of the students, like their demographic characteristics,
previous test scores, etc., call them collectively $ X_i $; but we might also
think that some teachers help their students learn more than others. Say that
student $i$ is taught by teacher $t(i)$, and that teachers have multiple
students. People then make the assumption that
\[
m(C_i) = f(X_i) + V_{t(i)}
\]
where $ V_j $ is the "value" of teacher $j$, which we assume they add or
subtract to each of their students. (If we make this assumption, we can
also assume that the $ V_j $ average out to 0, since if they averaged out
to anything else we can just declare that part of the global mean $a$,
without changing anything observable.)
*If* you assume this sort of adding up, so that the final model is
\[
Y_i = f(X_i) + V_{t(i)} + \epsilon_i
\]
*and* you assume that there is no
correlation between the $ V_j $ and the attributes of the students $ X_i $, then
you can estimate the $V$'s. Here is one way to do it:

- Estimate $ f(X) $ by grouping together all the students with the same value of the attributes $ X $ and averaging their $ Y $'s (under the no-correlation assumption, the $ V_{t(i)} $ terms just act like extra noise)
- Calculate $ Y - f(X) $ for all students
- Average $ Y - f(X) $ for all students of teacher $ j $ to get $ V_j $.
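The three steps above can be sketched in code. Everything here is invented for illustration — a simulation where the additive model holds exactly, the covariate $X$ is discrete, and teachers are assigned at random (so the no-correlation assumption is satisfied by construction):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Hypothetical simulated data obeying Y = f(X) + V_{t(i)} + noise,
# with a discrete covariate X and *random* teacher assignment, so the
# V's are uncorrelated with X, as the estimation scheme requires.
n_students, n_teachers = 5000, 50
X = rng.integers(0, 10, size=n_students)               # covariate groups
teacher = rng.integers(0, n_teachers, size=n_students) # random assignment
V_true = rng.normal(0, 2, size=n_teachers)
V_true -= V_true.mean()                                # teacher effects average to zero
Y = 3.0 * X + V_true[teacher] + rng.normal(0, 5, size=n_students)

df = pd.DataFrame({"X": X, "teacher": teacher, "Y": Y})

# Step 1: estimate f(X) by averaging Y within each covariate group.
f_hat = df.groupby("X")["Y"].transform("mean")
# Step 2: compute the residual Y - f_hat(X) for every student.
resid = df["Y"] - f_hat
# Step 3: average the residuals over each teacher's students.
V_hat = resid.groupby(df["teacher"]).mean()

# With random assignment, the estimates track the true teacher effects.
print(np.corrcoef(V_hat, V_true)[0, 1])
```

The correlation at the end comes out high precisely because the simulation builds in everything the method assumes; the point of the exercise is to make those assumptions visible, not to vouch for them.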

In practice, people use more elaborate estimation schemes than this (for a mix of *reasons*), or try to include not just teacher-effect terms but ones for the school, etc.

I have nothing, in principle, against such hierarchical statistical models, but they rest on a whole pile of assumptions, and, to the extent that they're wrong, the results range from the meaningless to the actively misleading. (Even allowing all of the additivity assumptions, etc., just suppose that the best teachers got assigned to the hardest students.) Before relying on them for public policy, hiring and firing decisions, etc., I'd very much want to see strong evidence that the assumptions held, or came close to holding, and that (e.g.) the estimated value of a student's teacher this year didn't predict their test scores in the past. This sort of model-checking seems to be conspicuously lacking in the literature, and so my not-quite-gut (chest?) reaction to these methods is that they are more an abuse of statistical reason than an application, but I really, really ought to read much more before having a firm opinion.

See also: Causal Inference; Education and Academia; Social Science Methodology

- Recommended (inadequate placeholder until I dig through offline notebooks):
  - John Ewing, "Mathematical Intimidation: Driven by the Data", *Notices of the American Mathematical Society* **58** (2011): 665--673
  - Douglas N. Harris, "Value-Added Measures and the Future of Educational Accountability", *Science* **333** (2011): 826--827
  - Jesse Rothstein, "Revisiting the Impact of Teachers" [PDF preprint]
  - Gary Rubinstein, "Analyzing Released NYC Value-Added Data" (2012), parts 1, 2, 3, 4, 5 and 6

- To read [needs more pro-VA references]:
  - Eva L. Baker, Paul E. Barton, Linda Darling-Hammond, Edward Haertel, Helen F. Ladd, Robert L. Linn, Diane Ravitch, Richard Rothstein, Richard J. Shavelson, and Lorrie A. Shepard, "Problems with the Use of Student Test Scores to Evaluate Teachers", Economic Policy Institute briefing paper 278 (2010) [PDF preprint]
  - Douglas N. Harris, *Value-Added Measures in Education: What Every Educator Needs to Know*