Alone among all professionals, teachers are not subjected to any objective and reliable measure of performance, direct or indirect. In every other sector, there are at least some standard measures of evaluation, though it is a different matter that they are rarely used in any meaningful manner.
Surely, any system where people wake up to suddenly discover that a significant number of people who have been exposed to it for nine years are unable to pass the most basic of examinations and are, to be charitable, semi-literate, or a major share of its products are downright unemployable, deserves to be completely revamped.
In this context, Freakonomics draws attention to an analysis by the L.A. Times and the Rand Corp. of seven years of state standardized test-score data in Math and English from 6,000 state teachers in Los Angeles, which finds that teacher effectiveness has three times more influence on student performance than the school a student attends.
The study using longitudinal student-level achievement data found that greater variations existed in the quality of teachers within each school than between schools in affluent and poorer neighborhoods. It found that highly effective teachers, the ones who consistently and dramatically raise their students' scores, are fairly evenly distributed among schools and across different levels of experience and education. Strikingly, it found that after a single year with teachers who ranked in the top 10% in effectiveness, students scored an average of 17 percentile points higher in English and 25 points higher in math than students whose teachers ranked in the bottom 10%.
Put differently, though parents obsess over picking the right school for their child, it matters far more which teacher the child gets. Yet parents have no access to objective information about individual instructors, and they often have little say in which teacher their child gets. Further, contrary to widespread belief, many of the factors commonly assumed to improve teachers' effectiveness - experience, education and training - had little bearing on students' performance. Most interestingly, the students' race, wealth, English proficiency or previous achievement level played little role in whether their teacher was effective.
In fact, the study also finds that the most common distinguishing characteristics of effective teachers were a tendency to be strict, the maintenance of high standards, the encouragement of critical thinking, and the ability to engage their students. See an FAQ on the methodology adopted here.
The graphic compares the contrasting performances (in terms of raising the percentage of students able to do a specified level of Math and reading) of two teachers teaching the same lessons to two different fifth-grade classes at the same school.
Another graphic demonstrates how the value a teacher adds or subtracts during the period is the difference between a student's actual performance and expected growth, where each student's past test performance is used to project future performance. Because the projection is based on past performance, no teacher is hampered by the presence of low-performing students. The value-added measure compares students to themselves in previous years, rather than to other students with different backgrounds. For these reasons, the methodology can be used for longitudinal tracking of both students and teacher value-addition.
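To make the idea concrete, here is a minimal sketch in Python of a value-added calculation. It is only an illustration and not the model used in the L.A. Times/Rand analysis: it assumes a single prior-year score per student, a simple straight-line projection fitted across all students, and made-up scores for two hypothetical teachers. The names `fit_line` and `teacher_value_added` are invented for this example.

```python
from collections import defaultdict
from statistics import mean


def fit_line(xs, ys):
    """Least-squares fit of y = a + b * x (assumed projection model)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return a, b


def teacher_value_added(records):
    """records: list of (teacher, prior_score, current_score) tuples.

    Returns each teacher's average gap between the actual score and the
    score projected from the student's own prior performance.
    """
    priors = [prior for _, prior, _ in records]
    currents = [current for _, _, current in records]
    a, b = fit_line(priors, currents)

    gaps = defaultdict(list)
    for teacher, prior, current in records:
        expected = a + b * prior          # projection from past performance
        gaps[teacher].append(current - expected)
    return {teacher: mean(values) for teacher, values in gaps.items()}


# Hypothetical data: both classes start from the same prior scores.
records = [
    ("Teacher A", 40, 55), ("Teacher A", 60, 72), ("Teacher A", 80, 90),
    ("Teacher B", 40, 38), ("Teacher B", 60, 57), ("Teacher B", 80, 78),
]
scores = teacher_value_added(records)
for teacher, value in sorted(scores.items()):
    print(f"{teacher}: {value:+.2f}")
```

In this toy data, Teacher A's students finish above their projected scores and Teacher B's finish below, so the first gets a positive value-added figure and the second a negative one, even though both classes began at the same level. A real analysis would use several years of data per student and additional statistical controls.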
The study finds that many important teacher qualifications have little effect on student outcomes and "more experienced or better educated teachers are no more effective in the classroom than inexperienced teachers with only undergraduate diplomas".
It draws attention to the need to "focus on measuring teacher skills and preparation that predict subsequent teacher performance in the classroom". The authors write,
"Districts could consider developing policies that place importance on output measures of teacher performance. Current policies emphasize teacher qualifications that are inputs to student learning. These inputs are costly to produce and sustain in terms of hiring and salary costs, but they have little consequence on student achievement outcomes. A better approach would be to incorporate value-added measures of teacher effectiveness into teacher assessments. Teachers and administrators should have access to value-added measures of teaching effectiveness.
These measures would provide useful feedback for teachers on their performance and for administrators in comparing teacher effectiveness. Merit pay systems would realign teaching incentives by directly linking teacher pay with classroom performance. Merit pay is 'results oriented' in the sense that compensation focuses on the production of specific student outcomes. The challenge for designing a merit pay system for teachers is in defining an appropriate composite of student learning (output) and in measuring teacher performance in producing learning...
We find that teachers with better nominal teaching tools (e.g., experience, education, licensure scores) perform no better than teachers with weaker qualifications, but the current system provides little reward for better classroom performance. Perhaps teachers with extra teaching skills have too little incentive to fully utilize those skills in a compensation system that rewards their measured inputs and ignores their outputs. By realigning the incentive system and rewarding student achievement gains, we might find a different ordering of teacher effectiveness and improved overall levels of student learning."
Here are a few observations:
1. The critical challenge will be in the administration of the standardized tests. How do we manage the logistics of standardized examinations, given the wide geographical spread and massive numbers of students being tested? How do we ensure the purity of both the examination invigilation and paper valuation? In other words, how do we ensure the administration of the massive exercise of standardized tests without compromising on the purity of its results?
One way would be to outsource the process itself. However, its cost and, more importantly, the perception and resultant salience of an externally outsourced assessment process will amplify opposition from the unions. Administering it through internal arrangements, for example by shuffling teachers across schools, will also raise formidable administrative and supervisory challenges. However, in the initial stages, the internal route appears to stand the best chance of success.
2. A perception that such value-addition analysis would be used to assess teachers will naturally raise political opposition from the unions. It may therefore be necessary to completely de-link its use from high stakes decisions like punishing teachers.
In fact, mere disclosure of teacher-wise value-addition for each student and for the entire class will go a long way towards improving performance outcomes. Appropriately designed student report-cards aimed at parents, teacher report-cards aimed at administrators, and school report-cards intended for the community at large can play an important role in getting all stakeholders to respond in a manner that will nudge teachers to improve their performance.
3. In order to win acceptance among teachers, such value-addition analysis should be presented as, say, "teacher enhancement feedback programs". Analysis of classroom data and student learning outcomes can be used to deduce specific skill-deficiencies of teachers. This can in turn be used to objectively design training programs and impart focused training to teachers based on their respective deficiencies.
4. The biggest source of last-mile challenge will be in ensuring that the data collected, analyzed and presented is acted upon. It is commonplace in government to have massive data being collected and not being utilized in any meaningful manner. And the sheer volume of longitudinal data collected only increases the probability of policy-making getting buried in the small detail of numbers.
As mentioned above, this last-mile problem can be overcome with effectively designed and institutionalized policies that use the data to simultaneously inform parents about their children's performance, administrators about the respective value-addition (and value-subtraction) of teachers and the performance of schools, and teachers about where they and their students are lagging behind.
This information, disseminated in the most cognitively effective manner (well-designed report cards) through platforms like school management committees, and utilized to design training programs for teachers and remedial classes for students, can go a long way towards improving the quality of our education system.
Update 1 (7/9/2010)
See this collection of LA Times stories on the teacher value-addition study. And this, this, and this from NY Times.
See this website of SAS EVAAS, the most comprehensive reporting package of value-added metrics available in the educational market, which provides valuable diagnostic information about past practices and reports on students’ predicted success probabilities at numerous academic milestones.
Update 2 (23/10/2011)
A study of New York City schools by Jonah E. Rockoff and Cecilia Speroni, which explored the power of objective (student achievement data) and subjective (evaluations from both applicant interviews for a certification program and mentors who worked with teachers in their first year) measures of teacher evaluation, finds considerable merit in the latter. They write,
"We find evidence that teachers who receive better subjective evaluations of teaching ability prior to hire or in their first year of teaching also produce greater gains in achievement, on average, with their future students. Consistent with prior research, our results support the idea that teachers who produce greater achievement gains in the first year of their careers also produce greater gains, on average, in future years with different students. More importantly, subjective evaluations present significant and meaningful information about a teacher’s future success in raising student achievement even conditional on objective data on first year performance. This is an especially noteworthy finding, considering that variation in subjective evaluations likely also captures facets of teaching skill that may affect outcomes not captured by standardized tests."
As Freakonomics writes, "Among the many knocks on the new push for objective evaluation measures is that they fail to capture the nuances of teaching, which the authors believe traditional subjective methods do much better."