Are Student Teaching Evaluations Reliable?

dowdenr | October 11, 2018

By Dr. Hassan Shirvani

Opinions vary widely on the worth of student evaluations of their teachers. To some, they are useful tools for providing faculty with timely student feedback, for assisting students with their course selections, and for helping administrators with their personnel decisions. To others, they are unreliable and biased instruments that elicit teaching evaluations from students who are too immature, inexperienced, or capricious to pass meaningful judgment on the quality of their teachers. Critics are also concerned that such evaluations can turn into popularity contests among faculty, giving rise to grade inflation and deteriorating academic standards. Finally, some detractors assert that student views of their teachers are often tainted by a host of ethnic, gender, racial, and religious biases. While there are legitimate grounds for the views expressed on both sides of this debate, it is a safe bet that teachers with positive evaluations generally support them, while those with less favorable reviews often find them wanting.

The Evidence

Even though there is little hope of ever settling these opposing views empirically, there is some evidence that, if anything, students are remarkably accurate and consistent in their teaching evaluations. For example, student ratings of the same teacher in the same semester across different courses, or different sections of the same course, correlate quite highly: 0.95 for a class size of about 50 students. Such a high correlation indicates that an overwhelming majority of students share essentially the same view of their teachers, for better or for worse. Similarly, the ratings of the same teacher correlate quite highly over time, at 0.82, again implying great consistency in teacher evaluations by successive classes. Finally, a high positive correlation, 0.77, is also reported between the results of current student surveys and student surveys conducted 10 to 20 years after graduation. That is, students generally do not significantly revise their ratings of their teachers with greater age, experience, or maturity.

On the issue of popularity contests among teachers, the available evidence again offers grounds for reassurance. For example, the average correlation between student evaluation scores and students' “expected grades” has consistently been found to be quite low, about 0.20. Thus, even if students do tend to appreciate “easier course requirements” and “more lenient grading policies,” they seem to be mature enough not to mistake these ingratiating gestures for good teaching. In addition, this positive correlation, small as it is, may simply indicate that students genuinely expect to do better (worse) in classes they like more (less).

Finally, regardless of how students feel about their teachers, there is also significant empirical evidence of a relatively high positive correlation, 0.50, between student teaching evaluation scores and student learning outcomes, as measured by student grades in a host of common tests. That is, the evidence suggests that students generally learn more from the teachers they tend to rate higher. There is also substantial evidence that, whatever our qualms about their accuracy, no other measure of teaching effectiveness presently correlates as highly with subsequent student learning performance as student teaching evaluations do. In other words, while student teaching evaluations may be imperfect, they are still the best available predictors of how well students will subsequently do academically.
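One common way to put the correlations quoted above on a comparable footing is to square them: for a simple linear relationship, the squared correlation gives the share of variance that two measures have in common. The short sketch below applies that rule of thumb to the figures already cited. It is an illustrative reading aid only; the labels and the `shared_variance` helper are shorthand introduced here, and the squared values are not figures reported in the studies themselves.

```python
# Correlations quoted in the text above. The labels are shorthand added
# for this sketch and do not come from the cited studies.
reported = {
    "same teacher, different courses or sections": 0.95,
    "same teacher over time": 0.82,
    "current vs. post-graduation ratings": 0.77,
    "ratings vs. expected grades": 0.20,
    "ratings vs. learning outcomes": 0.50,
}


def shared_variance(r: float) -> float:
    """Return r squared, the share of variance two measures have in common
    under a simple linear relationship (hypothetical helper)."""
    return r * r


for label, r in reported.items():
    print(f"{label}: r = {r:.2f}, r^2 = {shared_variance(r):.2f}")
```

Read this way, a correlation of 0.20 between ratings and expected grades corresponds to a very small shared share of variance, while the consistency correlations correspond to large ones.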

Peer Reviews?

There are, of course, those who recommend peer reviews as a better source of informed judgment about the quality of teaching. According to this view, those with many years of teaching and scholarship are far better positioned to determine whether their colleagues have the necessary depth and breadth of knowledge, or the requisite class management skills, to be effective teachers. There is certainly much to be said for such a view. However, universities that have used this approach often report high correlations, 0.65, between their peer and student evaluations, indicating that using both approaches may prove redundant. In addition, peer reviews are often complicated by lack of familiarity with out-of-field subjects, faculty biases towards each other, and inter-departmental rivalries. In such situations, it probably makes more sense for department chairs to review instruction to ensure sufficiency in areas that students may be less able to evaluate, such as course content and levels of coverage.

Conclusion

As the foregoing makes clear, student teaching evaluations, despite all their shortcomings, are still the best available tools for assessing teaching effectiveness. If used properly as part of a more comprehensive assessment mechanism, they can be quite informative and useful. Refusing to offer these evaluations, or even constantly denigrating their relevance and worth, will simply send the unintended message to students and their parents that colleges have little interest in documenting the quality of their instruction through the testimony of those who are perhaps among the best placed to pass judgment on this issue. Indeed, cognizant of this fact, many prestigious colleges routinely provide their students with easy access to all previous teaching evaluations of their faculty. Such evaluations are even available for tenured faculty with sterling scholarly reputations, on the grounds that good scholarship may not necessarily beget good teaching. These colleges are therefore sending the clear message that they consider regular feedback from their students to be a vital part of their continued search for teaching excellence.

Hassan Shirvani, Ph.D.
Professor, Cullen Foundation Chair in Economics


References:

Barre, E. “Research on Student Ratings Continues to Evolve. We Should, Too.” Rice University, Center for Teaching Excellence. (2018).

Hattie, J. and H. W. Marsh. “The Relationship between Research and Teaching: A Meta-Analysis.” (1996).

Marsh, H. W. “Students’ Evaluations of University Teaching: Research Findings, Methodological Issues, and Directions for Future Research.” (1987).
