Are There Gender Differences in Quantitative Student Evaluations of Instructors?

Zipser, Nina; Mincieli, Lisa; Kurochkin, Dmitry

doi:10.1007/s11162-021-09628-w

Published: 05 March 2021

Research in Higher Education, Volume 62, pages 976–997 (2021)


Abstract

Recent research conducted at numerous universities has found evidence of instructor-gender differences in student evaluations of teaching (SET). This paper examines whether such gender effects exist in “instructor overall” ratings within a database of SET that includes almost 600,000 observations from the past 11 years for the Faculty of Arts and Sciences (FAS) at a large research university in the northeastern United States. First, using multivariate OLS regression analysis, we tested 32 hypotheses of gender differences within discipline-rank combinations. Of the 32, only two hypothesis tests showed statistically significant gender differences in the instructor overall rating; one discipline-rank combination had higher average scores for male instructors, and one discipline-rank combination had higher average scores for female instructors. Second, using quasi-experimental data from calculus courses, we found that mean instructor overall scores of female instructors were different from those of male instructors only for Teaching Assistants (TAs) and Teaching Fellows (TFs), with higher scores for female TAs and TFs. Overall, we find no evidence of systematic gender differences in our analysis.


Notes

The instructor overall rating asks students to “Evaluate your instructor overall,” with a scale of 1 = unsatisfactory, 2 = fair, 3 = good, 4 = very good, 5 = excellent.
For a summary table of recent studies examining gender bias in teaching evaluations, please see Table 4 in Appendix 1.
Due to confidentiality concerns, the data used for the observational and quasi-experimental analyses are not publicly available. However, aggregated data and code are available upon request.
Non-ladder/other instructors include instructors on term appointments who are not on a tenure track. This category also includes visiting faculty, staff teaching appointments, TA/TFs acting as instructors, etc. Non-FAS instructors are instructors who have a teaching appointment (either permanent or temporary) at the institution, but not within the Faculty of Arts and Sciences.
For a definition of each independent variable included in our model, please see Table 5 in Appendix 2.
In Table 6 in Appendix 3, we report all of the interactive model’s estimated coefficients. We also run a version of the model with age and age-squared interacted with gender. The results of this model are similar to our final model, and are available upon request.
See Table 6 in Appendix 3 for the OLS model.
Equivalently, one can apply the Bonferroni correction, which gives an individual-test significance level of 5%/32 ≈ 0.16% (assuming a 5% family-wise error rate threshold). Under this correction, the differences between the average scores of male and female instructors are non-significant for all disciplines and rank groups.
We excluded from consideration Postdoctoral Fellows, Instructors with appointments in the Division of Continuing Education, Temporary Student instructors, and Visiting Professors.
See Eqs. (2), (3), (5), and (6) in Appendix 4 for details.
In the latter case, we used instructor-invariant course fixed effects specific to a course offering and the semester in which the course was taught (e.g., fall semester Multivariable Calculus). The fixed effects allowed us to account for possible unobserved course and semester heterogeneity.
We would like to emphasize that in all cases, representative age is equal for males and females.
Results for Professors followed the same pattern, with the expected female SET score being higher than for a male Professor of the same age. The confidence intervals, however, are extremely wide (because there was only one female Professor in the sample) and are not displayed in Fig. 4. The difference for Professors was not statistically significant.
There was also a difference in sign for the interaction of gender with Professor (as we mention in the previous footnote, there is only one female professor in the sample).
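
The Bonferroni note above can be checked with a few lines of arithmetic. The following is a minimal sketch, not the authors' code: the counts (32 tests, 5% family-wise threshold) come from the text, while the function name `bonferroni_significant` and the example p-values are hypothetical and used only for illustration.

```python
# Family-wise error control for the 32 discipline-rank tests (illustrative).
N_TESTS = 32          # number of hypothesis tests, from the abstract
FAMILY_ALPHA = 0.05   # family-wise error rate threshold, from the note

# Bonferroni: each individual test is held to alpha / m.
per_test_alpha = FAMILY_ALPHA / N_TESTS

# If every test were instead run at 5% and all null hypotheses were true,
# this many "significant" results would be expected by chance alone.
expected_false_positives = N_TESTS * FAMILY_ALPHA


def bonferroni_significant(p_values, family_alpha=FAMILY_ALPHA):
    """Return True for each p-value below family_alpha / len(p_values)."""
    threshold = family_alpha / len(p_values)
    return [p < threshold for p in p_values]


# Hypothetical p-values, NOT taken from the paper.
example_flags = bonferroni_significant([0.001, 0.04])

print(f"per-test level: {per_test_alpha:.4%}")
print(f"expected false positives at 5%: {expected_false_positives:.1f}")
print(f"example flags: {example_flags}")
```

Under these assumptions, about 1.6 of 32 tests would be expected to cross the uncorrected 5% level by chance, which is of the same order as the two significant results reported in the abstract.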


Author information

Authors and Affiliations

Faculty of Arts and Sciences, Harvard University, University Hall 1 South, Cambridge, MA, 02138, USA
Nina Zipser, Lisa Mincieli & Dmitry Kurochkin


Corresponding author

Correspondence to Nina Zipser.


Appendices

Appendix 1

See Table 4

Table 4 Results of recent literature examining gender bias in SET scores


Appendix 2

See Table 5

Table 5 Definitions of predictor variables


Appendix 3

See Table 6

Table 6 Multivariate regression results, cross-divisional analysis


Appendix 4 Regression Models for Calculus Instructors’ SET Scores

We test for gender effects using the following ordinary least squares (OLS) multivariate models:

$$SET_{ict} = \theta \cdot {\text{female}}_{i} + \beta_{0} + \varepsilon_{ict} ,$$

(1)

$$SET_{ict} = \theta \cdot {\text{female}}_{i} + \beta_{0} + \beta_{1} \cdot {\text{rank}}_{it} + \beta_{2} \cdot {\text{age}}_{it} + \beta_{3} \cdot {\text{age}}_{it}^{2} + \varepsilon_{ict} ,$$

(2)

$$SET_{ict} = \theta \cdot {\text{female}}_{i} + \beta_{0} + \beta_{1} \cdot {\text{rank}}_{it} + \beta_{2} \cdot {\text{age}}_{it} + \beta_{3} \cdot {\text{age}}_{it}^{2} + \beta_{4} \cdot {\text{female}}_{i} \cdot {\text{rank}}_{it} + \beta_{5} \cdot {\text{year-term}} + \beta_{6} \cdot {\text{female}}_{i} \cdot {\text{year-term}} + \varepsilon_{ict} ,$$

(3)

where index i enumerates instructors, c is the course offering,

$$c \in \left\{ {\text{Intro to Calculus}};\ {\text{Calculus, Series and Differential Equations}};\ {\text{Multivariable Calculus}} \right\};$$

and time is indexed by

$$t \in \left\{ {\text{Fall }}2006,\ {\text{Spring }}2007,\ \ldots,\ {\text{Spring }}2017 \right\}.$$

Here, SET_ict is the average score that instructor i received for course c in semester t, and female is a dummy variable equal to 1 if the instructor is female and 0 otherwise. The control variable rank is a set of four dummy variables indicating whether the instructor is a Lecturer/Preceptor, Professor, Senior non-ladder instructor, or Teaching Assistant/Teaching Fellow. For example, an instructor who is a preceptor in a given semester has rank (1, 0, 0, 0), while a teaching post-doc has rank (0, 0, 0, 0). Because faculty rank may change over time, rank is generally time-dependent. Finally, age_it is the instructor’s age in years at the beginning of a given semester.

Model (1) is equivalent to a t-test for a difference between the mean SET scores of the two gender groups. Model (2) adds controls for instructor rank and age. Model (3) further adds interactions between faculty rank and gender and between year-term and gender, where year-term takes the values 1, 2, …, 22, corresponding to Fall 2006, Spring 2007, …, Spring 2017, respectively. As alternatives to OLS models (1)–(3), we consider the following variants:

$$SET_{ict} = \theta \cdot {\text{female}}_{i} + \beta_{0} + \alpha_{ct} + \varepsilon_{ict} ,$$

(4)

$$SET_{ict} = \theta \cdot {\text{female}}_{i} + \beta_{0} + \alpha_{ct} + \beta_{1} \cdot {\text{rank}}_{it} + \beta_{2} \cdot {\text{age}}_{it} + \beta_{3} \cdot {\text{age}}_{it}^{2} + \varepsilon_{ict} ,$$

(5)

$$SET_{ict} = \theta \cdot {\text{female}}_{i} + \beta_{0} + \alpha_{ct} + \beta_{1} \cdot {\text{rank}}_{it} + \beta_{2} \cdot {\text{age}}_{it} + \beta_{3} \cdot {\text{age}}_{it}^{2} + \beta_{4} \cdot {\text{female}}_{i} \cdot {\text{rank}}_{it} + \beta_{6} \cdot {\text{female}}_{i} \cdot {\text{year-term}} + \varepsilon_{ict} ,$$

(6)

where α_ct are instructor-invariant fixed effects, specific to a course offering and semester, that satisfy the following constraint:

$$\mathop \sum \limits_{c} \mathop \sum \limits_{t} \alpha_{ct} = 0$$

in each of models (4), (5), and (6).
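
To make the variable coding and the simplest two models concrete, the following is a minimal sketch in plain Python. It is not the authors' code (which is available only on request). The function names, the ordering of the last three rank dummies, and the six toy observations are assumptions made for illustration; only the preceptor and post-doc codings and the year-term numbering come from the text. Model (1) is fitted by simple regression on the female dummy, and model (4) by subtracting course-semester means before regressing, which gives the same θ as including the α_ct dummies directly.

```python
from statistics import mean

# Order of the four rank dummies; only the first position is given in the text.
RANKS = ("Lecturer/Preceptor", "Professor", "Senior non-ladder", "TA/TF")


def rank_dummies(rank):
    """Any rank not listed (e.g. teaching post-doc) is the all-zero reference."""
    return tuple(int(rank == r) for r in RANKS)


def year_term(season, year):
    """Map Fall 2006 -> 1, Spring 2007 -> 2, ..., Spring 2017 -> 22."""
    index = 2 * (year - 2006) + (1 if season == "Fall" else 0)
    if not 1 <= index <= 22:
        raise ValueError("term outside Fall 2006 to Spring 2017")
    return index


def model1(scores, female):
    """Model (1): OLS of SET on the female dummy. Returns (beta0, theta)."""
    f_bar, y_bar = mean(female), mean(scores)
    sxy = sum((f - f_bar) * (y - y_bar) for f, y in zip(female, scores))
    sxx = sum((f - f_bar) ** 2 for f in female)
    theta = sxy / sxx
    return y_bar - theta * f_bar, theta


def model4(scores, female, cells):
    """Model (4): adds course-semester effects alpha_ct that sum to zero.
    Returns (beta0, theta, alpha), where alpha maps each cell to its effect."""
    groups = {}
    for y, f, c in zip(scores, female, cells):
        groups.setdefault(c, []).append((y, f))
    cell_means = {c: (mean([y for y, _ in rows]), mean([f for _, f in rows]))
                  for c, rows in groups.items()}
    # Within transformation: subtract cell means, then regress.
    num = sum((f - cell_means[c][1]) * (y - cell_means[c][0])
              for y, f, c in zip(scores, female, cells))
    den = sum((f - cell_means[c][1]) ** 2 for f, c in zip(female, cells))
    theta = num / den
    # Cell intercepts, split into beta0 plus effects that sum to zero.
    intercepts = {c: my - theta * mf for c, (my, mf) in cell_means.items()}
    beta0 = mean(list(intercepts.values()))
    alpha = {c: a - beta0 for c, a in intercepts.items()}
    return beta0, theta, alpha


# Toy data, made up for illustration only (not from the study).
scores = [4.0, 4.4, 3.6, 4.2, 4.5, 4.1]
female = [0, 1, 0, 1, 1, 0]
cells = [("Intro to Calculus", 1), ("Intro to Calculus", 1),
         ("Multivariable Calculus", 1), ("Multivariable Calculus", 1),
         ("Multivariable Calculus", 2), ("Multivariable Calculus", 2)]

beta0_1, theta_1 = model1(scores, female)
beta0_4, theta_4, alpha = model4(scores, female, cells)
print("model (1): beta0 =", round(beta0_1, 4), "theta =", round(theta_1, 4))
print("model (4): beta0 =", round(beta0_4, 4), "theta =", round(theta_4, 4))
print("sum of alpha_ct =", round(sum(alpha.values()), 10))
```

In model (1) the estimate of θ equals the difference between the female and male mean scores, which is why that model corresponds to a two-sample comparison of means. Models (2), (3), (5), and (6) add further regressors and would normally be fitted with a general least-squares routine rather than the closed forms used here.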

Appendix 5

See Table 7

Table 7 Regression results for SET scores of calculus instructors using multivariate regression models (1, 2 and 3) and (4, 5 and 6)



About this article

Cite this article

Zipser, N., Mincieli, L. & Kurochkin, D. Are There Gender Differences in Quantitative Student Evaluations of Instructors?. Res High Educ 62, 976–997 (2021). https://doi.org/10.1007/s11162-021-09628-w


Received: 23 January 2020
Accepted: 13 February 2021
Published: 05 March 2021
Issue Date: November 2021
DOI: https://doi.org/10.1007/s11162-021-09628-w
