
A history of assessment in medical education

  • Invited Paper
  • Published in Advances in Health Sciences Education

Abstract

The way the quality of assessment has been perceived and assured has changed considerably over the past five decades. Originally, assessment was seen mainly as a measurement problem, with the aim of telling people apart: the competent from the not competent. Logically, reproducibility or reliability and construct validity were seen as necessary and sufficient for assessment quality, and the role of human judgement was minimised. Later, assessment moved back into the authentic workplace through various workplace-based assessment (WBA) methods. Although originally approached from the same measurement framework, WBA and other assessments gradually became assessment processes that included or embraced human judgement, provided it was well supported and grounded in assessment expertise. Currently, assessment is treated as a whole-system problem in which competence is evaluated from an integrated rather than a reductionist perspective. Current research therefore focuses on how to support and improve human judgement, how to triangulate assessment information meaningfully, and how to construct fairness, credibility and defensibility from a systems perspective. But, given the rapid changes in society, education and healthcare, yet another evolution in our thinking about good assessment is likely to be lurking around the corner.



Author information

Correspondence to Lambert W. T. Schuwirth.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.


About this article


Cite this article

Schuwirth, L.W.T., van der Vleuten, C.P.M. A history of assessment in medical education. Adv in Health Sci Educ 25, 1045–1056 (2020). https://doi.org/10.1007/s10459-020-10003-0
