The content aspect of validity in a rubric-based assessment system for course syllabuses

https://doi.org/10.1016/j.stueduc.2020.100971

Highlights

  • A content validity index of .80 is acceptable when more than 10 experts participate.

  • Fitness for the context and wording are crucial aspects in rubric design.

  • Wording problems also stem from flaws in a rubric's suitability to its context.

  • Experts appraise rubrics from a critical point of view on teaching and management.

  • Experts primarily relate rubrics to teaching innovation and education improvement.

Abstract

The growing trend among universities to promote systems of programme and course evaluation entails more responsibility for faculties and departments. These systems require resources to ensure that they are not only valid and reliable but also effective and sustainable. The design of rubric-based assessment systems may provide a solution, but there is a gap in the research on curriculum evaluation concerning their use and validation. We examine the content aspect of validity in a rubric-based assessment system for course syllabuses using a mixed method that combines an analysis of the agreement among 23 experts with a phenomenographic study. With data gathered through a questionnaire linked to the Delphi technique, content validity indexes were calculated and the experts' different perspectives were identified. The content validity indexes (greater than .80) met the standards set out in the literature, and the qualitative study of the experts' feedback showed three different perspectives on the system's use. Beyond providing evidence of the system's content validity, the study highlights how important it is to give appropriate consideration to experts' – and by extension final users' – experience in order to ensure the successful implementation of rubric-based assessment systems.

Introduction

Recently, accreditation programmes have adopted an evolutionary and developmental approach that accords with the main purpose of educational quality assurance systems, which is to provide the means necessary for the continuous improvement of educational programmes (e.g., Boyle & Bowden, 1997). It is not enough to use the rates of academic success customarily cited by universities as evidence of the smooth running of their degree programmes. This is partly due to the disrepute arising from grade inflation (e.g., Bachan, 2017; Chowdhury, 2018; Finefter-Rosenbluh & Levinson, 2015) and partly because such rates of academic success say nothing of the impact that university programmes have on their students' learning processes.

This state of affairs has led universities to focus on programme and course evaluation, resulting in increased responsibility for faculties and departments. These must provide evidence that their degree programmes foster high-level learning outcomes and that these outcomes meet the needs of society. This involves the implementation and upgrading of systems to support data capture and entry on student learning, the use of appropriate systems of analysis, and the provision of planning and rationales for policies to improve degree programmes in accordance with the interpretation of the findings (Goldstein, 2010). The focus of evaluation is no longer on the design of an educational programme, but rather on its performance (Caffarella, 2002; Hixon, Barczyk, Buckenmeyer, & Feldman, 2011). This, in turn, has even led to studies on meta-assessment, that is, the evaluation of the suitability of programme assessment processes (Orem, 2012).

In addition to the conceptual difficulties inherent in the evaluation of educational quality, programme and course evaluation must also solve problems of implementation. The design and application of evaluation systems require plentiful resources (e.g., Uribe, 2013); therefore, they must be not only valid and reliable, but also effective and sustainable (e.g., Barrie, Hughes, Crisp, & Bennison, 2014). Steps must be taken to compensate for the lack of a suitable assessment culture among faculty, due partly to inadequate pedagogical training (Grainger, Adie, & Weir, 2016) and partly to the persistence of a university tradition that impedes a paradigm shift in education (see Boyle & Bowden, 1997; Brownell & Tanner, 2012). The latter factor is the more important one because it affects faculty commitment to the continuous assessment of programmes. For instance, Brancaccio-Taras et al. (2016) developed a programme assessment system in which one of the criteria was the presence of a suitable institutional atmosphere for the implementation of evidence-based teaching practices.

The implementation of programme and course evaluation involves giving a prominent role to faculty members (Gerretson & Golson, 2005), which in turn has led to arguments in favour of shifting the focus from programme-level to course-level assessment (Reed, Levin, & Malandra, 2011). Consequently, there is a pressing need to bolster the creation of faculty learning communities (e.g., O'Malley, 2010; Ward & Selvester, 2012) that are open to teachers from different universities who teach in the same field or discipline (Sefcik, Bedford, Czech, Smith, & Yorke, 2018). In fact, these communities can serve as professional environments seeking to enhance learning and professional development, governed by the principles of trust, support and collegiality (Menéndez-Varela & Gregori-Giralt, 2018). With adequate institutional support, these safe professional environments could also foster the creation of teaching innovation groups to act as levers of change within faculties and departments. For example, teaching innovation groups could promote the design and application of programme and course assessment systems or their transfer from other contexts.

Analysing the teaching process as a unified construct is complex, and it is difficult for faculties and departments to gather direct measures in order to carry out an analysis that is valid, reliable and sustainable. This is why universities have promoted the collection of indirect evidence on quality assurance in addition to survey-based studies. Thus, recent research on programme and course evaluation has pushed forward in three directions. First, studies have examined whether a graduate will have attained the stated profile based on the distribution of competencies in the curriculum. This is exemplified by research on curriculum mapping (e.g., Perera, Babatunde, Zhou, Pearson, & Ekundayo, 2017; Veltri, Webb, Matveev, & Zapatero, 2011; Wijngaards-de Meij & Merx, 2018). Second, studies have checked whether students have attained the competencies set out in curricula; this focus has proved fertile ground for the use of rubrics. Prominent examples include analyses of core competencies, primarily the competencies of information literacy (e.g., Whitlock & Ebrahimi, 2016) and oral or written communication (e.g., García-Ros, 2011; Good, Osborne, & Birchfield, 2012), but also including studies on specific competencies (e.g., Romkey, Chong, & El Gammal, 2015; Tractenberg, Umans, & McCarter, 2010). Third, studies have verified whether the learning environments set out in course syllabuses are consistent with the pedagogical principles of a competency-based higher education.

The course syllabus is a document that lays out the learning outcomes, content and teaching environment that define a course. The quality of a course syllabus requires that all of its components be aligned, but it also calls for a clear and specific explanation of the components so that a student can make an informed decision on whether or not to enrol in a course or degree programme. An analysis of the content of syllabuses does not produce direct evidence of the actual learning process that takes place in the classroom. However, it does provide information on: a) how well the teaching design is aligned with the student learning outcomes; b) the teaching culture and practices of the faculty; and c) the extent of their commitment to a competency-based educational model. Teacher-specific syllabuses are typically accessible only to enrolled students, but in some university systems they co-exist with generic syllabuses, which are the product of a consensus among the teachers of a single course and establish its general framework. Course syllabuses are used as teaching resources, course plans, and evidence for teacher evaluation and programme accreditation (Grunert O'Brien, Millis, & Cohen, 2008; Slattery & Carlson, 2005; Willingham-McLain, 2011). In these roles, they function as communication tools and educational contracts (Fink, 2012; Parkes & Harris, 2002; Singham, 2007). The fact that generic syllabuses are sometimes available to the public makes them indicators of the attention that a university pays to its educational mission. Because they are useful for comparing course programmes (e.g., Álvarez-Pérez, González Morales, López-Aguilar, Peláez Alba, & Peña Vázquez, 2018), generic syllabuses are analysed by national quality agencies in programme accreditation processes (e.g., National Agency for Quality Assessment & Accreditation, 2013).

As shown in research spanning from the study by Bers, Davis, and Taylor (2000) to more recent work by Goodwin, Chittle, Dixon, and Andrews (2018) and Mathers, Finney, and Hathcoat (2018), there remains considerable scope for improvement in aligning course syllabuses with the teaching practices most highly valued in the literature. Consequently, this line of evaluation research needs to be incorporated into universities' existing quality assurance systems. In addition, syllabus content analysis is the type of programme and course assessment that is least time-consuming (Stanny, Gonzalez, & McGowan, 2015; Willingham-McLain, 2011), that best takes advantage of the faculty's expert knowledge, and that can most directly and immediately be useful in faculty development (Bers et al., 2000).

An examination of the latest publications reveals a variety of approaches, methodologies and evaluation tools, illustrating the interest in the topic and the exploratory stage of current research. There have been survey-based studies, such as the one by Bergsmann, Klug, Burger, Först, and Spiel (2018), who analysed the presence of competency-based teaching and real student competencies and did so using a screening model with an online questionnaire to which students and teachers alike responded. Iudica (2011) made use of two checklists to examine the alignment between course syllabuses in educational technology leadership and the state and national technology standards. Goodwin, Chittle, Dixon, and Andrews (2018) put forward a qualitative analysis of the information on learning outcomes, reading requirements, learning activities, assessment types, and policy listing and adherence as set out in the course syllabuses of an undergraduate programme. Lastly, the way in which assessment systems are described in course syllabuses has received special attention from a number of viewpoints: a) whether students receive appropriate communication about assessment aims (Thomas et al., 2018); b) whether the assessment system is aligned with learning outcomes (Sefcik et al., 2018); and c) which assessment modes and instruments are most commonly used in educational contexts (Tucker, 2012).

Rubrics are also valuable resources here; indeed, they have given rise to the largest number of studies in this field. Brancaccio-Taras et al. (2016), Halim (2008) and Stanny et al. (2015) are examples of studies that analyse all aspects of the learning environments set out in course syllabuses and their alignment with the pedagogical principles and best teaching practices most widely recognised in the literature. The studies by Raybon (2012) and Legon (2015) are distinctive in that their rubrics include criteria specific to online courses. However, it is more common to find rubrics used to assess whether course syllabuses are consistent with the principles of learner-centred teaching (Blumberg & Pontiggia, 2011; Blumberg, 2009; Cullen & Harris, 2009), students' meaningful learning (Koh, 2013) or assessment for learning (Alonzo, Mirriahi, & Davison, 2018; Wolf & Goodwin, 2007).

At a time when rubric-based assessment systems for student performance had not yet reached maturity in higher education, Jonsson and Svingby (2007) and Reddy and Andrade (2010) confirmed the need for more validity studies with rigorous research methods and analyses. More recently, Dawson (2017) found very few publications suitable for replication studies because they contained insufficient information on their research methods. Unsurprisingly, this is also the case now with rubric-based assessment systems for course syllabuses. Of the studies cited earlier, only those of Halim (2008), Koh (2013) and especially Alonzo et al. (2018) address the matter of validity.

The present study is a contribution to the as yet limited body of literature on rubric-based assessment systems for course syllabuses and, more specifically, to the models used to assess all the components set out in course syllabuses. The proposed rubric has six dimensions that reflect the customary sections of course syllabuses – i.e., learning outcomes, course content, learning resources, learning activities, learning mode, and assessment – and it features four performance levels (see Appendix A). The aim of this paper is to analyse some of the aspects that underpin the validity of the inferences drawn from the application of the rubric. Accordingly, two research questions are posed: (1) What rating is given to the content aspect of validity for the rubric-based assessment system? and (2) What are the professional experiences and concerns shown by the experts who took part in its evaluation?

Section snippets

Study context

The European Higher Education Area (EHEA), which was launched in 2010, is a joint undertaking of 48 European countries that are cooperating in the construction of a framework of comparable and compatible higher education systems. As a result, national systems of higher education have begun to be regulated by common quality standards. Also, the European Association for Quality Assurance in Higher Education (ENQA) and national organisations for quality assurance have been set up to conduct the

Method

The approach is based on Messick's unified construct validity theory, which states that "validity is an integrated evaluative judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of interpretations and actions based on test scores or other modes of assessment" (Messick, 1990, p. 5). Messick's concept of validity has three corollaries: a) validity is a unified construct grounded on the integration of the analysis of empirical

The extent of agreement among the experts

Table 2 shows the content validity indexes for items and the scale-level content validity indexes for each component and for the overall scale (research data available in [dataset] Menéndez-Varela & Gregori-Giralt, 2020).

As Table 2 shows, only seven of the 52 items had item-level content validity indexes lower than .78, which is the threshold set by Lynn (1986) when there are between six and ten experts. Given that only 13.5% of the items were affected and that the present study featured more
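
The study reports item-level and scale-level content validity indexes but, in this snippet, does not spell out how they are computed. As a point of reference only, the sketch below implements the conventional calculation to which thresholds such as Lynn's (1986) .78 are usually applied: the item-level CVI is the proportion of experts who rate an item as relevant (3 or 4 on a 4-point scale), and the scale-level CVI is the average of the item-level values. The 4-point coding, item names and ratings here are illustrative assumptions, not the study's data.

```python
# Minimal sketch of conventional content validity index (CVI) calculations.
# Assumes each expert rates each item on a 4-point relevance scale (1-4),
# where a rating of 3 or 4 counts as "relevant". Item names and ratings
# below are hypothetical, not the study's dataset.

def item_cvi(ratings):
    """I-CVI: proportion of experts who rated the item 3 or 4."""
    relevant = sum(1 for r in ratings if r >= 3)
    return relevant / len(ratings)

def scale_cvi_ave(items):
    """S-CVI/Ave: mean of the item-level CVIs across a set of items."""
    cvis = [item_cvi(ratings) for ratings in items.values()]
    return sum(cvis) / len(cvis)

if __name__ == "__main__":
    # Hypothetical ratings from a panel of 23 experts for three items.
    ratings_by_item = {
        "item_01": [4, 4, 3, 4, 3, 4, 4, 3, 4, 4, 3, 4, 4, 4, 3, 4, 4, 3, 4, 4, 4, 3, 2],
        "item_02": [3, 4, 4, 3, 4, 4, 3, 3, 4, 4, 4, 3, 4, 3, 4, 4, 3, 4, 4, 3, 4, 4, 4],
        "item_03": [4, 3, 2, 4, 3, 4, 4, 2, 4, 3, 4, 4, 3, 4, 2, 4, 4, 3, 4, 4, 3, 4, 4],
    }
    for name, ratings in ratings_by_item.items():
        print(f"{name}: I-CVI = {item_cvi(ratings):.2f}")
    print(f"Component S-CVI/Ave = {scale_cvi_ave(ratings_by_item):.2f}")
```

Thresholds such as the .78 value from Lynn (1986) cited above would then be applied to the item-level results, and the scale-level averages compared with the .80 standard mentioned in the abstract.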

Discussion

From the perspective of unified construct validity theory, both empirical evidence and theoretical rationales are required to attribute validity consistently. And the content aspect of validity is the sole facet that directly examines the degree to which an assessment system consists only of the most relevant and representative components of the construct at issue and is not undermined by other irrelevant elements. In addition to the previously cited studies by Messick, Cronbach (1989)

Limitations and further research

The analysis of the content aspect of validity for this rubric-based assessment system brought together a number of experts that exceeded the range of seven to ten that has been common in similar studies. The only exceptions are the study conducted by Erlich and Russ-Eft (2012), which drew on 19 experts, and the study carried out by Stanford et al. (2016), which succeeded in involving over 70 participants in the validation process. Consequently, the first limitation does not lie in the size of

Conclusion

This paper presents a study focusing on a rubric-based assessment system for course syllabuses in order to mitigate two gaps existing in the literature on rubrics: their use in curriculum evaluation and the limited attention that has been given to an analysis of the content aspect of validity. As an added value, the research also tackles its object of study with methodologies that are not widely used in the Education Sciences: specifically by combining a quantitative analysis of the content

Funding

This work was supported by the Institute for Professional Development-ICE [REDICE18-1980]; the Vice-rectorate for Teaching and Academic Planning and the Programme for Research, Innovation and Improvement of Teaching and Learning at the University of Barcelona [GINDOC-UB/103].

Declarations of interest

None.

References (87)

  • C. Baily et al.

    Conceptual assessment tool for advanced undergraduate electrodynamics

    Physical Review Physics Education Research

    (2017)
  • S. Barrie et al.

    Assessing and assuring Australian Graduate learning outcomes: principles and practices within and across disciplines, final report

    (2014)
  • E. Bergsmann et al.

    The competence screening questionnaire for higher education: Adaptable to the needs of a study programme

    Assessment & Evaluation in Higher Education

    (2018)
  • T.H. Bers et al.

    The use of syllabi in assessments: Unobtrusive indicators and tools for faculty development

    Assessment Update

    (2000)
  • F.L. Bird et al.

    Improving marking reliability of scientific writing with the Developing Understanding of Assessment for Learning programme

    Assessment & Evaluation in Higher Education

    (2013)
  • P. Blumberg

    Developing learner-centered teaching: A practical guide for faculty

    (2009)
  • P. Blumberg et al.

    Benchmarking the degree of implementation of learner-centered approaches

    Innovative Higher Education

    (2011)
  • P. Boyle et al.

    Educational quality assurance in universities: An enhanced model

    Assessment & Evaluation in Higher Education

    (1997)
  • L. Brancaccio-Taras et al.

    The PULSE vision & change rubrics, version 1.0: A valid and equitable tool to measure transformation of life sciences departments at all institution types

    Life Sciences Education

    (2016)
  • S.E. Brownell et al.

    Barriers to faculty pedagogical change: Lack of training, time, incentives, and tensions with professional identity?

    CBE–Life Sciences Education

    (2012)
  • R. Caffarella

    Planning programs for adult learners. A practical guide for educators, trainers, and staff developers

    (2002)
  • J.J. Carrión Martínez

    La guía docente: ¿burocracia o reflexión? [The course syllabus: Bureaucracy or reflection?]

    I Jornadas sobre experiencias piloto de implantación del crédito europeo en las universidades andaluzas [I Conference on pilot experiences of implementation of European credit in Andalusian universities]

    (2006)
  • F. Chowdhury

    Grade inflation: Causes, consequences and cure

    Journal of Education and Learning

    (2018)
  • L.J. Cronbach

    Construct validation after thirty years

    (1989)

  • B.E. Crotwell Timmerman et al.

    Development of a 'universal' rubric for assessing undergraduates' scientific reasoning skills using scientific writing

    Assessment & Evaluation in Higher Education

    (2011)
  • R. Cullen et al.

    Assessing learner-centredness through course syllabi

    Assessment & Evaluation in Higher Education

    (2009)
  • P. Dawson

    Assessment rubrics: Towards clearer and more replicable design, research and practice

    Assessment & Evaluation in Higher Education

    (2017)
  • J.L. Docktor

    Development and validation of a physics problem-solving assessment rubric (Doctoral dissertation)

    (2009)
  • J.L. Docktor et al.

    Assessing student written problem solutions: A problem-solving rubric with application to introductory physics

    Physical Review Physics Education Research

    (2016)
  • R.A. Ellis et al.

    Managing quality improvement of eLearning in a large, campus-based university

    Quality Assurance in Education

    (2007)
  • R.J. Erlich et al.

    Assessing academic advising outcomes using social cognitive theory: A validity and reliability study

    NACADA Journal

    (2012)
  • I. Finefter-Rosenbluh et al.

    What is wrong with grade inflation (if anything)?

    Philosophical Inquiry in Education

    (2015)
  • S.B. Fink

    The many purposes of course syllabi: Which are essential and useful?

    Syllabus

    (2012)
  • R. García-Ros

    Analysis and validation of a rubric to assess oral presentation skills in university contexts

    Electronic Journal of Research in Educational Psychology

    (2011)
  • H. Gerretson et al.

    Synopsis of the use of course-embedded assessment in a medium sized public university’s general education program

    The Journal of General Education

    (2005)
  • N. Goldstein

    The program manager’s guide to evaluation

    (2010)
  • A. Goodwin et al.

    Taking stock and effecting change: Curriculum evaluation through a review of course syllabi

    Assessment & Evaluation in Higher Education

    (2018)
  • P. Grainger et al.

    Quality assurance of assessment and moderation discourses involving sessional staff

    Assessment & Evaluation in Higher Education

    (2016)
  • J. Grunert O’Brien et al.

    The course syllabus: A learning-centered approach

    (2008)
  • J.M. Guerra García

    La burocracia: un factor limitante en la investigación [Bureaucracy: a limiting factor in research]

    Chronica naturae

    (2013)
  • S.M.A. Halim

    The Effect of using some professional development strategies on improving the teaching performance of English language student teacher at the Faculty of Education, Helwan University in the Light of Pre-Service Teacher Standards (Doctoral dissertation)

    (2008)
  • E. Hixon et al.

    Mentoring university faculty to become high quality online educators: A program evaluation

    Online Journal of Distance Learning Administration

    (2011)
  • A.M. Iudica

    University educational leadership technology course syllabi alignment with state and national technology standards (Doctoral dissertation)

    (2011)