Introduction

In this paper, we revisit the contextual, complex and dynamic nature of research quality (Paradeise and Thoenig 2015). We mobilise a historical overview of the concept of, and concerns regarding, research quality, and how this developed alongside other major elements of the modern constitution of research. From this we develop a novel framework to study and understand research quality based upon three key dimensions. First, we distinguish between co-existing research quality notions that originate in research fields (F(ield)-type) and research spaces (S(pace)-type). Second, we draw upon existing studies of research quality (Polanyi 1962; Gulbrandsen 2000; Lamont 2009) to explicate its attributes. Third, we use contemporary studies of the science system and its dynamics (Whitley 2000; Whitley et al. 2018; Nedeva 2013) to identify the organisational contexts where notions of research quality emerge, are contested and institutionalised. This multi-dimensional framework and its components, we believe, afford opportunities to study issues around research quality as an empirical question beyond general, user-driven definitions. The framework also shifts the focus of study towards the processes and mechanisms through which ‘good research’ is recognised.

Issues around research quality have been around, in different guises, for as long as modern science. Early 20th-century debates on the demarcation between science and non-science dominated logical positivism (Caldwell 2010). Later, debates on quality in science were anchored in notions of truth, pragmatic acceptance (Chalmers 2013) or the social and intellectual conditions of scientific knowledge (Merton 1942). These and, we argue, more recent approaches to the study of research quality share one key feature: they regard judgements about the quality of science as an endogenous matter best left to the discretion of knowledge communities.

Here we argue that following the advent of research policy, and associated imperatives for increasing levels of accountability and legitimacy, mechanisms for constituting research quality notions that were once reserved for highly professionalised knowledge communities have extended to encompass notions generated within policy and funding domains. Most importantly, we see that research quality notions originating in research fields, or knowledge communities, and in policy and funding domains coexist.

This coexistence of research quality notions creates complex and multi-dimensional dynamics, implying that research quality cannot adequately be studied and understood as a unitary notion. Approaches allowing more nuanced, structured and multi-faceted investigation are required. Current approaches to the study of research quality, we contend, do not provide adequate tools for such nuanced understanding. Hence, we propose to shift focus away from questions that essentially address what research quality is, and/or how to measure it, towards the mechanisms through which dominant notions of research quality become established and the social and intellectual tensions arising from the co-existence of (potentially) conflicting quality notions.

Our paper is structured as follows. First, we discuss the context of previous approaches to research quality. Second, we introduce in turn the three component dimensions of our framework, encompassing: (i) types of quality notions; (ii) attributes of quality; and (iii) sites where notions are established and institutionalised. Third, we combine these three dimensions into an overall framework to study and understand research quality notions. We then discuss ways in which this proposed approach stands to change the research agenda.

Context for (Re)framing Notion(s) of Research Quality

Throughout most of its history the science system has been comprised of a multiplicity of research fields characterised by specific, and diverse, quality cultures (Feuer et al. 2002). Research field quality notions and standards were developed, to a large extent tacitly and without theoretical articulation, to meet the demands of the knowledge domain and, later, of users of knowledge results. This included more applied domains and communities, the achievements of which would be validated using primarily bureaucratic, military, and/or industrial procedures (as outlined by Shapin 2008).

Diverse knowledge communities worked within specific structures and communication forums, predominantly university departments, national and international conferences, and international journals and publishing houses. Even though research fields, in this sense, have always been predominantly global – a point stressed by Ben-David (1971) and Merton (1942) – domestic languages and local conditions functioned as important complements, evidenced in national modes and styles of work and communication, most notably in humanities and social sciences (Fourcade 2009), but to some degree visible in all fields (Salö 2015, 2017). Furthermore, trans-disciplinary communities have blended and transferred methods and theories across institutional and organisational settings, e.g. molecular biology and physics (Keller 1990), and humanities (Sörlin 2018).

This historical co-existence of diverse research fields with specific structures, communication systems and organisational arrangements implies that variable, context-dependent and flexible notions of research quality have also traditionally co-existed. Importantly, however, despite differing in their specifics, these quality notions predominantly reflect concerns intrinsic to science.

A different type of research quality notions started to emerge in the mid-20th century. These were marked by the advent of (national) science policy, initially focused mainly on national security and competitiveness (Mukerji 1989) but later shifting attention to the efficiency of the science system on several scales, including regionally and globally (Nedeva and Boden 2006; Geuna and Martin 2003; Sörlin 2007; Vessuri et al. 2014; Lepori et al. 2018) and commercial application of results (Etzkowitz et al. 1998; Jacob et al. 2002; Howells et al. 1998).

Recently, and partly driven by the global scope of research, understanding of research quality by national governments has been somewhat converging. Notions of ‘research excellence’, building on expectations for international research influence, and hegemony, underpin many national evaluation regimes and systems (Flink and Peter 2018). Furthermore, this alignment of quality notions is enabled by the sophistication of techniques to monitor international linkages and visibility of research (cf. Larédo and Mustar 2001; Oreskes and Krige 2014), including the widespread application of indicators to summarise and often commodify performance and understandings of research quality (van Raan et al. 1989).

Rankings, and other instruments for institutional comparisons, draw heavily on international publications and their impact to present a more fine-grained analysis of scientific contributions (Piro and Sivertsen 2016). Journals have become proactive in cultivating their impact profiles, soliciting research with potentially high impact and enticing researchers to publish more ‘sensational’ results. Significant differences between countries and fields in their quality cultures have been challenged by global communication forums, mobility of scientific elites, and convergence pressures from international standard setters in different scientific areas (Douglass 2015; Sarewitz 2016; cf. Fourcade 2009 on the diminishing role of national quality cultures in economics).

A rapidly growing set of techniques for selection and assessment has developed, largely replacing the trust-based system of science. Funding for science and research has become subject to selectivity and competition. Universities, previously rigorously regulated but seldom or never compared or ranked, are now subjected to global quality standards largely outside their influence; and governments, funders, students and citizens pay attention to their performance in these ‘quality contests’ (see, for example, Paradeise and Thoenig 2015).

A seemingly global quality standard has emerged, in which individual researchers as well as scientific organisations, subject fields, and even countries can gauge their relative positioning. This has been emulated by funding agencies, which increasingly see themselves as either leaders in setting quality ‘gold standards’ (the European Research Council is an example, see Chou and Gornitzka 2014; Edler et al. 2014; Nedeva et al. 2012; Flink 2016), or as followers adopting these standards, like the Research Council of Norway (Benner and Öquist 2014). Measurable output catches attention, and countries (and institutions, and individuals) adapt to this.

Overall, this crude overview suggests a historical path that we frame as the gradual emergence of two separate but co-existing types of research quality notions. These represent different social contexts: the context of the research field and that of the research funding and policy space (after Nedeva 2013).

Framework Dimensions to Study and Understand Research Quality

The above review of background and context leads us to the first dimension of our framework: two co-existing types of research quality notions. One type originates within research fields and is negotiated and established by the specialised knowledge communities for which these notions are assumed to have validity. This we label as F(ield)-type research quality notions. The other originates within research policy and funding spaces (i.e. research spaces, see note 1), is advanced by knowledgeable lay groups and will often be assumed to have validity across different fields. We refer to it as S(pace)-type research quality notions. In this section we describe this first component in more detail. We then introduce the two further dimensions we assert are required to produce an overall framework for nuanced, multi-dimensional study and understanding of research quality: attributes of quality notions; and sites of contestation/institutionalisation.

First Dimension: Co-existing F(ield) and S(pace) Type Research Quality Notions

F(ield)-type notions of research quality originate in research fields. They are shaped by specialised, albeit fragmented, knowledge communities characterised by high-level entry requirements, professional training, unified research practices and recognised bodies of knowledge (Cahan 2003; Höhle 2015). Hence, quality judgements are anchored in knowledge pools and/or conditions necessary to enhance these pools. For example, research is judged as extending and validating the existing knowledge or as pushing boundaries by developing theories, methods and approaches in the field.

This type of research quality notions may incorporate criteria, and standards, around properties of knowledge (e.g. original, reliable, relevant to the field, useful for further knowledge production, reproducible etc.), professional competence (reputation, ethics etc.) and intellectual and material conditions for research (method, theoretical grounding, instrumentation, experimental set up etc.).

Finally, F-type research quality notions are enforced predominantly through peer judgement and peer review practices. These are used at multiple selection points, including recruitment and promotion of research staff, publishing, conference participation, and access to resources in the field like instrumentation, materials and funding.

S-type notions of research quality, on the other hand, originate in policy and funding spaces (‘research spaces’, see Nedeva 2013). [Footnote 1] They are developed and established by knowledgeable lay groups. These may include policy groups, administrators, research organisation leaders and research funding agency staff. Notably, researchers from neighbouring and more epistemically distant research fields could also be considered to be knowledgeable but lay groups. [Footnote 2]

Here judgement is anchored in considerations exogenous to the field’s specific knowledge pools. These considerations have historically been manifested as concerns for the social and economic contribution of, and from, science. When knowledgeable lay groups develop research quality notions, quality standards may rest on proxies and/or general reputation rather than on substantive standards. Using the reputation and impact factor of specific journals as a proxy for the quality of research papers is an example. Lastly, S-type research quality notions are enforced through evaluation regimes that may or may not involve some variant of peer review.

The main features of F-type and S-type research quality notions are summarised in Table 1. This separates out the subject of the notions, how judgement is anchored and enforced, and whether quality standards involve substantive (F-type) or proxy (S-type) based judgements. This provides us with some key differences in the origins, mechanisms and processes associated with these two ‘pure’ types.

Table 1 Types of research quality notions

From our earlier context review we see that F-type quality notions preceded S-type quality notions. With the distinction between the two types thus made, opening quality standards from research fields to scrutiny and influence of actors in research spaces would seem to be one of the key contemporary changes in the dynamics of the science system.

Most importantly, to study research quality we should also understand that the two types – F-type and S-type – co-exist. This co-existence, and the possibility for tensions generated by it, opens a novel, and exciting, research agenda on research quality, research quality standards and how these are constituted and established.

Second Dimension: Research Quality Attributes

An overview of the, otherwise diverse, literature on quality yields three attributes of research considered important for the consensus of what is ‘good’ research. These are originality/novelty, plausibility/reliability, and the value or usefulness of the research. Notably, these are composite categories of attributes and may have very different content in different types of research.

Back in the 1960s, Michael Polanyi (1962/2000) outlined ‘standards of scientific merit accepted by the scientific community’ (Polanyi 2000: 4) including originality, plausibility/reliability and scientific value. In Polanyi’s terms, plausibility refers to rejecting fraud as well as conclusions which “appear to be unsound in light of current scientific knowledge” (Polanyi 2000: 5). Scientific value denotes the “systematic importance” of a contribution, “the intrinsic interest of the subject-matter”, as well as its accuracy, whereas originality is assessed by the degree of “unexpectedness of a discovery” (Polanyi 2000: 5–6).

Weinberg, on the other hand, argued the need for criteria to prioritise scientific fields and added external criteria such as technological and social merit (Weinberg 1963). [Footnote 3] Hence, already in the 1960s we find emphasis on basic scientific standards to describe research quality, more specifically scientific merit, and a clear distinction between criteria internal and external to science.

This key distinction outlined by Polanyi and Weinberg reappears in later empirical studies of researchers’ notions of research quality. A study based on interviews with merited senior researchers in ten fields of research explicated dimensions of research quality in line with Polanyi’s, namely solidity and originality, and split relevance/value into two: scientific as well as societal relevance/value (Gulbrandsen 2000; Gulbrandsen and Langfeldt 1997).

In a study of research grant application review, Lamont (2009) came up with more or less the same aspects in a list of key review criteria including: methods (another manifestation of plausibility), intellectual and/or social significance, and originality. Lee (2015: 1276), referring to Lamont (2009), briefly explained these three criteria as follows: “novelty promotes the discovery of new truths, methodological soundness assesses the likely truth of study conclusions by evaluating the reliability of data collection and analysis strategy, and determinations of significance tell us which novel truths are most interesting or important” (italics added).

The study of research quality has also been approached through a focus on norms. This has included perspectives from sociology and philosophy of science. For instance, Tranøy (1976, 1986) outlined general scientific norms, related to scientific methodology matters – like truth/probability, testability, coherence, simplicity/completeness, honesty, openness and impartiality/objectivity – as well as originality and relevance/fruitfulness/value (Tranøy 1986: 144ff). Merton approached the issue of research quality through formulating the social imperatives (norms) of science as a social system: communism (openness), universalism (impersonal criteria and reproducibility of results), disinterestedness (impartiality and imperviousness to interests exogenous to science) and organised scepticism (scrutiny and thoroughness) (Merton 1942/1973: 269). Merton also argued originality is one of the institutional norms of science (Merton 1957/1973: 293).

Here it suffices to note that Merton and Tranøy, whilst using conventional perspectives of sociology and philosophy of science, converged on similar norms. Empirical studies of researchers’ research quality notions (Hemlin 1991; Gulbrandsen 2000; Lamont 2009) have also found similar attributes. We explore these three key attributes below, as part of the second component dimension of our framework.

Originality/Novelty

Originality or novelty refers primarily to providing new knowledge and innovative research. These are key attributes for scientific knowledge to become a legitimate contribution to the knowledge pools of research fields. Still, according to the literature, there are multiple ways in which research can be original (Lamont 2009: 171–174; Gulbrandsen and Langfeldt 1997: 87).

Lamont (2009) and Hemlin (1991) find that originality relates to different aspects of research, such as the research ideas, topics, approaches, theories, data, methods, or the outcomes/findings. Originality may be incremental or radical and there may be different views on whether radical originality is desirable and/or acceptable, and notions vary between fields of research (Gulbrandsen 2000: 116). Generally, originality is often linked to curiosity and creativity as beneficial properties of the researchers and/or the research environment (Bazeley 2010; Lamont et al. 2007; Gulbrandsen 2000: 138).

Plausibility/Reliability

Empirical studies of researchers’ notions of research quality have identified a number of notions around plausibility, or reliability, of research. These include correctness, rigor, sound methods, thoroughness and clarity, as well as research integrity and ethics. Different fields emphasise different kinds of reliability as the important ones (Lamont 2009: 167; Gulbrandsen 2000: 115; Hemlin 1991). Gulbrandsen found experimental fields are concerned with replicability, whereas engineers sometimes consider successful industrial implementation an important indicator of reliable research. In the humanities, researchers emphasise the importance of thorough arguments, whereas economists value well-specified models, consistency and testability (Gulbrandsen 2000: 114–115).

Lamont found clarity, rigor, methodological soundness and craftsmanship to be important (Lamont 2009). In Swiss humanities, Hug et al. identified stringent argumentation, presentation of relevant documents and evidence, clear language, clear structure, reflection of method, and adherence to standards of scientific honesty, to be standards of plausibility and reliability (Hug et al. 2013: 374). In an Australian survey of science, social science and humanities, ‘methodologically sound’ was a primary descriptor of research performance (Bazeley 2010: 895).

Mårtensson et al. identified credibility as one of four main characteristics of research quality in a multidisciplinary context, with sub-characteristics such as rigorous, reliable, coherent and transparent (Mårtensson et al. 2016: 599). Another main characteristic was conforming, including research ethics and basic conditions for plausibility/reliability, like avoiding plagiarism and fraud, and preventing harmful social consequences. This provides a novel way to describe key dimensions of research quality, more aligned with contemporary policy emphases on open and responsible science.

Value/Usefulness

We can distinguish two aspects of the value, or usefulness, of research: its scientific value and its value outside science. Scientific value/usefulness concerns how research progresses a research field and advances scholarly debate. Societal value/usefulness addresses multiple social domains and time horizons, e.g. environment, welfare, health, economy, equity, technological development, cultural heritage.

Lamont (2009) found research evaluation panels concerned with intellectual significance as well as the political and social importance of research topics. Impact on academia, knowledge and the field, as well as political and social impact, were important (Lamont 2009: 175). For humanities, Hug et al. identified scholarly exchange, connecting to other research, and impact on the research community and on future research as consensus standards (Hug et al. 2013: 374–375). Researchers in fields oriented towards practical applications tend to place more emphasis on societal relevance, e.g. in engineering sciences and clinical medicine (Gulbrandsen and Langfeldt 1997; Hemlin 1991).

Overall, for this second dimension we find that ‘research quality’ is a distributed notion referring to multiple attributes. Decomposing the notion into attributes such as originality, plausibility and value may reveal significant differences in what is understood as ‘good’ research in varied research fields and organisational contexts. From the empirical studies referred to above, we see that the various attributes have quite different contents in e.g. humanities and engineering. This variation also raises questions regarding inherent social dynamics in establishing dominant notions of research quality and the (potential) tensions this may entail. This context dependency leads us to the third dimension of our framework: organisational sites.

Third Dimension: Organisational Sites

The third dimension of our framework addresses and identifies organisational contexts where we believe dominant notions of research quality are established and institutionalised. Research quality, as we have argued, is a multi-dimensional and context-dependent notion. To unpack the social processes through which dominant notions of research quality are established we must understand the organisational contexts where they occur and interactions between them.

Building on understanding the science system as a set of authority relationships (Whitley 2011) between research spaces and research fields (Nedeva 2010, 2013), we identify five organisational sites where notions of research quality are constituted, negotiated and institutionalised. At each site we posit there will be a number of key concerns – relating to our previous two dimensions of ‘types’ and ‘attributes’ – as well as specific authority and institutionalisation characteristics. Table 2 provides an overview of the five sites and their characteristics that we then discuss in turn below.

Table 2 Organisational sites

Individuals and Research Groups

When assessing research quality, individual researchers use their scholarly ‘luggage’, resulting from e.g. socialisation during doctoral studies (Becher 1989: 25–27) and interaction with scholarly networks in their field, as well as influences from outside the field. Their key concerns can be expected to be intrinsic to science, with F(ield)-type notions. Still, as noted above, in applied fields this may include strong emphasis on societal relevance (Gulbrandsen and Langfeldt 1997; Hemlin 1991). Within a specific research field or research group, notions may vary considerably between individual researchers and be dynamic. They may cluster around a fairly stable core of professional standards acquired through the early stages of socialisation in academe, but vary, and change, depending on the context of assessments (Lamont 2009). For example, the underlying quality notions when assessing a PhD thesis may be different from those used to review proposals for large research grants.

Knowledge Communities/Networks

For knowledge communities across their various networks, quality notions would be constituted, negotiated and signalled through selection practices, e.g. conferences, journals, seminars, workshops and academic training arrangements. Knowledge communities tend to influence (or control) perceived dominant approaches, theories and methods in their research field (Whitley et al. 2010), with peer review being a decisive selection practice that reflects the key quality notions. The degree of codification (stability and explicitness) of these notions may vary, but we expect all research fields to have notions of research quality that underlie refereeing and reviewing criteria, the development of training programmes and the establishment of rules for professional conduct.

Research Organisations

Research organisations are where research quality is negotiated between researchers (using quality notions from the ‘research field’) and organisational elites (translating policy pressures from the policy and funding-related ‘research space’). The institutionalisation processes for quality notions here would therefore include such aspects as criteria to recruit staff, the academic career system and allocation of research resources. Research organisations can attempt to affect research lines – individual and collective (Gläser 2016) – and send signals back to policymakers that can potentially transform funding/policy notions of research quality.

Research Funding Agencies

Funding agencies operationalise ‘research quality’ through their governance mechanisms (Hellström 2011; Borlaug 2015; Kuhlmann and Rip 2014). These include applying peer review to assess research proposals and to allocate funds. Here research quality is a ‘boundary object’ (Star and Griesemer 1989). Funding agencies provide a space for interactions between policy and research communities, and are arenas for constant negotiations of quality notions (Rip 1994; van der Meulen 1998; Jasanoff 1990). Consequently, funding agencies are – like research organisations – sites where S-type and F-type research quality notions co-exist, interact and are negotiated. They have quite different responsibilities – and purposes for assessing research/researchers – than research organisations and may set the terms for competition between research fields and topics far more explicitly.

Regional/National Policy

The policy site is not confined to ‘science policy’ but includes other essential relationships between research organisations and their regional (or indeed trans-national) and national funding/policy environment. Quality notions may here involve ‘system’ properties, such as functional institutions for resource allocation, procedures for priority setting, efficient collaboration between parts of the research and innovation system, and, not least, an intention of the system to assure performance by individual researchers and/or research organisations.

Overall Framework to Study Research Quality Notions

We now present our proposed overall framework to study research quality notions. Table 3 incorporates the three dimensions of the framework. For simplicity, dimension 1, F/S type notions, includes two key components: research quality judgement anchor and quality standard (substantive or proxy-based). For dimensions 2 (research quality attributes) and 3 (the organisational sites) all components are included.

Table 3 Overall framework: Research quality notions (dimension 1: anchors and standards), attributes (dimension 2) and organisational sites (dimension 3)

This framework holds three potential analytical strengths. First, it enables us to formulate expectations about quality notions. Second, it helps us to explicate differences in meaning depending on organisational contexts (e.g. to highlight that ‘usefulness’ may mean very different things within knowledge communities as compared to policy sites). Third, it allows us to unpack possible tensions that might be developing in certain organisational contexts, for example, because of some specific interplay between F-type and S-type notions over time in particular fields, organisations, or national funding/policy settings.

Below, we address important points under each of the organisational sites.

Individual Researchers/Groups and Quality Notions

Notions of research quality become manifest at the level of researchers, individually and in groups, but may be hard to pinpoint because they are rarely codified. Researchers’ quality notions can be tacit, and derived from comparison (e.g. to peers, to past and present research). They may also be highly context-dependent and dynamic. Individual perceptions of quality, and criteria for judgement, can evolve and norms can shift within short time spans. For example, when individual researchers are asked to serve on panels to assess research proposals the norm(s) against which they judge ‘excellence’ (or quality) may change for different batches of proposals (Lamont 2009). Assessment of research ‘involves the making of a number of subtle, indeed tacit judgements’, and criteria are too specialised and science too rapidly changing for formal categories of research quality to be established (Ravetz 1971: 274).

Nevertheless, researchers are a useful empirical entry point to unpack issues around research quality and tensions that might arise when differing quality notions are applied and contested. Here we should keep in mind that notions are also personal, multi-dimensional and relate to the (sub)field of the researcher. Typically, a key aim of researchers is to advance knowledge in their specific field so their quality standards can be expected to be anchored in the substance of their personal knowledge pool and agenda, and address robustness as well as novelty and value of the research topics, issues and problems upon which they work.

Knowledge Communities and Quality Notions

Research fields include knowledge communities, such as those formed around journals and conferences, that are important in (re)defining the field. Here quality notions are anchored in the collective knowledge pool of the field (when addressing originality/novelty and value/usefulness), and the dominant theory, approaches and methodology (when addressing plausibility/reliability).

Knowledge communities are key sites for the constitution of quality notions, but there appears to be little empirical research on how criteria and standards of knowledge communities are formed. The literature indicates that in assessing manuscripts submitted to scientific journals, importance and relevance for the audience of the particular journal are key criteria – and are thus very context/field-specific – and reviews sometimes fail to detect basic weaknesses in solidity of data, methods and analysis (Lee 2015). In other words, there is a possible tension to explore, for instance, in (some) knowledge communities when peer review might favour originality/novelty over plausibility/reliability. [Footnote 4]

At this site, assessments and formulation of criteria for ‘good research’ are ideally part of the ‘dynamic and critical self-reflection of the scientific community’ (Niiniluoto 1987: 22) and aim at advancement of knowledge. However, the literature on journal peer review is often concerned with reviewer disagreement and bias (Weller 2001; Johnson and Hermanowicz 2017). [Footnote 5]

Notably, the scope and focus of this site are genuinely different from the broader and policy-involved sites (research organisations, funding agencies, policy spaces, as discussed below). Assessment of single research works can address their value for the research goals of a specific field or research topic (and at a specific point in time). They are normally not intended for research policy purposes or for comparing/measuring research quality across fields.Footnote 6

Research Organisations and Quality Notions

Research organisations are sites where F-type and S-type research quality notions most obviously meet (and quite possibly collide). Diverse F-type research quality notions are likely to permeate any given research organisation. At the same time, S-type quality notions can enter research organisations through increasing professionalization of management and leadership (Gornitzka and Larsen 2004; Sauder and Espeland 2009; Elken and Røsdal 2017).

Empirical study can also show the interplay of F- and S-type notions: for instance, the assessment of candidates for academic positions may involve assessing the productivity and international position of the candidates (Hemlin 1991). In our terms these would be more S-type quality notions, reflecting concerns when assessing individual researchers in this institutional context, beyond simply a researcher’s contribution to a specific research field (F-type notions).

There may be differences between research organisations in the range of tensions and conflicts experienced in negotiating quality notions. This can vary by, e.g. level of strategic and operational autonomy of the organisation, balance of block grants and expectations of return on investment and performance (Gläser 2016). Research organisations host and allocate resources internally to a multitude of research fields. Fields may have very different (and conflicting) notions of value/usefulness, plausibility/reliability and originality/novelty, and the choice of criteria (and proxies) can strongly affect resource allocations and various local quality notions (e.g. Laudel and Weyer 2014). These issues may bear on the amenability of conditions for, and even the continued existence of, specific research fields within a research organisation, and tensions may be acute. Hence, when access to resources is limited, F-type quality notions may be contested by representatives of different knowledge communities.Footnote 7 When resources are plentiful, different F-type quality notions may more easily co-exist.

Research Funding Agencies and Notions of Research Quality

As intermediate bodies, research funding agencies are expected to mediate and negotiate research quality notions. They are likely to embody both F- and S-type notions. Tensions experienced in trying to absorb such different notions may depend on the agency’s structural position in its regional/national research space. If it is an executive agency of government, tensions could be serious; if it is part of a ‘republic of science’, tensions might be minimal.

An empirical illustration here from recent decades is the negotiation between F-type and S-type notions expressed through the terms ‘societal impact’ and ‘scientific excellence’ adopted by funding agencies. Numerous funding instruments have focused on potential societal impacts of research, and on funding ‘excellent’ research and research with the highest potential for scientific breakthroughs (OECD 2014; Aksnes et al. 2012; Frodeman and Briggle 2012; Heinze 2008). There are potential tensions not only in negotiating F/S-type notions and criteria setting the conditions for grants competition, but also at the micro-level. Some studies have, for instance, found biases and unwanted dynamics in grant panel decision-making (Arensbergen et al. 2014; Langfeldt 2001). And whilst researchers “tend by default to focus on scientific criteria in their judgements” (Nightingale and Scott 2007: 551), funding agencies may want to steer them to comply with S-type policy aims.Footnote 8

There are multiple empirical entry points to study quality notions of research funding agencies. One entry point could be the kinds of stakeholders allowed to define programme objectives and criteria, the profile and objectives of its funding schemes, and its review guidelines and/or selection criteria. There are also micro-level activities, such as the rules and procedures agencies use to appoint researchers to grant proposal review panels, and to select reviewers more generally. At both levels important decisions are taken regarding who is to mediate and resolve differing research quality notions.

National/Regional Policy and Quality Notions

Negotiation of quality notions at policy sites is a complex process, involving various and differing interest groups. Some tensions to explore here would be any that emerge between policy elites, research organisational elites and research field elites. Overall concerns might relate to the value/usefulness of science for society and how to allocate public funding. Judgements of quality attributes (originality/novelty, plausibility/reliability and value/usefulness) can be anchored in concerns evidenced by white papers and other policy documents. Notions of quality in this context are also usually embodied in evaluation regimes and research evaluation systems.

Quality criteria and standards used in evaluation regimes might discriminate little between different research fields. Similarly, because some quality notions in policy arenas are developed and used by lay groups, these may rely on proxies, e.g. perceptions of the quality of journals as a proxy for the quality of research papers (Seglen 1997; Adler and Harzing 2009; Rafols et al. 2012; Nedeva et al. 2012).

A crucial point here is that notions of research quality are most often enforced through evaluation regimes. Hence to unpack the notions of research quality at this organisational site we may address: the ideology of the policy/funding research space (i.e. all the explicit and implicit assumptions of the value of science in society and how to support it); research funding modalities and flows; resource distribution; evaluation regimes and performance management approaches (e.g. the quality notions underpinning performance-based funding).

Discussion

In this paper we revisit the notions and understanding of research quality and elaborate a novel framework for its study. As outlined in the second section, research quality notions and concerns regarding the quality of research can be traced back to the beginnings of modern science. Still, the concept of ‘quality’ itself was not in much use until relatively recently. A discourse on research quality appeared gradually, alongside other major elements of the modern constitution of research. One such element was the emergence of a competitive and pluralistic system of funding with (mostly) public sources. This demanded transparency, fairness and easy-to-use criteria, and expanded quality concerns beyond specific research fields.

Another element was work in the sociology of science repeatedly demonstrating that scientific inquiry yields highly differentiated results. Some research generated more interesting and useful results that were widely circulated and cited, and influenced other research more. This brought about a perception of distinctions between leaders and followers, ‘metropolis’ and ‘province’, and centres and peripheries, all linked to differences in research quality (Shils 1961a/1972; 1961b/1975; 1988). In turn, this led to a hierarchical understanding of the organisation of science, whereby some organisations, and indeed individuals, constituted the ‘elite’ again in an assumed relation to quality (Zuckerman 1977).

Hence, the sheer growth of the research system and the need for criteria to distribute funds and allocate prestige across this rapidly growing system propelled a demand to articulate an idea of research quality in the immediate post-WWII decades. This explains the surge of definitions in the early generation of seminal work in the philosophy and sociology of science by authors such as Merton, Polanyi, Ben-David, and others. However, these thinkers still operated very much in a classical paradigm and did not use a concept of quality much themselves.Footnote 9 Notably, Merton talked about ‘norms’ or ‘imperatives’. Polanyi used the term ‘merit’. The terms suggest that what they had in mind was an understanding of research quality that was essentially rooted in method, appropriate procedure and virtuous application.

In the following decades, the conditions for a discussion of research quality changed fundamentally. The use of the concept of ‘quality’ started to grow rapidly in the 1980s and 1990s, when its meaning also widened to encompass quality processes and quality management under influence from private industry and New Public Management. That was also when indicators first came into more widespread use, and even in early attempts to analyse their emergence and application there was an awareness that they sometimes reflected a drift, or dilution, of research quality (van Raan et al. 1989).

It is now apparent that ‘research’ is a very diverse activity taking place in and across an equally diverse set of nation states, cultures, and organisations. New notions of research quality have emerged and supplement those of the 20th century foundational thinkers of Western sociology and philosophy of science. However, these notions have not been very well articulated and, above all, the growing diversity of notions of research quality has lacked a meaningful framework that can link the institutional conditions and diversity to the empirical manifestations of quality and its criteria.

Building such a framework also means developing a new theory of research quality, better suited to explain and organise the plethora of quality definitions that are currently in circulation. The backbone of a theory of research quality is the increased diversity of research itself. The classical notion was predicated on the discipline as the singular, exclusive road to in-depth scientific knowledge. Empirical work on how science is conducted suggests that the discipline itself is increasingly becoming a phenomenon of the past (Weingart and Stehr 2000). Most of the classical disciplines have become so large that their internal diversity must allow a very large spread of methodological approaches that make distinct definitions of quality within a disciplinary culture hard to uphold. In addition, hybrid areas develop and there seems to be less of a concern among the researchers themselves to articulate their disciplinary homes. Funding agencies and policies have clearly stimulated the growth of such hybrid areas over the last few decades and notions of quality are correspondingly adrift.

These, along with the sheer growth of the societal research enterprise and its multiple mission orientations, have created a need for more pluralistic approaches and a new theorizing of research quality. This is ultimately a discussion about the concept of ‘research’. A narrow definition of the concept is more compatible with the classical understanding of research quality, what we have called F-type notions, rooted in the dynamics of research fields and disciplinary cultures. A wider definition sits more comfortably with S-type notions, linked more closely to policy and societal applications.

Conclusion

In this paper we distinguish between two types of quality notions – F(ield)-type and S(pace)-type – and use these to elaborate a framework for the study and understanding of research quality. We outline three (potentially conflicting) attributes of research quality notions and the organisational sites where the notions emerge and get contested and institutionalised.

In short, the framework provides: a) empirical entry points and access through research fields; b) wider empirical coverage for analytical comparison; and c) an overall structure for information collection and analysis. The attributes of research quality derived from the literature – originality, plausibility, and value – serve as analytical devices, helping to understand differences in emphases between research fields, funding and policy spaces and organisational sites (e.g. research journals and conferences, research organisations, funding agencies etc.). Key aims of such studies would be to understand the role and interaction of F-type and S-type notions of research quality in defining good research, and in developing, and contesting, criteria and indicators. Furthermore, this raises a set of questions about the ways in which research quality criteria impact research practice and content.

Studying research quality notions implies trying to capture diverse and tacit notions which are expressed through context-dependent assessments of which projects are most worth funding, which papers are publishable, and which researchers should be employed or promoted, or expressed in e.g. national evaluation regimes. Such formal assessments are triggered by the need to allocate resources, not to define quality as such, and conclusions result from the combination of the selected reviewers, the evaluation objects, and the organisational sites and their quality notions. In other words, the co-existence of quality notions also depends on the purpose of evaluation. However, we know little about how formal peer review and research evaluation interact with more general notions of research quality, or how notions are impacted by the increasing availability of quantitative indicators of research performance.

The framework has implications for the study and understanding of research quality and of how research fields organise themselves around research quality notions. It re-focuses attention to include: research on the social processes through which dominant notions of quality are established and institutionalised; research on the organisational and institutional tensions that different notions of research quality generate in research organisations, research funding agencies and research evaluation regimes; comparative study of research quality notions specific to different research fields and the ways these ‘travel’ across research fields and to research spaces; and study of the assessment of individual researchers with regard to their research profile, or career stage, or whether they try to adapt to F- or S-type quality notions. Methodologically, applying the proposed framework implies that comparative studies of notions of research quality take research fields as entry points, and extend to national policy and funding spaces, rather than comparing the assumptions of blanket, national evaluation and quality regimes. Last but not least, using this framework makes it possible to formulate expectations about tensions at different junctures of the science system and the ways in which these can be alleviated and resolved.

This framework also has two important practical applications. First, members of research fields can use it to understand, and if necessary change, the ways in which structures using quality notions are organised; e.g. all structures and arrangements demanding selection. Second, there are implications for policy in signalling the necessity for the development and implementation of more nuanced evaluation systems that account for the specific research quality attributes and notions in diverse research fields, and hedge against irresponsible, intended or unintended, use of proxies for research quality.

The primary strength and usefulness of our approach, we believe, is that it brings the concept of research quality into contact with the multiple domains of activity where research is taking place. We also believe that our framework may offer a navigation tool, both for scholars reflecting on the interrelationship of science and policy and for practitioners in policy, funding, and evaluation who have so far had little systematic and conceptual support in their work to identify and reward research quality. We hope that future work, by ourselves and others, will be able to go deeper into its empirical and operational manifestations in these domains. Investigating research quality empirically and theoretically is, to a large extent, work that lies ahead.